PRZOOM - /newswire/ -
Vienna, VA, United States, 2005/08/01 - Digital Documents, LLC announces the latest release of its Optical Character Recognition (OCR) software application to complement its dDSpeedScan© 5.0 suite of document scanning solutions.
The leader in document scanning and imaging services to clients globally, Digital Documents, LLC, announces the latest release of its Optical Character Recognition (OCR) software application to complement its dDSpeedScan© 5.0 suite of document scanning and indexing software applications and solutions.
Optical Character Recognition refers to a technology that involves "reading" words from a scanned image by translating each character on an image into searchable text.
OCR enables users to search for and retrieve information within a file or page. In addition, when a set of files is indexed, users are able to search for keywords across an entire document library and retrieve each page with exact precision. OCR enables users to execute searches in seconds, searches that once required hours or days to complete.
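To illustrate how indexed OCR text makes keyword search across a document library fast, here is a minimal sketch of an inverted index. It is not the dDSpeedScan© or dDOptimaOCR© implementation; the function names `build_index` and `search` and the sample pages are assumptions made up for this example.

```python
import re
from collections import defaultdict


def build_index(pages):
    """Map each lowercase word to the set of (document, page) pairs containing it."""
    index = defaultdict(set)
    for (doc, page_no), text in pages.items():
        for word in re.findall(r"[a-z0-9]+", text.lower()):
            index[word].add((doc, page_no))
    return index


def search(index, query):
    """Return the sorted pages that contain every word in the query."""
    words = re.findall(r"[a-z0-9]+", query.lower())
    if not words:
        return []
    hits = set(index.get(words[0], set()))
    for word in words[1:]:
        hits &= index.get(word, set())
    return sorted(hits)


# Hypothetical OCR text, keyed by (file name, page number).
pages = {
    ("contract.pdf", 1): "This lease agreement is made in Vienna",
    ("contract.pdf", 2): "The tenant shall pay rent monthly",
    ("memo.pdf", 1): "Rent increase notice for the Vienna office",
}
index = build_index(pages)
print(search(index, "rent"))
print(search(index, "vienna rent"))
```

Because the index is built once, each query only looks up the words it contains instead of rereading every page, which is why searches of this kind finish in seconds.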
To provide our customers with optimal OCR accuracy and layout retention, the Digital Documents, LLC Technology Team has developed dDOptimaOCR© 3.5, which combines the most advanced Optical Character Recognition technologies and processes in the industry. dDOptimaOCR© uses these technologies and processes to enable six-sigma level character accuracy and ensures that the highest-quality image possible is presented to the OCR engines for conversion to text.
It is important to note that the quality and condition of a paper document collection are key factors in the successful recognition of characters to create readable text. Therefore, to enhance the quality of each original page, we start by focusing on the scan quality of each image, removing noise such as borders and speckles and correcting skew.
In addition, we utilize advanced color filter technologies to remove any page background colors, in conjunction with multi-light image capture technologies to remove any shadows cast by page creases that could impact image quality or recognition accuracy.
Once document scanning and processing are complete, an OCR text layer is added behind each image utilizing our dDOptimaOCR© solution. This solution begins with an additional orientation filter to ensure that the best image is presented to the OCR engines.
Next, the characters in the image are processed utilizing multi-engine OCR voting technologies that rank each character to determine the best text recognition fit. Then, once a word is generated, it is filtered through a proprietary lexicon to ensure the highest quality results.
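The voting and lexicon steps described above can be sketched roughly as follows. This is only an illustration under simplifying assumptions, not the proprietary dDOptimaOCR© method: it assumes each engine returns a reading of the same length (already aligned), uses a simple majority vote per character, and stands in for the lexicon with a small word list matched through Python's standard `difflib` module. The names `vote_characters` and `lexicon_filter` are hypothetical.

```python
import difflib
from collections import Counter


def vote_characters(readings):
    """Pick the most common character at each position across engine readings.

    Assumes all readings have the same length (already aligned).
    """
    voted = []
    for chars in zip(*readings):
        voted.append(Counter(chars).most_common(1)[0][0])
    return "".join(voted)


def lexicon_filter(word, lexicon, cutoff=0.8):
    """Return the word if it is in the lexicon, else its closest lexicon match.

    If nothing in the lexicon is similar enough, the word is left unchanged.
    """
    if word.lower() in lexicon:
        return word
    matches = difflib.get_close_matches(word.lower(), lexicon, n=1, cutoff=cutoff)
    return matches[0] if matches else word


# Three hypothetical engine readings of the same word image.
readings = ["d0cument", "document", "documemt"]
lexicon = ["document", "scanning", "index"]

voted = vote_characters(readings)
print(voted)
print(lexicon_filter("docunent", lexicon))
```

In this sketch each engine makes a different mistake, so the majority vote recovers the correct character at every position; the lexicon step then catches errors that survive voting by snapping a near-miss to a known word.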
Finally, this text can be processed utilizing sophisticated layout retention technologies to represent the image text layout, providing the best possible text representation for pinpoint search and retrieval accuracy. Once these processes are complete, the Quality Control Team generates an OCR Benchmark Report detailing the accuracy of the OCR process and the quality of the results.
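One common way to quantify OCR accuracy for a report of this kind is character accuracy: one minus the edit distance between a hand-verified reference text and the OCR output, divided by the reference length. The sketch below shows that calculation. It is an assumption about how such a metric could be computed, not a description of the actual OCR Benchmark Report, and the function names are hypothetical.

```python
def edit_distance(a, b):
    """Levenshtein distance: minimum insertions, deletions, and substitutions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def character_accuracy(reference, ocr_text):
    """Fraction of reference characters recognized correctly, from 0.0 to 1.0."""
    if not reference:
        return 1.0 if not ocr_text else 0.0
    errors = edit_distance(reference, ocr_text)
    return max(0.0, 1 - errors / len(reference))


# Hypothetical reference text and OCR output for one line of a page.
reference = "Digital Documents"
ocr_text = "Digita1 Docurnents"
print(f"{character_accuracy(reference, ocr_text):.2%}")
```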
"As the average size of our document scanning and indexing projects continues to increase, we are seeing mixed document collections that contain a greater variety of document condition, quality, fonts and layouts," stated James M. Eglin, Jr., Executive Vice President, Sales & Marketing, in making this announcement. "We wanted to be able to offer our clients a way to utilize the power and flexibility of full-text searching. However, since we realized that there was nothing currently available in the market that could provide the highest levels of OCR accuracy, we developed this version of dDOptimaOCR© to respond to this unfulfilled need."
Additional information about dDOptimaOCR©, as well as Digital Documents, LLC's entire suite of document scanning and indexing services, may be found by visiting our website.