OCR in C#

Document 5
An overview of the Tesseract OCR (optical character recognition) engine, and its possible enhancement for use in Wales in a pre-competitive research stage
Prepared by the
Language Technologies Unit (Canolfan Bedwyr), Bangor University
April 2008

This document was prepared as part of the SALT Cymru project, funded by the
Welsh Assembly Government under the Knowledge Exploitation Fund’s
Knowledge Exchange Programme, reference HE 06 KEP 1002

What is OCR technology?
OCR technology converts scanned images of printed text or symbols (such as a page from a book) into text or information that can be understood or edited using a computer program. The most familiar example is scanning a paper document into a computer, where it can then be edited in a popular word processor such as Microsoft Word. However, OCR technology has many other uses, including as a component of larger systems that require recognition capability, such as number plate recognition systems, and as a tool for creating resources for SALT development from print-based texts.
Availability
General Availability
Commercial OCR technologies, of which OCR engines are the core component, are widely available. These commercial engines are highly developed and offer considerable accuracy when working with texts in major languages. With English text, for example, the top commercial engines have an accuracy of over 98%. Some companies specializing in OCR technologies offer software development kits (SDKs), which allow software developers to license the OCR technology for use in their own systems.
Language Availability
As previously mentioned, the accuracy of commercial OCR for major languages is very high. This accuracy is achieved by combining language-independent algorithms, which identify the likely value of a character, with language-specific information, such as wordlists, that improves the results of these algorithms.
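The combination described above can be illustrated with a minimal sketch. It is not the method used by Tesseract or by any commercial engine. The per-character candidate scores, the tiny wordlist, and the function names `best_raw_reading` and `best_wordlist_reading` are all hypothetical, chosen only to show how a wordlist can correct a character-level guess.

```python
from itertools import product


def best_raw_reading(candidates):
    """Language-independent step: keep the top-scoring character at each position."""
    return "".join(max(options, key=options.get) for options in candidates)


def best_wordlist_reading(candidates, wordlist):
    """Language-specific step: prefer the highest-scoring reading found in the wordlist.

    Falls back to the raw reading when no candidate combination is a known word.
    """
    best_word, best_score = None, -1.0
    # Enumerate every combination of candidate characters (fine for short words).
    for combo in product(*(options.items() for options in candidates)):
        word = "".join(ch for ch, _ in combo)
        score = 1.0
        for _, confidence in combo:
            score *= confidence
        if word in wordlist and score > best_score:
            best_word, best_score = word, score
    return best_word if best_word is not None else best_raw_reading(candidates)


# Hypothetical recogniser output for one scanned word: for each character
# position, the possible characters and an illustrative confidence score.
candidates = [
    {"c": 0.9, "e": 0.1},
    {"v": 0.55, "y": 0.45},  # the shape is ambiguous between "v" and "y"
    {"m": 0.8, "n": 0.2},
    {"r": 0.95},
    {"u": 0.7, "v": 0.3},
]

# Hypothetical language-specific wordlist (a real one would be far larger).
wordlist = {"cymru", "cymraeg", "bangor"}

raw = best_raw_reading(candidates)
corrected = best_wordlist_reading(candidates, wordlist)
print(raw, corrected)
```

In this sketch the character-level step alone picks "v" in the second position because its score is slightly higher, while the wordlist step selects the reading that is a known word. For a language with no wordlist available, only the first step applies, which is one reason accuracy tends to be lower outside the major languages.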
