Description
OCR stands for optical character recognition. The OCR module is used to extract and identify text from images.
Application
The OCR module is typically used with the File Upload module to extract data from documents that the end user uploads or sends in an email. For example, it can be used to extract identifying information from an uploaded image of a driver’s license, which can then be used for further processing.
The following section of a Ushur workflow first asks the end user to upload an image of a driver’s license through the File Upload module. The file is stored in an Ushur variable, which the OCR module then uses to extract all text from the image, such as the name, date of birth (DOB), driver’s license number (DL#), and expiration date.
Note
The output of the OCR module is a string of text that is stored in a variable. A Data Extraction module is needed to find the required key/values from this text.
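To illustrate why a separate extraction step is needed, the sketch below shows what pulling key/value pairs out of a raw OCR string might look like outside the platform. It is not the Data Extraction module itself: the sample text, the field labels, the patterns, and the function name `extract_fields` are all assumptions made for this example, and real OCR output will vary with the image.

```python
import re

# Hypothetical raw-mode OCR output. Real output depends on the image
# and will usually be noisier than this sample.
ocr_text = """DRIVER LICENSE
DL# D1234567
NAME: JANE SAMPLE
DOB: 01/02/1990
EXP: 01/02/2030"""

# Illustrative patterns, one per field. Each is anchored to the start
# of a line and captures the value that follows the label.
PATTERNS = {
    "name": r"^NAME[:\s]+([A-Z][A-Z .'-]+)",
    "dob": r"^DOB[:\s]+(\d{2}/\d{2}/\d{4})",
    "dl_number": r"^DL\s*#?[:\s]*([A-Z0-9]+)",
    "expiration": r"^EXP[:\s]+(\d{2}/\d{2}/\d{4})",
}


def extract_fields(text):
    """Return a dict of field name -> extracted value (None if not found)."""
    fields = {}
    for key, pattern in PATTERNS.items():
        match = re.search(pattern, text, flags=re.MULTILINE)
        fields[key] = match.group(1).strip() if match else None
    return fields


if __name__ == "__main__":
    for key, value in extract_fields(ocr_text).items():
        print(f"{key}: {value}")
```

A field whose label is missing from the text comes back as `None`, so a workflow built on this idea could branch to ask the end user for a clearer image.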
Settings
To use an OCR Module, configure the following settings:
Label: The label uniquely identifies a module within a Ushur for your reference.
Set: The extracted text will be stored in this variable.
Source: The variable containing the image is placed here.
Document Processing Mode: This adjusts how the extracted text is formatted. The default is Raw mode.
Raw mode: Outputs the raw text extracted by the OCR module without formatting.
Column-header mode: Formats the extracted text as JSON.
Note
Use the +/- controls to add or remove extraction rules.