Openlaw Document Handling Software
from Oxford Law and Computing
What is OCR?
Most of the data in an Openlaw Item Card is information about a document that a user or a bureau has entered by typing it or picking it from a list. It may be subjective or objective. A data item may be one of a limited range of classification codes (for example a name or a Yes/No choice), or it may be free-form text such as a user's summary.
Images are not themselves directly searchable. However, the text in them can be processed by Optical Character Recognition ("OCR") software, which extracts the words from the images. An Openlaw Add-On called OCR System will OCR the images belonging to a Bundle and place the resulting text directly in the Text box in the Item Card. Alternatively, a bureau can do this for you. Importing text files is covered in a separate section; this section concerns searching the text.
Searching OCR Text
Many Items in the Demo Case already contain OCR’d text. Suppose you want to find all documents with the word “Ashworth” in them. You have various options:
Use the Text Search, which allows you to type a word or phrase, to join it with others and, optionally, to search all the other text fields as well.
Use the Query Wizard: select Text as the field to search, select contains as the Operator, and enter Ashworth as the string to search.
Within a Selection, use the binoculars icon on the Text Tab to highlight all occurrences of the word or phrase.
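For readers who find it easier to think in code, the following Python sketch illustrates what a "contains" search and an occurrence highlight amount to. It is not Openlaw's own implementation: the sample items, the item references and the function names are all hypothetical, and the sketch assumes the search ignores upper and lower case, which this section does not confirm.

```python
import re

# Hypothetical stand-in for the Text box of three Item Cards:
# item reference -> OCR'd text. None of this data comes from the Demo Case.
items = {
    "DOC001": "Letter from Mr Ashworth to the bank dated 3 May.",
    "DOC002": "Invoice for building works.",
    "DOC003": "Note of call: ASHWORTH confirmed that Ashworth Ltd would pay.",
}


def items_containing(items, phrase):
    """Return the references of items whose text contains the phrase.

    Case is ignored here; that is an assumption, not documented behaviour.
    """
    needle = phrase.lower()
    return [ref for ref, text in items.items() if needle in text.lower()]


def occurrences(text, phrase):
    """Return the (start, end) position of every occurrence of the phrase."""
    pattern = re.compile(re.escape(phrase), re.IGNORECASE)
    return [match.span() for match in pattern.finditer(text)]


def highlight(text, phrase):
    """Return the text with every occurrence of the phrase wrapped in brackets."""
    pattern = re.compile(re.escape(phrase), re.IGNORECASE)
    return pattern.sub(lambda match: "[" + match.group(0) + "]", text)


if __name__ == "__main__":
    for ref in items_containing(items, "Ashworth"):
        print(ref, highlight(items[ref], "Ashworth"))
```

The first function corresponds roughly to a query with the contains Operator, and the last two to highlighting occurrences within a document that has already been selected.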
Note that the text in the Demo Case has been edited to remove some of the errors caused by the OCR process. It would be misleading to suggest that thousands of documents can be OCR’d and used for accurate searches without first cleaning up the text. Nevertheless, this is a powerful tool for those cases which warrant it.
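To illustrate why uncleaned text gives inaccurate searches, the Python sketch below shows an exact search missing two misread spellings of a name, and a simple similarity check that flags them for correction. This is only an illustration and not an Openlaw feature: the sample sentence, the misreadings, the function name and the 0.8 similarity threshold are all assumptions chosen for the example.

```python
import difflib
import re


def near_misses(text, word, cutoff=0.8):
    """Return words in the text that resemble `word` but do not match it exactly.

    Similarity is measured with difflib.SequenceMatcher; the cutoff of 0.8
    is an arbitrary value chosen for this illustration.
    """
    target = word.lower()
    tokens = re.findall(r"[A-Za-z0-9]+", text)
    return [
        token
        for token in tokens
        if token.lower() != target
        and difflib.SequenceMatcher(None, token.lower(), target).ratio() >= cutoff
    ]


if __name__ == "__main__":
    # Hypothetical uncleaned OCR output: "o" misread as "0", "w" misread as "vv".
    raw = "Letter from Mr Ashw0rth to Ashworth & Co, copied to Mrs Ashvvorth."

    # An exact search sees only the one correctly recognised occurrence.
    print(raw.lower().count("ashworth"))

    # The similarity check lists the candidates a person should review.
    print(near_misses(raw, "Ashworth"))
```

A check of this kind only suggests candidates; deciding whether each one is really a misreading of the name still needs a person to look at the document.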