Optical Character Recognition Guidelines
Please visit these pages to learn more about access and disability resources at Wheelock.
Access and Disability Resources
Wheelock Library Building, Suite 205A
Boston, MA 02215
Director: Jennifer Pike
Confidential Fax: 617-879-2163
Optical character recognition (OCR) is the electronic conversion of images of typed, handwritten, or printed text into machine-encoded text. OCR can be used when it is necessary to provide printed information to a student who is blind and uses screen reading software such as JAWS or NVDA. However, it is not as simple as scanning the document and sending it to the student.
Screen reading software does not "read" a scanned document in its original format. A screen reader can read a document once it has been scanned as a PDF, put through the OCR process, and then saved as a Word document. All documents that go through the OCR process need to be carefully proofread! Care in scanning and careful checking are absolutely necessary to ensure that the information comes through exactly. OCR does not work well when the spine of a book is scanned or when poor-quality photocopies are used. Accuracy matters a great deal to a screen reader user: a single wrong letter can change the meaning of a word or passage.
Follow these guidelines when printed material is needed by students who use screen reader technology.
- Before scanning, check online to see if the printed information is already available somewhere as a PDF. Using an existing PDF makes converting printed material a lot easier because you can avoid the errors that often occur in converting a document through OCR. If you cannot find it online, proceed to the OCR scanner.
- Use the copy machine on the 2nd floor of the Wheelock Library, since it has OCR capacity (not the machine inside suite 205). You will need a faculty/staff ID to access the printer/scanner.
- Select "Scan to PDF" and then click "Set Details." Select "OCR" and then proceed to scan. (You will need to enter your email address if you are using someone else's ID.) Press down firmly to avoid cutting off chunks of text, especially when scanning from a book.
- The scans will be sent to your email as PDFs. Documents over 3 pages long tend to arrive as several separate files. You will need to re-save these scans as Word documents. Most of the time, you should be able to simply click "Save as" and then change the file format when the save box pops up.
- If the file does not allow you to easily save it as a Word document, you will need to select "Tools," then "Text Recognition," then "AA." This will convert the file for you.
- Once the files are converted to Word, you will need to read through them and correct any errors or awkward formatting introduced during the conversion. Common errors include "m" appearing as "nm" or "v" appearing as "Y," and vice versa. You will need to read through very carefully. You should also delete the images of book spines, outlines, etc. so that only the text remains.
- If your document was sent in multiple files, copy and paste their contents into a single Word document.
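For staff who are comfortable running a small script, the proofreading step can be supported (not replaced) by a rough automated check. The sketch below is only an illustration and is not part of the Wheelock procedure: it assumes the converted text has been pasted into a plain string, and the names `CONFUSIONS`, `suggest_corrections`, `flag_suspect_words`, and the tiny `vocabulary` set are all hypothetical. It flags words that are not in a known word list and, using the "m"/"nm" and "v"/"Y" mix-ups mentioned above, suggests what the word may have been.

```python
import re

# Assumed confusion pairs, taken from the examples in the guidelines:
# (text as it appears in the OCR output, text that may have been intended).
CONFUSIONS = [("nm", "m"), ("Y", "v"), ("m", "nm"), ("v", "Y")]


def suggest_corrections(word, vocabulary):
    """Return known words reachable by undoing one confusion in `word`."""
    suggestions = set()
    for seen, intended in CONFUSIONS:
        start = word.find(seen)
        while start != -1:
            # Swap this single occurrence and see if a known word results.
            candidate = word[:start] + intended + word[start + len(seen):]
            if candidate.lower() in vocabulary:
                suggestions.add(candidate)
            start = word.find(seen, start + 1)
    return sorted(suggestions)


def flag_suspect_words(text, vocabulary):
    """List (line number, word, suggestions) for words not in the vocabulary."""
    flagged = []
    for line_number, line in enumerate(text.splitlines(), start=1):
        for word in re.findall(r"[A-Za-z]+", line):
            if word.lower() in vocabulary:
                continue  # known word, nothing to flag
            flagged.append(
                (line_number, word, suggest_corrections(word, vocabulary))
            )
    return flagged


# Illustrative word list only; a real check would need a full dictionary.
vocabulary = {"the", "committee", "gave", "a", "review"}
sample = "The comnmittee gaYe\na review"
for line_number, word, suggestions in flag_suspect_words(sample, vocabulary):
    print(line_number, word, suggestions)
```

A flagged word with no suggestion still needs a human look, and a word that passes the check can still be wrong (for example, when OCR turns one real word into another), so the careful read-through described above remains necessary.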