Chapter 9: Spell Checker

Introducing the MegaDots Spell Checker

The MegaDots Spell Checker is unique because it can handle conventional typos as well as the mangled text that comes out of optical scanners. It is highly efficient for both blind and sighted users.

The Spell Checker is part of your copy of MegaDots. You do not need to install a separate "Spell Checker" disk.

Conventional Typos and Optical Scanner "Scannos"

The MegaDots Spell Checker has a dual personality: it can behave like a conventional spell checker or it can reconstruct words mangled by optical scanners. An example of a conventional typo is writing sucess instead of success. An example of a "scanno" is the sequence fi-ol7i instead of the word from.

The MegaDots Spell Checker would suggest that I'ght'n-2 (note to braille reader: the hyphen stands for an underbar character) should be the word lighting. How does it work? The spell checker has a dictionary of over 90,000 words. When the MegaDots Spell Checker finds a word it cannot recognize, then it looks at a separate list of about 200 ways characters are misread by optical scanners. For example, an "r" can be misread as "i-." The program uses that list to try out many possible combinations of characters. In some cases, the software may test close to a thousand different combinations. Each of these combinations is tested to see if it is a legal word in the dictionary. The program uses a variety of techniques to try to ensure that the first suggestion that the program presents is, in fact, the correct word. The important thing to remember is that even though all this work is going on in the background, you only have to press Enter to accept the demangled word. You can clean up your files in record time.
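The search described above can be sketched in a few lines of Python. This is only an illustration of the general idea, not the actual MegaDots code. The names CONFUSIONS, WORDS, candidates and suggest are made up for this example, and only the "i-" for "r" pair comes from the text; the other misreadings are guesses chosen to fit the two sample scannos.

```python
# Assumed misreadings, written as (what the scanner produced, what was printed).
# Only the "i-" for "r" pair comes from the text; the rest are guesses chosen
# to fit the two examples. The real list has about 200 entries.
CONFUSIONS = [("i-", "r"), ("l7i", "m"), ("I", "l"), ("'", "i"), ("-2", "g")]

# A tiny stand-in for the 90,000-word dictionary.
WORDS = {"from", "form", "lighting", "success", "close", "lose"}


def candidates(scanned, confusions):
    """Yield every string made by undoing zero or more misreadings."""
    if not scanned:
        yield ""
        return
    # Option 1: keep the first character exactly as scanned.
    for rest in candidates(scanned[1:], confusions):
        yield scanned[0] + rest
    # Option 2: undo each misreading that matches at this position.
    for seen, printed in confusions:
        if scanned.startswith(seen):
            for rest in candidates(scanned[len(seen):], confusions):
                yield printed + rest


def suggest(scanned, dictionary, confusions):
    """Return the candidates that are dictionary words, in discovery order."""
    found = []
    for word in candidates(scanned, confusions):
        if word in dictionary and word not in found:
            found.append(word)
    return found


print(suggest("fi-ol7i", WORDS, CONFUSIONS))
print(suggest("I'ght'n-2", WORDS, CONFUSIONS))
```

With these assumed misreadings, the sketch prints ['from'] and then ['lighting']. It makes no attempt to rank the suggestions. The real program uses additional techniques to put the most likely word first.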

MegaDots also examines the pattern of punctuation usage in each paragraph. If it finds unmatched punctuation, then it is more likely to view the punctuation as potentially misread letters. For example, if it finds (lose with no matching close parenthesis for the open parenthesis, then the program will suggest close as the correct word. But if there is a matching pair of parentheses, the program keeps the word lose. This increases the chances of making correct guesses.
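The parenthesis example can be sketched as follows. This is a simplified guess at the kind of test involved, not the method MegaDots actually uses. The function name paren_suggestion is hypothetical, and the sketch assumes that an unmatched open parenthesis at the start of a word may be a misread letter "c".

```python
# A tiny stand-in for the real dictionary.
WORDS = {"close", "lose", "door"}


def paren_suggestion(word, paragraph, dictionary):
    """Suggest a replacement for a word that starts with "(".

    Assumption: when the paragraph's parentheses do not pair up, a leading
    "(" may really be the letter "c" misread by the scanner.
    """
    balanced = paragraph.count("(") == paragraph.count(")")
    if balanced or not word.startswith("("):
        return word  # Punctuation looks normal, so keep the word as scanned.
    candidate = "c" + word[1:]
    return candidate if candidate in dictionary else word


print(paren_suggestion("(lose", "Please (lose the door.", WORDS))
print(paren_suggestion("(lose", "You may (lose or win) today.", WORDS))
```

The first call returns close because the paragraph has no matching close parenthesis. The second call keeps (lose because the parentheses are paired.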

Blind and Sighted Friendly Interfaces

We have been very sensitive to the issue of making a spell checker which works well with voice or braille output. A spell checker usually combines a sample of the text with one word highlighted and some complex screens for choosing spellings. These complex screens make spell checkers frustrating to use with speech output.

The MegaDots Spell Checker has different screen layouts for blind and sighted users. Blind users will find that the line containing the typo is always on screen line 22. Each alternative word is pronounced and then spelled out by the voice synthesizer. To reduce screen clutter, each screen line offers only one choice. The Visit Mode (see below) is especially useful for a blind user. It lets one explore the complete context of a typo.

Visit Mode

But there is more! The MegaDots Spell Checker also has a powerful Visit Mode to aid in correcting the text. In most spell checkers, you are very limited in the ways you can make changes in the text while you are spell checking. You are also very limited in being able to locate the error. With MegaDots, you press "V" and you are dropped into the full MegaDots Editor. You can delete, add, type and run global replace. There are some limitations: you cannot translate, or use any feature of the top menu bar in this mode. To resume spell checking from Visit Mode, just press Escape or F10. Visit Mode is addictive: once you use it, you will find it difficult to use any other spell checker!

Using the Spell Checker

To use the spell checker from the editor, type F10 T R. This selects Run Spell Checker from the Tools Menu. You are asked if the file came from an optical scanner. If you answer yes, then the spell checker will take into account the typical errors that optical scanners make. To understand how the spell checker works, make frequent use of the F1 help screens.

If the file was scanned and it contains forced page breaks, it is a safe assumption that the forced page breaks represent the start of new inkprint pages. MegaDots asks what to do with these forced page break commands. You have a choice of keeping the forced new page commands, deleting them or substituting inkprint page indicators. If you want inkprint page indicators, you are asked to specify the starting inkprint page number. In one stroke the entire file can be marked with inkprint page indicators for textbook format.

When the spell checker locates a word it cannot recognize, it highlights the word and displays a menu. The title line of the menu is the word Misspelled followed by the word in question. It presents a list of possible choices. Here is the list of choices:

Blind User Interface

In the blind user interface, you are first told the misspelled word, and then the first suggested replacement. The program reads the title line of the menu followed by the first choice. To look at all the spell check options, press the up or down arrow keys. To find out the context for the misspelled word, press V for Visit Mode. In Visit Mode you can read the word, sentence, or paragraph containing the misspelled word. To leave Visit Mode (and go back to the spell checker menu), press Escape.

Questions About the Spell Checker

How do I spell check a braille file?

To spell check a braille document, translate into braille (F5), then press F10 T R.

How do I leave the spell checker?

As anywhere in MegaDots, press Escape to stop the spell checker.

How do I revise the dictionary? The R command does not work!

To revise the dictionary, you need to press the letter R followed by the letter A (for Revise Accept). If you just press R, you do not revise the dictionary. We did this to reduce the chances that the dictionary would be revised by mistake.

Sometimes when I revise the dictionary, I am asked if I want to add it in lower case. What does this mean?

Answering yes to this prompt puts both the upper and lower case versions of your word into the dictionary. For example, if a word is a proper name, you would only want the word in upper case, so you should answer no. If the word is not a proper name and might appear in lower case, then answer yes.

What does Abnormal Paragraph Break mean?

An abnormal paragraph break is a spot where it appears that a new paragraph was started in the middle of a text paragraph. MegaDots thinks there is an abnormal paragraph break if the previous paragraph does not end with a period, question mark or exclamation point, and the next paragraph starts with a lower case letter. When you optically scan text, the optical scanning software guesses where paragraphs should start. When you import a file such as Microsoft Word or ASCII Line, the file and not MegaDots determines where paragraphs should start.

To change the carriage return to a space, press Enter to select the first alternative (which is to replace the return with a space). To leave the return as it is, press A for Accept.
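The rule for spotting an abnormal paragraph break can be written out as a short Python sketch. The function names are made up for this illustration, and the sketch only applies the rule as stated above. MegaDots may apply further checks that are not described here.

```python
def is_abnormal_break(previous_paragraph, next_paragraph):
    """Apply the stated rule: the previous paragraph does not end with a
    period, question mark or exclamation point, and the next paragraph
    starts with a lower case letter."""
    prev = previous_paragraph.rstrip()
    nxt = next_paragraph.lstrip()
    if not prev or not nxt:
        return False
    ends_sentence = prev[-1] in ".?!"
    return (not ends_sentence) and nxt[0].islower()


def join_abnormal_breaks(paragraphs):
    """Replace every abnormal break with a single space."""
    merged = []
    for paragraph in paragraphs:
        if merged and is_abnormal_break(merged[-1], paragraph):
            merged[-1] = merged[-1].rstrip() + " " + paragraph.lstrip()
        else:
            merged.append(paragraph)
    return merged


print(join_abnormal_breaks(["The cat sat on", "the mat.", "Next one."]))
```

Joining every flagged break automatically is the same as pressing Enter at each prompt. In the spell checker you make that decision one break at a time.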

What is Abnormal Punctuation? I get a prompt about Abnormal Punctuation, but I cannot see any mistake.

There are many things which MegaDots thinks of as abnormal punctuation. It could be an open parenthesis without a closing parenthesis, or an open quote without a close quote. Sometimes an abnormal paragraph break (see above) separates open from closing punctuation. In this case, the solution is to replace the extra return with a space. Just press A to accept the abnormal punctuation, and then take out the extra return when you're asked to deal with the Abnormal Paragraph Break. Or you could go into Visit Mode and directly change the return to a space.

What is is a repeated word?

MegaDots notices when a word is repeated in the text. To leave the text alone, press A. To delete the extra word, press D.
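A repeated word check is simple to sketch in Python. The function name repeated_words is made up for this example. Treating words that differ only in case as repeats is an assumption, and nothing here is taken from the MegaDots code.

```python
import re


def repeated_words(text):
    """Return (position, word) for each word that repeats the word before it.

    Positions count words from zero. Case is ignored, which is an
    assumption made for this sketch.
    """
    words = re.findall(r"[A-Za-z']+", text)
    repeats = []
    for position in range(1, len(words)):
        if words[position].lower() == words[position - 1].lower():
            repeats.append((position, words[position]))
    return repeats


print(repeated_words("Paris in the the spring"))
```

For the sample sentence, the sketch reports the second "the" at word position 3.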

How can I get the MegaDots Spell Checker to accept a word throughout my document?

When you press A for accept, it only accepts that one use of the word. Press G followed by A (for Global Accept) to accept a word throughout your document.

How do I spell check just a section of text?

The MegaDots Spell Checker can be used on just a portion of text. Mark a block before starting the spell checker. The spell checker will restrict itself to the highlighted block.

I just put dozens of words into my MegaDots dictionary. How can I copy my revised dictionary to another computer? How can I edit this file?

The revised dictionary is in an ASCII textfile called LEX.AUX stored in the MegaDots directory. You can copy this file to a floppy disk and copy it onto another computer. You can edit the file as well. It is an ASCII file with one carriage return after each entry.
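If you want to combine the revised dictionaries from two computers, a small script can do the merge. The Python sketch below relies only on what is stated above: the file is plain ASCII with one entry per line. The function names are made up for this example. Whether MegaDots expects the entries in a particular order, or expects a particular line ending, is not known here, so keep a backup copy of LEX.AUX before replacing it.

```python
def parse_entries(text):
    """Split the text of a dictionary file into entries, skipping blank lines."""
    return [line.strip() for line in text.splitlines() if line.strip()]


def merge_entries(first, second):
    """Keep the entries of `first` in order, then add the new ones from `second`."""
    merged = list(first)
    seen = set(first)
    for entry in second:
        if entry not in seen:
            merged.append(entry)
            seen.add(entry)
    return merged


def format_entries(entries, newline="\r\n"):
    """Write one entry per line. The DOS line ending is an assumption."""
    return "".join(entry + newline for entry in entries)


mine = parse_entries("Braille\r\nscanno\r\n")
theirs = parse_entries("scanno\r\nMegaDots\r\n")
print(format_entries(merge_entries(mine, theirs)))
```

To use this on real files, read each LEX.AUX into a string, merge the two, and write the result to a new file.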

Is there any way to speed up the task of cleaning up optically scanned files?

The Find and Replace function of MegaDots can speed up your work. For example, you can change all carriage returns followed by lower case letters into spaces in one stroke (press Control-F9 for complex replace; use {Eg}s for the find string, and " " for the replacement string). If you see a pattern in your document that you can exploit, make use of find and replace. If you have many documents to fix, consider setting up a rules file to contain all the changes you need to do. See Chapter 12, Find and Replace, for more detailed information on complex replace and rules files.
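For readers who know regular expressions, the same cleanup can be sketched outside MegaDots in Python. This is an analogy only. The pattern below is Python syntax, not the MegaDots complex replace syntax, and the function name join_broken_lines is made up for this example.

```python
import re


def join_broken_lines(text):
    """Turn each line break that is followed by a lower case letter into a space.

    The lookahead (?=[a-z]) checks the next letter without removing it.
    """
    return re.sub(r"\r?\n(?=[a-z])", " ", text)


print(join_broken_lines("The cat sat\non the mat.\nThe end."))
```

The break before "on" becomes a space, while the break before "The end." is left alone because the next letter is upper case.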

Obtaining Text From Your Optical Scanner

When you create a file from your optical scanning system, you usually have a choice of file formats. We recommend avoiding WordPerfect files from the OsCaR system (OsCaR puts in far too many font changes and character type changes to be meaningful). Instead use "ASCII no CR" (this is what MegaDots calls ASCII Line). We recommend asking for WordPerfect files from Open Book or Kurzweil. We do not have good information about Recognita. You will have to experiment to find the file format that gives you the best format and character information for that system.

If you are unsure of the file extension or the name of the directory used by your optical scanning software to export files, please contact your optical scanning vendor. Or you can do an experiment. In your scanning program, create a test file called SCANTEST. Now search your hard drive for a file called SCANTEST. Let's say the file SCANTEST.DOC was located in the directory called export. You can launch MegaDots with the command mega c:\export\scantest.doc <Enter>. Usually the file extension is TXT for ASCII textfiles, and DOC for WordPerfect files.
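If Python is available on your computer, the search for the test file can be sketched as below. This is only one way to do the search and is not part of MegaDots. The function name find_files is made up for this example.

```python
import os


def find_files(root, stem):
    """Return the full path of every file under `root` whose name, without
    its extension, matches `stem`. Upper and lower case are treated alike."""
    matches = []
    for folder, _subfolders, files in os.walk(root):
        for name in files:
            if os.path.splitext(name)[0].upper() == stem.upper():
                matches.append(os.path.join(folder, name))
    return matches
```

Calling find_files("C:\\", "SCANTEST") searches the whole C drive. Each path in the result shows both the export directory and the file extension your scanning software uses.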

All the major optical scanning vendors are in the process of modifying their software to run the various braille translators from their menu systems. Avoid using these schemes. At present, no optical scanning vendor has shown an interest in launching the full MegaDots Editor from their systems. Without the full MegaDots Editor, you cannot run the MegaDots Spell Checker.

We recommend that you save your optically scanned document as ASCII text or WordPerfect, then exit your scanning software and launch MegaDots from the MS-DOS prompt or from a MegaDots icon in Windows. Import the newly created file into MegaDots. Then, from the MegaDots Editor, run the MegaDots Spell Checker.

For more information about scanned document cleanup, please see Chapter 7, File Import and Export. The file importer has automatic scanner cleanup available, which is done before you even begin to work on your document in MegaDots.

Using the MegaDots Spell Checker

Recently, I took a six-page article from a magazine and turned it into braille. Here are the steps I took. I scanned the article using my copy of Open Book Unbound. After each page was scanned and converted, I pressed the Escape key followed by the keypad Ins key (the Scan key) to scan the next page. When the scanning was finished, I used the Library Export command to convert the file into a WordPerfect file in the export directory.

Then I quit Open Book and launched MegaDots. I typed mega \export\complay.doc <Enter> to import the article. From the MegaDots Editor I typed F10 T R to run the MegaDots Spell Checker. I was asked, Is this text coming from an optical scanner? I pressed Y for yes. Then I was asked, What do you want to do with the forced page breaks? I chose inkprint page indicators and gave the starting inkprint page number.

I did my fix-ups in several passes. In the first pass, I did not look at the inkprint copy at all. I just looked at the screen and the suggestions from the spell checker. If there was something I was not sure about, I just accepted it (pressed A), so I could deal with it later.

Then I did a second pass with the inkprint in front of me. The magazine title was in such large print that it was not scanned. I typed it in manually. On every print page there was a quote from the main body text which was out of context. I deleted these. There was extra text from headers and footers. I deleted these.

I mainly concentrated on the start of paragraphs. A few paragraphs were broken up or combined. I fixed them by deleting or inserting carriage returns. I used the Alt-down arrow key to take me to the top of the next paragraph. I made sure that the top of each paragraph matched up with the paragraphs in the inkprint.

The trickiest problem concerned a sidebar. The article had a three-column format. One page had a half-page sidebar that was in a four-column format. Needless to say, Open Book Unbound did not properly arrange the seven columns on the page. So I moved the sidebar to the end of the article. I needed to do some careful clipboarding to separate the sidebar from the rest of the article.

My final pass through the article was with the inkprint in front of me. I used the spell checker and fixed the spelling errors which I passed over before. I made sure that names were spelled right. I was finished.