Chapter 7: File Import

This chapter discusses the MegaDots file importer, as well as the MegaDots document export facilities.

Importing Microsoft Word Documents

If you have difficulty importing a Word document, see if you can obtain the same file saved as a "Word 6.0/95" file type. This needs to be done from within Word. This older file type may be easier for MegaDots to digest. Be careful you do not overwrite your modern Word file with the older style Word file (since they both use the same file extension).

Microsoft Word uses autonumbered paragraphs. When these are imported into MegaDots, you often get the digit 1 at the start of each paragraph. While there are a number of free solutions, such as saving the Word file to RTF, or pasting the material on a "text only" basis, these have their limitations. The best solution is a program called list fixer, which costs $29.95. See the web site http://www.editorium.com/listfixer.htm.

What is File Import?

File Import is used to read documents created by other applications, thus converting them into MegaDots documents. The process of file importing preserves the words from the initial document. It also preserves the character attributes. In the case of ASCII files, the file importer figures out what is a hard return and what is a soft return (a hard return starts a paragraph, a soft return is just the end of a line within a paragraph). The file importer also figures out the function of each paragraph as it relates to the entire document. It does this by applying a style to each paragraph.
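The difference between hard and soft returns can be pictured with a small sketch. The Python below is not the importer's actual method, which this manual does not describe. It assumes a simple illustrative rule: a blank line or an indented line marks a hard return, and every other line break is a soft return. The function name join_soft_returns is hypothetical.

```python
def join_soft_returns(text):
    """Group the lines of an ASCII file into paragraphs.

    Assumed rule (illustrative only, not MegaDots's own logic):
    a blank line or an indented line marks a hard return;
    every other line break is treated as a soft return.
    """
    paragraphs = []
    current = []
    for line in text.splitlines():
        starts_new = line.strip() == "" or line[:1] in (" ", "\t")
        if starts_new and current:
            # Hard return: close the paragraph collected so far.
            paragraphs.append(" ".join(current))
            current = []
        if line.strip():
            current.append(line.strip())
    if current:
        paragraphs.append(" ".join(current))
    return paragraphs


sample = "Dear Sir,\n\n  Thank you for\nyour letter of May 3.\n  We will reply soon.\n"
for paragraph in join_soft_returns(sample):
    print(paragraph)
```

With this rule, the two lines "Thank you for" and "your letter of May 3." are joined into one paragraph, because the break between them is treated as a soft return.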

As we will see, there are many steps to the file import process. As a user of MegaDots, you can either let the importer do its best job, or you can control how the document is to be imported to get the best possible results.

Although MegaDots can read many different file styles, it does not recognize everything. We've worked hard, however, to give MegaDots the power to read most file formats. The last section of your MegaDots Command Summary lists what file imports and exports MegaDots can do. If you have files that MegaDots cannot read, call MegaDots technical support. We will see what alternatives you have.

What is the File Name?

One of the biggest problems encountered while importing files is knowing the correct file name. Or you might know the name but not the location of the file.

For example, you have a WordPerfect document called LETTER.DOC. If you need to know which directory it is in, go to the DOS prompt and type WHERE LETTER.DOC <Enter>. The program may respond with C:\WPDATA\LETTER.DOC. This tells you that the file is located in the directory C:\WPDATA.

If you need to know what is on a floppy disk, then type DIR A: <Enter>. For more information about basic DOS and file names, see Supplement 1.

In MegaDots, press the F3 key to open a document. Type in the full name C:\WPDATA\LETTER.DOC <Enter>. If you want to choose the document from a list, type just the directory name C:\WPDATA and then press F2. Use the arrow keys or type in the first few letters of the document name until you land on LETTER.DOC, and press Enter. Once you press Enter, MegaDots automatically figures out the document type, and imports it into MegaDots.

The document name showing on the status line has the extension .MEG, because you are now working with a copy of your original. Saving your document with the F4 key does not erase your original document.

If your imported documents tend to come from a certain directory, you can set it under Preferences - Default directories - Word processor documents directory. This makes importing more convenient - you need only name the full path when it is different from the norm.

Another way to search your hard drive or disks is to use the Existing Files List. Press F3 from the Editor, then F2. Any item with a backslash after it is a directory; move to that directory by pressing Enter on the name. To move up a directory level, press Enter on PREVIOUS DIR\. This list is also useful for renaming files (F3), deleting files (Del), copying files (F4) or loading multiple documents (space).

Finding Files to Import

A fairly common problem is locating the files you wish to import. To import a file, press F3. If you know that the file you are looking for ends in .DOC, type *.DOC and press F2. Pick the first choice in the list (previous dir). You can now explore the whole range of directories in your computer.

If you do not know the name of the directory you are looking for, that is a more serious problem. Let's say you are looking for a file called MUSIC.DOC. Press Alt-F10 to get to the DOS prompt and type WHERE MUSIC.DOC <Enter>. When you want to get back into MegaDots, type EXIT <Enter>.
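If the WHERE program is not available on your computer, the same kind of search can be sketched in Python. This is only an illustration of searching every directory for a file name. The function name where and the starting directory C:\ are assumptions, and the sketch is not part of MegaDots.

```python
import os


def where(filename, root):
    """Return the full path of every file under root named filename.

    The comparison ignores upper and lower case, as DOS file names do.
    """
    matches = []
    for folder, _subfolders, files in os.walk(root):
        for name in files:
            if name.lower() == filename.lower():
                matches.append(os.path.join(folder, name))
    return matches


if __name__ == "__main__":
    # Search the whole of drive C: (the starting directory is an assumption).
    for path in where("MUSIC.DOC", "C:\\"):
        print(path)
```

Each path printed shows the directory to type in the F3 screen.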

Checking and Correcting Style Mistakes

It is easy to see where MegaDots has tagged styles incorrectly. For sighted users, simply page through the document and look for inconsistencies. Normal paragraphs (Body text) should be indented. List items should be flush left, with runovers outdented. Top level headings are centered. Lower level headings are to the left, with a blank line above and below. If you're not sure what style a paragraph is, check the status line. Speech users should check styles in show markup mode, so that each style change is spoken, and use Alt-Down arrow to move by paragraph. Refreshable braille users should check styles in Show Styles Markup Mode (Control-Z M S). Read Chapter 4 for examples and techniques for checking and correcting styles.

Make sure that any special formats are tagged with correct styles. These include tables, outlines, poetry, computer text, the table of contents, index, glossary and bibliography.

Simple documents do not take much work. If you are formatting a letter, the "from address" should be in the Letterhead style, followed by the "to address" in Left flush style. The greeting should also be in Left flush, with an extra blank line before it. Finally, the closing should also be done in Letterhead. If all you are doing is a simple document, you are ready to braille once the styles are all tagged.

For properly formatted complex documents, it is necessary to do much more, such as creating the preliminary pages, highlighting glossary words, fixing the inkprint page indicators, proofreading the braille, etc.

What the New Document Importer Does

Your original document goes through two stages before it becomes a MegaDots document.

Stage One

In the first stage, the document is "cleaned". Superfluous items not useful for making braille are removed. If the text is optically scanned, further cleanup occurs. There are ways to switch most of these features off, either in the user-friendly Interpret Format screen (later in this chapter), or with quick command switches, shown below.

Here is the list of items removed:

Advanced users: to keep all blank lines in your document, add a space and -same after you type the document's name on the command line or in the F3 screen. Use this command if you need the exact layout of the source document. For example, if the file represents a mailing list and you need to preserve the precise placement of blank lines.

Advanced users: the only equivalent to these in braille are the Begin Box Line and End Box Line found in the MegaDots Control-Insert L menu. MegaDots draws no parallels between boxes and lines in inkprint documents and the Begin and End Box Line commands. You can keep boxes and lines that were created with typed symbols *-=|, etc. To keep them, add a space and -vis after you type the document's name on the command line or in the F3 screen. From a practical point of view, in terms of braille production, you should avoid using this command.

Advanced users: To keep all emphasis in your document, add a space and -excess after you type the document's name on the command line or in the F3 screen. The emphasis that MegaDots strips out needs to be stripped out to make good braille. If you are not making braille, or you think MegaDots is removing important emphasis, you might want to preserve all emphasis.

Advanced users: tabs that are used for tables are always kept in the document, unless all table styles are disallowed. To do this, use the -notab option. Use this command if you have been importing documents with items which are not tables but are being mistaken for tables by MegaDots.

Advanced users: To keep the extra spaces, you must tell MegaDots that this is not a scanned document by adding a space and the -noscan option after you type the document's name on the command line or in the F3 screen.

Advanced users: to keep MegaDots from removing what it thinks is garbled text or graphics, add a space and -garble after you type the document's name on the command line or in the F3 screen. Use this command if you fear that MegaDots is accidentally deleting meaningful data.

Advanced users: to keep MegaDots from combining paragraphs, use either the -noscan option (no scanner cleanup), or the -scant option (text only scanner cleanup - see next item).

Advanced users: To prevent MegaDots from using this cleanup feature, use either the -noscan option (no scanner cleanup), or the -scanf option (format only scanner cleanup - see previous item).

Advanced users: to keep these in your document, use the -runner command. Use this command if you feel that useful text is being stripped off by MegaDots.

Advanced users: you cannot keep the hard page breaks unless you use the -same option (keeping the entire original format). You can remove the print page indicators by using the -noppi option.

Stage Two

In the second stage of importation, MegaDots does its best to assign appropriate styles for each section of the document. This version of MegaDots is designed to use very advanced guessing techniques to do the best job possible. In addition, MegaDots will attempt to use any similar style names listed in the original document.

Unfortunately, because MegaDots is not human and cannot understand context, this style automation sometimes makes mistakes. Without appropriate use of styles, the braille and print format of a document will suffer. This is where you come in. To ensure the best format for your document, you must check through the document and correct any style mistakes made by MegaDots.

How can I import a computer program?

If you have a file that is a computer program listing, you need to tell MegaDots to use the CBC (Computer Braille Code) translator for the entire document. Type -com after the file name when you import the file. All your blank lines will be preserved, and each paragraph will be marked as Computer style.

A Failsafe Method for Making Readable Braille

MegaDots has a feature which allows a document to be imported without styles. This generates braille with a readable format roughly similar to that of the original inkprint. Indents, runovers and the number of blank lines are slightly compressed. Braille created in this manner is just fine for personal use, but not for publication or correspondence. To use this feature, just add space -unsty after the document name when importing it.

If MegaDots Cannot Recognize the Document Type

It is possible for the automatic document recognition system in MegaDots to not work properly. If MegaDots does not recognize the correct format or won't load a document it should be able to, then you can manually set the type of file you are importing. Add space -? after the filename when importing. MegaDots will let you choose the file type from the list of available import file types (see the MegaDots Command Summary).

If you don't know the file type, or can't find it on the list, try picking Unknown File With Text. This is a general file type to use as a last chance. It works with a variety of file types, such as Pagemaker, some types of help screens, even computer program .EXE's. Basically, it strips out all of the garbage codes and keeps only the text. Its success may vary greatly from file to file, but can be a useful last resort for getting the text out of a file.
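The idea of stripping out the garbage codes and keeping only the text can be illustrated with a short Python sketch. It is not the method MegaDots uses, which is not documented here. It assumes that "text" means a run of four or more printable ASCII characters, and the name extract_text_runs is hypothetical.

```python
import re


def extract_text_runs(data, min_length=4):
    """Return the runs of printable ASCII found in raw file bytes.

    Assumption: a run is min_length or more consecutive characters
    in the range space through tilde. Everything else is discarded.
    """
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_length)
    return [run.decode("ascii") for run in pattern.findall(data)]


raw = b"\x00\x01Hello world\xff\xfe\x02ab\x00MegaDots\x00"
print(extract_text_runs(raw))
```

Short fragments such as "ab" are dropped because they fall below the minimum length. This is one reason the results of this kind of extraction vary so much from file to file.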

What Kind of File Did I Import into MegaDots?

What if you forget what kind of document was imported to create your MegaDots file? Once importation is over, press Alt-I. The file type of the original is listed at the top of the screen under "Source:". Press Escape to return to the Editor.

Advanced Use of the Document Importer

The road to automatic braille formatting has been described. There are more automatic features described below. Novice users may want to continue reading, at least to get an idea of what's possible.

Fine-Tuning with the Interpret Format Screen

A major new feature in MegaDots is the Interpret Format Screen. To use it, press Alt-I. This screen contains all the important information that the importer gathered about your document. The first section of the screen shows general information. The second section shows the heading levels used, and how many of each are in the document. The last section of the screen shows a list of the most common MegaDots styles, along with the number of times each occurs. The Alt-Up and Alt-Down arrow keys jump quickly between these major sections.

This screen is an excellent way to quickly check which styles MegaDots chose for your document. However, the real power of the Interpret Format Screen is that you can change the general options, disallow specific heading levels or styles, and re-import with those settings. You can even make style and/or hierarchy changes to specific paragraphs, or ask that all instances of one style be replaced with another.

Once you press F10, the document is re-imported and all of your requests are instituted. At this point, you may accept the re-imported document, or continue fine-tuning it. What follows is a very general breakdown of this screen. For detailed information on each field, use the F1 Help key on that field within MegaDots.

Except for the first three fields, almost every other field tells you whether or not something exists in your document, and how many. These fields give you the choice to Allow or Disallow that item (use the A or D key on that field). The three main exceptions are at the top of the screen in the section labeled "Document Information", which shows general information about your document.

Here's a little info on these three Document Information fields:

Source

If the document's text has not been changed since it was imported, "Source" shows the original document's type, such as WordPerfect 5.1. If the document was created within MegaDots, or if the text has been changed within MegaDots, this field reads "Source: MegaDots document". In this case, changing options within the top "Document Information" section of this screen may not yield satisfactory results. However, style changes in this screen will always keep.

If "Source" is not MegaDots, you can switch back and forth between the Editor and this screen as much as you like, and as long as you make no changes to the actual text (style and hierarchy changes are always okay) you can still come back to this screen and press F10 to re-import the original document.

Style selection

This field shows the general method MegaDots used to select styles for the current document. The following varieties exist:

Optical scanner cleanup

MegaDots tries to automatically determine if your document is scanned, but it can be mistaken. This field shows any type of special scanned document cleanup MegaDots performed:

If your document contains technical material or words in other languages, select "Format only" or "None", because the automatic text cleaner can incorrectly change meaningful words it doesn't know. Fortunately, you can see what words MegaDots changed by going into Show All Markup mode (Control-Z M A). Anywhere you see a Scan Clean symbol (a greater or equal sign), a word has been changed. You can search for these by pressing Control-F9 Sc <Enter>. Add a double backslash after this search to remove them. Removal is not necessary - the marks do not affect translation.

Other Document Information fields

The next nine fields in this screen are: Tabs, Print page indicators, Visual boxes and lines, Running headers & footers, Garbled text, Excess emphasis, Boldface emphasis, Italic emphasis and Underline emphasis. These fields show whether the special item they indicate existed in the original, and whether MegaDots allowed them to stay in the document, or "disallowed" them, removing them completely.

For example, if the field "Tabs" reads "None", there were no tabs in the original document. If it reads "34", then MegaDots kept some of the tabs. If it reads "Disallow 34", MegaDots has removed them. If you want to re-import the document, and remove all tabs, change this field from "34" to "Disallow 34" by pressing D on the Tabs field and then F10. The new document will contain neither tables nor tabs. Again, press the F1 Help key on any of these fields for more information.
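The three forms a field can take may be summarized in a small Python sketch. It assumes a field shows exactly one of the forms quoted above ("None", a number, or "Disallow" followed by a number). The function name read_field and the status words are made up for illustration.

```python
def read_field(value):
    """Return (count, status) for an Interpret Format field value.

    Assumes the three forms described in the text:
    "None", a bare number, or "Disallow" followed by a number.
    """
    value = value.strip()
    if value == "None":
        return 0, "not in original"
    if value.startswith("Disallow "):
        return int(value[len("Disallow "):]), "removed"
    return int(value), "kept"


for shown in ("None", "34", "Disallow 34"):
    print(shown, "->", read_field(shown))
```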

Remember that once a document's text has been changed in MegaDots, Interpret Format will not re-import from the original document. In this case, changing these options may not have the desired results.

One special case is the Print page indicators field. If they are Allowed, it will also list the range of print pages the indicators cover. Pressing F2 on this field allows you to see and change them. This is also available via the Alt-Z feature in the Editor.

Style fields

The rest of the screen shows style names and the number of paragraphs in the document marked with each. For example, "Outline: 16" indicates there are 16 Outline items (paragraphs in MegaDots parlance) within the document.

In any style field, you can press D to Disallow that style. MegaDots will give you a list of choices for what to change the style to. You can choose another specific style, such as List. In the previous example, this will change all 16 Outline paragraphs to List paragraphs. You can also select "Best MegaDots Guess" from the list. If you select this, MegaDots will decide what to do with each paragraph marked Outline on an item by item basis. These collective Interpret Format style changes can be done immediately after import, or any time during the editing process.

To have MegaDots interpret Unstyled paragraphs in your document and tag them with normal styles, Disallow the Unstyled styles, while selecting "Best MegaDots Guess". In this way, you can import a document without styles, clean it up, and then have MegaDots perform its automatic style tagging. This may be useful when working with optically scanned documents. MegaDots might do a better job tagging styles after the text is cleaned up.

To see a list of all Outline paragraphs, first move to the Outline field, either by arrowing or pressing F9 and typing "out" <Enter>. Once you are on the field, press F2. This list is very similar to the Editor's Control-J G document paragraph list (press F2 on the "Total paras" field for that).

Outline items that are together in the document are shown grouped together by large brackets on the left. To see the Outline items in the context of all paragraphs, press the space bar. To toggle back to only outlines, press space bar again. Shift-Number limits the display to items of a specific hierarchy level. Shift-0 returns it to displaying all hierarchy levels.

Style commands work in these lists, as do hierarchy commands. For example, to change one of the items to a Contents +2 item, type Alt-S "co" <Enter> Alt-2. These individual style and level changes can be done any time during import or editing.

If you want to jump to one of these paragraphs in the Editor, press Enter. Don't worry, your changes to the Interpret Format screen will still be there if you return to it by pressing Alt-I. Moving around your document this way is quite useful. For example, you can use this method to find the third list in a document. Also, typing letters in these paragraph lists performs an incremental search and locates any items beginning with what you type. You can also press F9 to do a general search for items (F9 works within any list).

Press F10 to accept style and hierarchy changes in this list and move back to the Interpret Format screen, or Escape to cancel any changes.

Setting Importer Preferences

All importer preferences exist under Preferences - File import. From this screen it is possible to set options for general documents, or documents with specific file extensions (for example, all .TXT documents).

To set the default options, select "Default". You can also make new settings for a specific file type, such as WordPerfect documents. To do this, type <Insert>, followed by the file extensions, (such as .WP .WP5 .WP6), followed by <Enter>. The import options screen is nearly identical to the Interpret Format screen. Styles, heading levels and import items can be Allowed or Disallowed. Source, Style selection and Optical scanner cleanup can be set.

There are two new options not in the Interpret Format screen: Auto style sheet and Auto report. Set "Auto style sheet? Yes" if you want MegaDots to decide what style sheet to use on a document by document basis. For North American users, this allows MegaDots to select Literary if there are no inkprint page indicators, and Textbk if there are. Set "Auto style sheet? No" if you always want to use the style sheet listed under Preferences - New document - Style sheet. Note to British users: all documents are usually in the BRITISH style sheet.
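The Auto style sheet choice for North American users can be written out as a short Python sketch. The function name pick_style_sheet and its parameters are hypothetical. The parameter preferred stands in for whatever is listed under Preferences - New document - Style sheet, and the sketch does not cover the BRITISH case.

```python
def pick_style_sheet(has_print_page_indicators, auto_style_sheet=True,
                     preferred="Literary"):
    """Return the style sheet name for a newly imported document.

    Follows the North American rule described in the text:
    Textbk when inkprint page indicators are present, Literary otherwise.
    With Auto style sheet set to No, the preferred sheet is always used.
    """
    if not auto_style_sheet:
        return preferred
    return "Textbk" if has_print_page_indicators else "Literary"


print(pick_style_sheet(True))    # Textbk
print(pick_style_sheet(False))   # Literary
```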

Make "Auto report? Yes" if you want the Interpret Format screen to automatically come up each time you import a document of this type. Even if you set it "No", Alt-I will still bring up Interpret Format manually.

Command Line Options

For each option in the File Import Preferences, there is an equivalent command line option to turn the option on when reading a particular document. Command line options are for advanced users who want complete control from the start. More than one option can be listed after the document name; each one consists of a space, a hyphen, and the option name. For example, to import a document as a letter and get a report, type mega filename -letter -report from the DOS prompt. These command line options also work with the F3 command inside MegaDots. Note that most command line options are the first three letters of the corresponding option in MegaDots. Many of them can be turned off by preceding those three letters with "no".

Here is the list of command line options:

Another trick is to use the File Import preferences for one file extension on a document with a different file extension. This command line option is hyphen, period, plus the file extension listed under Preferences - File Import. For example, mega PROGRAM.J -.C will import PROGRAM.J with the same preferences as are used with .C files. You may want to create a set of preferences for a made up file extension, like .SIM for simple documents. This way, you can always specify -.SIM when you want to use those preferences.
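If you import many documents from batch files or scripts, the pattern of space, hyphen, option name is easy to generate. The Python sketch below only builds the text of the command line; it does not run MegaDots. The function name import_command is hypothetical, and the options shown are the ones used as examples in this section.

```python
def import_command(document, *options):
    """Build a MegaDots command line.

    The result is the program name, the document name, and then each
    option written as a space, a hyphen, and the option name.
    """
    parts = ["mega", document]
    for option in options:
        # Accept "letter" or "-letter"; always emit a single hyphen.
        parts.append("-" + option.lstrip("-"))
    return " ".join(parts)


print(import_command("filename", "letter", "report"))  # mega filename -letter -report
print(import_command("PROGRAM.J", ".C"))               # mega PROGRAM.J -.C
```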

Work in Another Word Processor and Get Exactly What You Want

MegaDots has a feature which allows a document to be imported without styles, so that the format is exactly the same as the original inkprint. Indents, runovers and blank lines remain exactly as they are in the original document. You may wish to do all your braille formatting right within your word processor, and then have MegaDots merely translate it. The use of this feature can be specified in two ways. Choose "Spacing same" under the "Style selection:" prompt, or just add space -same after the document name when importing it.

Another way of getting exactly what you want while working in another word processor is by using MegaDots markup. Anything beginning with #[ and ending with ]# is interpreted as normal MegaDots markup. For example, #[TO]#bueno#[\TO]# marks the word "bueno" in grade one. Play with MegaDots in show markup to see all the MegaDots markup codes.
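When the source document is produced by a program, the markup can be added automatically. The Python sketch below wraps text in the #[ ... ]# delimiters and lists the codes found in a string. The only code taken from this manual is TO. The sketch assumes that a code is closed by the same code with a backslash in front, as in the example above, and the function names are hypothetical.

```python
import re

# Anything between #[ and ]# is treated as MegaDots markup.
MARKUP = re.compile(r"#\[(.*?)\]#")


def wrap_markup(text, code):
    """Surround text with an opening code and its backslash closing code."""
    return "#[" + code + "]#" + text + "#[\\" + code + "]#"


def list_markup(text):
    """Return every markup code found between #[ and ]# in text."""
    return MARKUP.findall(text)


marked = wrap_markup("bueno", "TO")
print(marked)               # #[TO]#bueno#[\TO]#
print(list_markup(marked))
```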

MegaDots' styles can be specified from any document using pound sign codes at the start of a paragraph. For most styles, just use a pound sign followed by the first three letters of the style name. If there is a hierarchy level, add the number directly after the code, without a space. For headings, use #h followed by the heading level (1-6). When writing a braille file to be imported, such as on a Braille 'n Speak, use two pound signs before the style command.
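The general rule for pound sign codes can be sketched in Python. This covers only the rule stated above. It does not handle the styles that have short abbreviations or special codes, and writing the three letters in lower case is an assumption. The function names are hypothetical, and the sketch produces only the code itself, not the paragraph that follows it.

```python
def style_code(style, level=None, braille_file=False):
    """Return the pound sign code for a style under the general rule.

    General rule: pound sign, first three letters of the style name,
    then the hierarchy level (if any) with no space.
    Lower case is an assumption. Braille files use two pound signs.
    """
    prefix = "##" if braille_file else "#"
    code = prefix + style[:3].lower()
    if level is not None:
        code += str(level)
    return code


def heading_code(level, braille_file=False):
    """Return #h plus the heading level, which must be 1 through 6."""
    if not 1 <= level <= 6:
        raise ValueError("heading level must be 1-6")
    prefix = "##" if braille_file else "#"
    return prefix + "h" + str(level)


print(heading_code(1))                     # #h1
print(heading_code(3, braille_file=True))  # ##h3
print(style_code("Outline", 2))            # #out2
```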

The following common styles have short abbreviations:

The following two word styles need special codes to avoid overlap:

Issues for Importing Special Documents

HTML Importation

MegaDots should import and handle HTML files from the World Wide Web. To create a link in a MegaDots document, highlight the text and type Control-F J. When you export to HTML, you will find the link you created. You can create web sites with MegaDots.

You can modify how MegaDots imports HTML files by modifying a text file called HTML.MSG. See Chapter 14 for more details on how to do this.

Normally MegaDots cleans up an HTML file for braille during importation. This often means deleting HTML commands that the importer does not know about (i.e. those not listed in HTML.MSG). However, it is possible to import a more exact version of an HTML file with all the commands still there. Set Preferences - File import - .htm - Excess emphasis to Allow. Any HTML commands not converted into MegaDots commands will be visible in show markup mode inside hidden markup (Eh).

File Formats Used in the Blindness and Transcriber Community

Braille Ready Files

Braille ready files (PokaDot and .BRF files) are files directly encoded for braille output. The idea is to just directly copy the file to an embosser to get braille output.

There are two ways of importing braille ready files into MegaDots: interpreted and non-interpreted. An interpreted file means that MegaDots attempts to build a natural MegaDots file, which can be back translated, edited, re-translated, and expressed in a variety of translation and format modes.

A non-interpreted file is one that rigidly contains the layout of the source file. There is no opportunity to make more than a microscopic change to these files. The idea is to import a file expressly to drive an embosser with these exact characters.

To import a braille ready file in interpreted mode, just import it.

To import a braille ready file in non-interpreted mode, use -spc -? at the command line. For file format, choose "Spacing same braille". Also set the style sheet to NONUMS to suppress page numbering. If you do this, the file can be brailled, but it cannot be edited without messing up the format. Any editing throws off all page numbers after that point.

Microbraille files need to be converted into standard braille ready files before they can be imported as "Spacing Same Braille". See microcon in the F12 Reference Manual for the details.

Macintosh Document File Conversions

Because MegaDots does not run on the Macintosh, you must copy your Mac document to your PC. If you have a computer network with Macintoshes and PC's, moving a Mac file to a PC is very simple. Otherwise, you must use Apple File Exchange:

To export a document to the Mac, reverse this sequence. You must also tell the Macintosh what kind of file it is. For example, let's say you want a document called LETTER to be a Word document. To do this, enter Word directly by clicking on the program icon. Once the program is launched, open the LETTER document, and then re-save it.

Apple II File Conversions

See Supplement 3.