Comprehensive List of Characters

Thanks to the advent of Unicode, and to extensive development work by Duxbury Systems, there is a long list of characters supported by DBT. For the sake of this discussion, we divide characters into four groups:

  1. The most common characters needed to produce literary text (generally found on your computer keyboard).
  2. Characters needed to produce text that may not be on your keyboard: accented letters, or non-Roman scripts such as Cyrillic, Arabic, Tamil, or Korean, etc.
  3. Characters needed to produce mathematics or technical material.
  4. Other characters that may be needed for their graphical content.

For virtually all users, creating text is not difficult. Using Windows Region and Language Settings (especially the Keyboard Settings), you can have Windows - and especially Microsoft Word - use your preferences for entering text. It is very easy to enter text, save it as a Microsoft Word file, and then import the completed Word file into Duxbury DBT.

What if you do not want to change your keyboard system but need to enter a small number of unusual characters into your text? You can do this in either Microsoft Word or in DBT.

When handling technical characters, be aware that you may want to use a mathematics editing program such as MathType to help you. Duxbury DBT can import files created by MathType.

We should mention that while Unicode is used throughout the world, DBT (which predates Unicode!) uses its own internal system for enumerating unusual characters. This system is called DUSCI. If you want to enter a special character into Microsoft Word, you need to know the Unicode number. If you want to enter a special character directly into DBT, you need the DUSCI number.

Entering Raw Unicode into Microsoft Word

In Word, you can enter a character that is not on your keyboard by typing its 4-digit hexadecimal code (its Unicode value) and then immediately pressing Alt+X. Word replaces the 4 digits with the actual Unicode character. If you repeat the command at the same position, the effect is reversed. So, this command can be used both to input a Unicode character and to learn the Unicode number of a character already in the text.
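The relationship between a character and its hexadecimal Unicode value can also be checked outside Word. The following is a minimal Python sketch of that two-way conversion; the function names `code_to_char` and `char_to_code` are illustrative only and are not part of Word or DBT.

```python
def code_to_char(hex_code: str) -> str:
    """Return the character for a hexadecimal Unicode value such as '0906'."""
    return chr(int(hex_code, 16))


def char_to_code(ch: str) -> str:
    """Return the hexadecimal Unicode value of one character, padded to 4 digits."""
    return format(ord(ch), "04X")


# Round trip for the Devanagari letter AA (Unicode 0906)
letter = code_to_char("0906")
print(letter, char_to_code(letter))
```

This mirrors what Alt+X does in each direction: digits to character, and character back to digits.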

Entering Raw DUSCI into Duxbury DBT

In DBT, to enter a special character, press Ctrl+] (that is, hold down the Ctrl key and press the right bracket key, "]"). In the dialog box which appears, enter the DUSCI 4-digit code. Like Unicode, DUSCI is written in hexadecimal digits. So, if you want to put the Devanagari "AA" character into your document, its DUSCI code is D+C036. Press Ctrl+] for the dialog box and enter C036. To enter this same character into Microsoft Word, you would use the Unicode number: 0906.

Clipboarding into DBT Does Not Work for Special Characters

You should never cut and paste special characters into DBT because this action loses the benefit of the character conversion done during file import. If you need to clipboard some material, clipboard it into Microsoft Word, and then import the Word file into DBT.

Chinese - and other Han - Characters

DBT does not display traditional Chinese ideogram characters. Instead, DBT displays an appropriate substitute alphabet based on the language: Mandarin, Cantonese, Japanese, or Korean - the "Han" group of language scripts. Normally, the language is selected automatically just by selecting the DBT template for importing the file. If necessary, the automatic selection can be overridden by forcing the language choice in the Global: Import Options dialog, but in almost all cases, letting the template select the language is what you want.

These are the choices for how Han characters are imported into DBT:

Language DBT Characters
Mandarin (mainland) Pinyin Romanization with accent marks for the tones
Mandarin (Taiwan) Zhuyin Romanization
Cantonese Romanization with superscript numbers for the tones
Japanese Unicode U+30xx characters
Korean Unicode U+11xx characters

Korean Hangul Characters

Hangul often compacts 2 or 3 characters into a single symbol. When DBT imports a file, the process is reversed: DBT breaks a single Hangul character into its component parts. In technical terms, all Hangul characters in the range from U+AC00 through U+D7AF are redirected into Hangul Jamo characters in the U+11xx range. DBT uses a mono-spaced font to display those characters. The result can be difficult to read and is certainly jarring to those who are accustomed to reading conventional inkprint Hangul. In this case, it is best to do all editing in Microsoft Word, using DBT as the translation engine and for output.
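To illustrate what breaking a Hangul character into its component parts means, here is a minimal Python sketch of the standard Unicode arithmetic for decomposing a precomposed syllable into Hangul Jamo in the U+11xx range. It is an assumption that DBT's import behaves equivalently; the function name `decompose_hangul` is hypothetical and is not a DBT feature.

```python
import unicodedata

# Constants from the Unicode Standard's Hangul syllable decomposition
S_BASE, L_BASE, V_BASE, T_BASE = 0xAC00, 0x1100, 0x1161, 0x11A7
V_COUNT, T_COUNT = 21, 28
S_COUNT = 19 * V_COUNT * T_COUNT  # 11172 precomposed syllables


def decompose_hangul(ch: str) -> str:
    """Split one precomposed Hangul syllable into 2 or 3 Jamo characters."""
    s_index = ord(ch) - S_BASE
    if not 0 <= s_index < S_COUNT:
        return ch  # not a precomposed syllable: leave unchanged
    lead = L_BASE + s_index // (V_COUNT * T_COUNT)
    vowel = V_BASE + (s_index % (V_COUNT * T_COUNT)) // T_COUNT
    tail_index = s_index % T_COUNT
    jamo = chr(lead) + chr(vowel)
    if tail_index:  # a trailing consonant is optional
        jamo += chr(T_BASE + tail_index)
    return jamo


syllable = "\uD55C"  # one precomposed syllable
parts = decompose_hangul(syllable)
print([f"U+{ord(c):04X}" for c in parts])  # ['U+1112', 'U+1161', 'U+11AB']

# Cross-check against Python's built-in canonical decomposition
assert parts == unicodedata.normalize("NFD", syllable)
```

The single syllable becomes three separate Jamo characters, which is why the imported text looks so different from conventional inkprint Hangul.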

Arabic and Hebrew Characters

Both Arabic and Hebrew inkprint are written from right to left. DBT displays the inkprint Arabic and Hebrew text from right to left as well. However, the DBT editing cursor does not (as yet) accommodate the right to left flow within a line.

For these scripts, it is best to do all editing in Microsoft Word, using DBT solely as the translation engine and output manager. In this case, if you need to, you can clipboard whole lines from Word into DBT as a way of making changes within a line.

Table of Unicode Ranges Supported by DUSCI Characters

Note for Screen Reader Users: We are aware that lists of characters can be tedious to review. Likewise, screen reader users can experience some unexpected results when scanning the two tables below. These tables list ranges of Unicode characters.

In the first table, the first column identifies the character set, the second column gives the start of the Unicode range for that set, and the third column indicates the use of that character set. The possible uses are: for a language, for mathematics, for symbols, or for the International Phonetic Alphabet. The entries in the second column are also hyperlinks to tables, each showing the full Unicode range for the character set and the DUSCI equivalent for each Unicode character.

Script Unicode and Link Type
Latin U+00xx Language
Latin Extended U+01xx Language
Latin Extended U+02xx Language
Greek U+03xx Language
Cyrillic U+04xx Language
Armenian and Hebrew U+05xx Language
Arabic U+06xx Language
Hindi and Bengali U+09xx Language
Gurmukhi and Gujarati U+0Axx Language
Oriya and Tamil U+0Bxx Language
Telugu and Kannada U+0Cxx Language
Malayalam and Sinhala U+0Dxx Language
Thai and Lao U+0Exx Language
Tibetan U+0Fxx Language
Myanmar and Georgian U+10xx Language
Korean U+11xx Language
Ethiopic U+12xx Language
Ethiopic U+13xx Language
Khmer (Cambodian) U+17xx Language
IPA 1 U+1Dxx IPA
IPA 2 U+1Exx IPA
Misc Symbols U+20xx Symbols
Arrows, etc. U+21xx Math
Math Operators U+22xx Math
Misc Technical U+23xx Math
Box Drawing U+25xx Math
Dingbats U+27xx Symbols
Math Arrows U+29xx Math
Math Operators U+2Axx Math
Japanese U+30xx Language

Table of All Unicode Ranges U+0000-U+FFFF

The second table lists all of the 4-digit Unicode ranges.

The first column gives the start and end of each range. The second column identifies the character set or use of that range. Most of these entries are hyperlinks. Note that these are external hyperlinks (i.e., to pages on the World Wide Web, not to pages in DBT Help).

If the last column is blank, then there is no support for these characters in Duxbury DBT. If the last column is "Word™ import", it means that these characters are supported by conversion into other Duxbury supported characters during import. If the last column is "DUSCI supported" then that Unicode range is in the previous table.

Unicode Range Name and Wikipedia Link Support Level
U+0000-007F Basic Latin DUSCI supported
U+0080-00FF Latin-1 Supplement DUSCI supported
U+0100-017F Latin Extended-A DUSCI supported
U+0180-024F Latin Extended-B DUSCI supported
U+0250-02AF IPA Extensions DUSCI supported
U+02B0-02FF Spacing Modifier Letters DUSCI supported
U+0300-036F Combining Diacritical Marks DUSCI supported
U+0370-03FF Greek and Coptic DUSCI supported
U+0400-04FF Cyrillic DUSCI supported
U+0500-052F Cyrillic Supplement DUSCI supported
U+0530-058F Armenian DUSCI supported
U+0590-05FF Hebrew DUSCI supported
U+0600-06FF Arabic DUSCI supported
U+0700-074F Syriac  
U+0750-077F Arabic Supplement  
U+0780-07BF Thaana  
U+07C0-07FF N'Ko  
U+0800-083F Samaritan  
U+0900-097F Devanagari DUSCI supported
U+0980-09FF Bengali DUSCI supported
U+0A00-0A7F Gurmukhi DUSCI supported
U+0A80-0AFF Gujarati DUSCI supported
U+0B00-0B7F Oriya DUSCI supported
U+0B80-0BFF Tamil DUSCI supported
U+0C00-0C7F Telugu DUSCI supported
U+0C80-0CFF Kannada DUSCI supported
U+0D00-0D7F Malayalam DUSCI supported
U+0D80-0DFF Sinhala DUSCI supported
U+0E00-0E7F Thai DUSCI supported
U+0E80-0EFF Lao DUSCI supported
U+0F00-0FFF Tibetan DUSCI supported
U+1000-109F Myanmar DUSCI supported
U+10A0-10FF Georgian DUSCI supported
U+1100-11FF Hangul Jamo DUSCI supported
U+1200-137F Ethiopic (Ge'ez) DUSCI supported
U+1380-139F Ethiopic Supplement DUSCI supported
U+13A0-13FF Cherokee  
U+1400-167F Unified Canadian Aboriginal Syllabics  
U+1680-169F Ogham  
U+16A0-16FF Runic  
U+1700-171F Tagalog (Baybayin)  
U+1720-173F Hanunó'o  
U+1740-175F Buhid  
U+1760-177F Tagbanwa  
U+1780-17FF Khmer DUSCI supported
U+1800-18AF Mongolian  
U+18B0-18FF Extended Canadian Aboriginal syllabics  
U+1900-194F Limbu Word™ import
U+1950-197F Tai Le  
U+1980-19DF New Tai Lue  
U+19E0-19FF Khmer Symbols  
U+1A00-1A1F Buginese (Lontara)  
U+1A20-1AAF Tai Tham  
U+1B00-1B7F Balinese  
U+1B80-1BBF Sundanese  
U+1C00-1C4F Lepcha  
U+1C50-1C7F Ol Chiki  
U+1CD0-1CFF Vedic Extensions  
U+1D00-1D7F Phonetic Extensions Word™ import
U+1D80-1DBF More Phonetic Extensions Word™ import
U+1E00-1EFF Latin Extended Additional DUSCI supported
U+1F00-1FFF Greek Extended Word™ import
U+2000-206F General Punctuation DUSCI supported
U+2070-209F Superscripts and Subscripts DUSCI supported
U+20A0-20CF Currency Symbols DUSCI supported
U+20D0-20FF Combining Diacritical Marks for Symbols  
U+2100-214F Letterlike Symbols DUSCI supported
U+2150-218F Number Forms DUSCI supported
U+2190-21FF Arrows DUSCI supported
U+2200-22FF Mathematical Operators DUSCI supported
U+2300-23FF Miscellaneous Technical DUSCI supported
U+2400-243F Control Pictures Word™ import
U+2440-245F Optical Character Recognition Word™ import
U+2460-24FF Enclosed Alphanumerics Word™ import
U+2500-257F Box Drawing DUSCI supported
U+2580-259F Block Elements DUSCI supported
U+25A0-25FF Geometric Shapes DUSCI supported
U+2600-26FF Miscellaneous Symbols Word™ import
U+2700-27BF Dingbats DUSCI supported
U+27C0-27EF Miscellaneous Mathematical Symbols-A DUSCI supported
U+27F0-27FF Supplemental Arrows-A DUSCI supported
U+2800-28FF Braille Patterns Word™ import
U+2900-297F Supplemental Arrows-B DUSCI supported
U+2980-29FF Miscellaneous Mathematical Symbols-B DUSCI supported
U+2A00-2AFF Supplemental Mathematical Operators DUSCI supported
U+2B00-2BFF Miscellaneous Symbols and Arrows  
U+2C00-2C5F Glagolitic  
U+2C60-2C7F Latin Extended-C  
U+2C80-2CFF Coptic  
U+2D00-2D2F Georgian Supplement  
U+2D30-2D7F Tifinagh  
U+2D80-2DDF Ethiopic Extended  
U+2DE0-2DFF Cyrillic Extended  
U+2E00-2E7F Supplemental Punctuation  
U+2E80-2EFF CJK Radicals Supplement  
U+2F00-2FDF Kangxi Radicals  
U+2FF0-2FFF Ideographic Description Characters  
U+3000-303F CJK Symbols and Punctuation  
U+3040-309F Hiragana DUSCI supported
U+30A0-30FF Katakana DUSCI supported
U+3100-312F Bopomofo DUSCI supported
U+3130-318F Hangul Compatibility Jamo Word™ import
U+3190-319F Kanbun  
U+31A0-31BF Bopomofo Extended  
U+31C0-31EF CJK Strokes  
U+31F0-31FF Katakana Phonetic Extensions  
U+3200-32FF Enclosed CJK Letters and Months  
U+3300-33FF CJK Compatibility  
U+3400-4DBF CJK Unified Ideographs Extension A Word™ import
U+4DC0-4DFF Yijing Hexagram Symbols  
U+4E00-9FFF CJK Unified Ideographs Word™ import
U+A000-A48F Yi Syllables  
U+A490-A4CF Yi Radicals  
U+A4D0-A4FF Lisu (Fraser alphabet)  
U+A500-A59F Vai  
U+A6A0-A6FF Bamum  
U+A700-A71F Modifier Tone Letters  
U+A720-A7FF Latin Extended-D  
U+A800-A82F Syloti Nagri Word™ import
U+A830-A83F Common Indic Number Forms  
U+A840-A87F Phags-pa  
U+A880-A8DF Saurashtra Word™ import
U+A8E0-A8FF Devanagari Extended  
U+A900-A92F Kayah Li  
U+A930-A95F Rejang  
U+A960-A97F Hangul Extended  
U+A980-A9DF Javanese  
U+AA00-AA5F Cham  
U+AA60-AA7F Myanmar Extended  
U+AA80-AADF Tai Viet  
U+ABC0-ABFF Meitei Mayek  
U+AC00-D7AF Hangul Syllables Word™ import
U+D800-DB7F High Surrogates  
U+DB80-DBFF High Private Use Surrogates  
U+DC00-DFFF Low Surrogates  
U+E000-F8FF Private Use Area  
U+F900-FAFF CJK Compatibility Ideographs  
U+FB00-FB4F Alphabetic Presentation Forms Word™ import
U+FB50-FDFF Arabic Presentation Forms-A Word™ import
U+FE00-FE0F Variation Selectors  
U+FE10-FE1F Vertical Forms  
U+FE20-FE2F Combining Half Marks  
U+FE30-FE4F CJK Compatibility Forms  
U+FE50-FE6F Small Form Variants Word™ import
U+FE70-FEFF Arabic Presentation Forms-B Word™ import
U+FF00-FFEF Halfwidth and Fullwidth Forms Word™ import
U+FFF0-FFFF Specials
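
The table above can be used as a lookup: given a character, find the range that contains it and read off the support level. The following Python sketch shows one way to do that, using only a small excerpt of the rows. The names `RANGES` and `support_level` are illustrative assumptions, and an empty support string stands for a blank last column (no support in DBT).

```python
from bisect import bisect_right

# (start, end, name, support level): a small excerpt of the table, sorted by start.
# "Word import" stands for the table's "Word(TM) import" entry.
RANGES = [
    (0x0000, 0x007F, "Basic Latin", "DUSCI supported"),
    (0x0400, 0x04FF, "Cyrillic", "DUSCI supported"),
    (0x0700, 0x074F, "Syriac", ""),
    (0x0900, 0x097F, "Devanagari", "DUSCI supported"),
    (0x4E00, 0x9FFF, "CJK Unified Ideographs", "Word import"),
    (0xAC00, 0xD7AF, "Hangul Syllables", "Word import"),
]
STARTS = [row[0] for row in RANGES]


def support_level(ch: str):
    """Return (range name, support level) for one character, or None if not listed."""
    cp = ord(ch)
    i = bisect_right(STARTS, cp) - 1  # last range starting at or before cp
    if i >= 0 and cp <= RANGES[i][1]:
        return RANGES[i][2], RANGES[i][3]
    return None  # falls outside the excerpt above


print(support_level("\u0906"))  # ('Devanagari', 'DUSCI supported')
print(support_level("\u4E2D"))  # ('CJK Unified Ideographs', 'Word import')
```

A complete version would list every row of the table; the excerpt is kept short here so that the lookup logic stays easy to follow.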