Unicode Transformation Format Characters (The UTF-8 character sets)

Note that your ability to see the characters in these charts depends on two things. The first is UTF-8 support at the operating system level. Currently the only operating systems of which I am aware that ship with UTF-8 support are MacOS, Unix/Linux, and Windows 2000. The second, given UTF-8 support at the operating system level, is the presence of at least one UTF-8 compliant font.

Unfortunately, I know of no fonts which give complete UTF-8 support. The only font with which I have worked directly which comes even close is "Arial Unicode MS" (a Micro$oft creation), in the world of Windows. Arial Unicode, which weighs in at a whopping 27 megabytes, contains some proprietary characters in the "C1 Controls" section of the Latin 1 Supplement. For many of the character subsets its coverage is incomplete (such as IPA, Spacing Modifiers, Diacritics, Arabic, Tibetan, etc.), and for others it contains no characters at all (such as Sinhala, Myanmar, Ethiopic, Cherokee, etc.). Presumably there are at least partially UTF-8 compliant fonts available for all operating systems which support it, but, at the moment, I don't know what they are.

If your computer doesn't support UTF-8, these code charts will be pretty meaningless to you. At best, you'll see little open rectangles or question marks in place of the characters. Even if your computer does support UTF-8, there's a good chance you won't be able to see all of the characters on all of the charts, and, especially if you're using Micro$oft, you will see some characters which aren't the standard ones described. Presumably things will get better as time goes on and the support is built in more and more commonly. At the moment, it's pretty hit-or-miss.

More information on Unicode support and fonts can be found at Roman Czyborra's Index of /unifont, Alan Wood's page of Fonts that support Unicode, The Unicode Sliderule, and at Unicode's Font Acknowledgements. Good luck!
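
If you want to test your own system against one of the ranges listed below, a few lines of Python will do it. This is only a minimal sketch, not part of the charts themselves: the function name `chart_row` is my own invention, and it assumes Python 3 and a terminal that is set up for UTF-8 output. It turns each decimal value in a range into the character itself, its U+ hexadecimal label, and the number of bytes the character takes up when encoded as UTF-8.

```python
def chart_row(start, end):
    """Describe every code point from start to end (decimal, inclusive).

    Returns a list of (decimal, "U+XXXX" label, character, UTF-8 byte count).
    Surrogate code points (55296-57343) cannot be encoded as UTF-8, so
    their byte count is reported as None instead of raising an error.
    """
    rows = []
    for cp in range(start, end + 1):
        ch = chr(cp)
        if 0xD800 <= cp <= 0xDFFF:
            nbytes = None  # surrogates are not valid in UTF-8
        else:
            nbytes = len(ch.encode("utf-8"))
        rows.append((cp, f"U+{cp:04X}", ch, nbytes))
    return rows
```

For example, `print(" ".join(row[2] for row in chart_row(880, 1023)))` writes out the whole Greek range. Whatever shows up as an open rectangle or a question mark is a character your fonts don't cover. The byte counts also show why the ranges matter for file size: values 0 to 127 take one byte each, 128 to 2047 take two, and everything else up to 65535 takes three.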
RANGE (decimal) | Character Subset
0 - 127 | Basic Latin
128 - 255 | Latin 1 Supplement (ISO 8859-1)
256 - 383 | Latin Extended A
384 - 591 | Latin Extended B
592 - 687 | International Phonetic Alphabet Extensions
688 - 767 | Spacing Modifier Characters
768 - 879 | Combining Diacritical Marks
880 - 1023 | Greek (ISO 8859-7)
1024 - 1279 | Cyrillic
1328 - 1423 | Armenian
1424 - 1535 | Hebrew (ISO 8859-8)
1536 - 1791 | Arabic (ISO 8859-6)
1792 - 1871 | Syriac
1920 - 1983 | Thaana
2304 - 2431 | Devanagari (Based on ISCII 1988)
2432 - 2559 | Bengali (Based on ISCII 1988)
2560 - 2687 | Gurmukhi (Based on ISCII 1988)
2688 - 2815 | Gujarati (Based on ISCII 1988)
2816 - 2943 | Oriya (Based on ISCII 1988)
2944 - 3071 | Tamil (Based on ISCII 1988)
3072 - 3199 | Telugu (Based on ISCII 1988)
3200 - 3327 | Kannada (Based on ISCII 1988)
3328 - 3455 | Malayalam (Based on ISCII 1988)
3456 - 3583 | Sinhala
3584 - 3711 | Thai (Based on TIS 620-2533)
3712 - 3839 | Lao (Based on TIS 620-2529)
3840 - 4095 | Tibetan
4096 - 4255 | Myanmar
4256 - 4351 | Georgian
4352 - 4607 | Hangul Jamo
4608 - 4991 | Ethiopic
5024 - 5119 | Cherokee
5120 - 5759 | Unified Canadian Aboriginal Syllabic
5760 - 5791 | Ogham
5792 - 5887 | Runic
6016 - 6143 | Khmer
6144 - 6319 | Mongolian
7680 - 7935 | Latin Extended Additional
7936 - 8191 | Greek Extended
8192 - 8303 | General Punctuation
8304 - 8351 | Superscripts & Subscripts
8352 - 8399 | Currency Symbols
8400 - 8447 | Combining Marks for Symbols
8448 - 8527 | Letter-like Symbols
8528 - 8591 | Number Forms
8592 - 8703 | Arrows
8704 - 8959 | Mathematical Operators
8960 - 9125 | Miscellaneous Technical
9216 - 9279 | Control Pictures
9280 - 9311 | Optical Character Recognition
9312 - 9471 | Enclosed Alphanumerics
9472 - 9599 | Box Drawing
9600 - 9631 | Block Elements
9632 - 9727 | Geometric Shapes
9728 - 9983 | Miscellaneous Symbols
9984 - 10175 | Dingbats
10240 - 10495 | Braille Patterns
11904 - 12031 | CJK Radicals Supplement
12032 - 12255 | Kangxi Radicals
12272 - 12287 | Ideographic Description Characters
12288 - 12351 | CJK Symbols and Punctuation
12352 - 12447 | Hiragana
12448 - 12543 | Katakana
12544 - 12591 | Bopomofo
12592 - 12687 | Hangul Compatibility Jamo
12688 - 12703 | Kanbun
12704 - 12735 | Bopomofo Extended
12800 - 13055 | Enclosed CJK Letters and Months
13056 - 13311 | CJK Compatibility
13312 - 19903 | CJK Unified Ideographs Extension A
19968 - 40879 | CJK Unified Ideographs
40960 - 42127 | Yi Syllables
42128 - 42191 | Yi Radicals
44032 - 55215 | Hangul Syllables
Undefined Character Subsets
63744 - 64255 | CJK Compatibility Ideographs
64256 - 64335 | Alphabetic Presentation Forms
64336 - 65023 | Arabic Presentation Forms A
65056 - 65071 | Combining Half-Marks
65072 - 65103 | CJK Compatibility Forms
65104 - 65135 | Small Form Variants
65136 - 65279 | Arabic Presentation Forms B
65280 - 65519 | Halfwidth and Fullwidth Forms
65520 - 65535 | Specials
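
The chart above can also be read in the other direction: given a character, which subset does it belong to? The sketch below is one way to do that lookup in Python. It is an illustration under stated assumptions rather than a complete tool: the names `SUBSETS`, `STARTS` and `subset_of` are mine, and the list holds only a hand-copied excerpt of the chart, so anything outside those few rows comes back as `None`.

```python
from bisect import bisect_right

# Hand-copied excerpt of the chart above: (first, last, subset name),
# kept sorted by the first value so that a binary search works.
SUBSETS = [
    (0, 127, "Basic Latin"),
    (128, 255, "Latin 1 Supplement (ISO 8859-1)"),
    (880, 1023, "Greek (ISO 8859-7)"),
    (1024, 1279, "Cyrillic"),
    (2304, 2431, "Devanagari (Based on ISCII 1988)"),
    (12352, 12447, "Hiragana"),
    (44032, 55215, "Hangul Syllables"),
]
STARTS = [first for first, _, _ in SUBSETS]


def subset_of(ch):
    """Return the subset name for a single character, or None if the
    character's decimal value is not covered by the excerpt."""
    cp = ord(ch)
    # Find the last row whose first value is <= cp.
    i = bisect_right(STARTS, cp) - 1
    if i >= 0:
        first, last, name = SUBSETS[i]
        if cp <= last:
            return name
    return None
```

With this excerpt, `subset_of("A")` gives "Basic Latin", and every letter in the Devanagari line that closes this page lands in the 2304 - 2431 row (the spaces between the words are Basic Latin). Extending the list to the full chart is just a matter of copying in the remaining rows in order.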
ॐ नमः शिवाय