The Text Encoding Converter now uses the decompositions defined in Unicode 3.2. The changes are limited to characters in Greek, Thai, Gurmukhi, and Arabic/Farsi. This change affects conversion of characters between Unicode and the Mac encodings for these scripts.
MacThai
xD3=u0E33 for composed Unicode; it now maps to u0E33 for decomposed Unicode too, instead of to uF860+u0E4D+u0E32 (the old mapping is loosely mapped back to xD3)
MacGurmukhi
x91=u0A5C for composed Unicode; it now maps to u0A5C for decomposed Unicode too, instead of to uF860+u0A21+u0A3C (the old mapping is loosely mapped back to x91)
xD5 is now always (composed and decomposed) mapped to uF860+u0A38+u0A3C instead of u0A36, since the latter is in CompositionExclusions.txt (the old mapping is loosely mapped back)
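The Thai and Gurmukhi behavior described above follows from Unicode's canonical decomposition data, so it can be checked without the TEC. The following is a minimal sketch, assuming Python's standard `unicodedata` module (whose `ucd_3_2_0` object exposes the Unicode 3.2 database) as a stand-in for the converter's tables. It does not exercise the TEC itself, and the helper `hexes` is hypothetical, added only for display.

```python
import unicodedata

# Unicode 3.2 view of the character database, matching the version named above.
ucd = unicodedata.ucd_3_2_0

def hexes(text):
    """Format a string as 'uXXXX+uXXXX', the notation used in these notes."""
    return "+".join("u%04X" % ord(ch) for ch in text)

# u0E33 (Thai) and u0A5C (Gurmukhi) have no canonical decomposition,
# so NFD leaves them unchanged.
thai_nfd = ucd.normalize("NFD", "\u0E33")
rra_nfd = ucd.normalize("NFD", "\u0A5C")

# u0A36 decomposes canonically to u0A38+u0A3C. Because it is listed in
# CompositionExclusions.txt, NFC does not recompose it.
sha_nfd = ucd.normalize("NFD", "\u0A36")
sha_nfc = ucd.normalize("NFC", "\u0A36")

print(hexes(thai_nfd), hexes(rra_nfd), hexes(sha_nfd), hexes(sha_nfc))
```

Note that the uF860 prefix in the TEC mappings is an Apple transcoding hint and is not produced by Unicode normalization, so it does not appear in this sketch.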
MacGreek
For mapping to decomposed Unicode, all of the decompositions that formerly used u030D now use u0301. The affected characters (and their mappings for composed Unicode) are:
x87=u0385, xC0=u03AC, xCD=u0386, xCE=u0388, xD7=u0389, xD8=u038A, xD9=u038C, xDA=u038E, xDB=u03AD, xDC=u03AE, xDD=u03AF, xDE=u03CC, xDF=u038F, xE0=u03CD, xF1=u03CE, xFD=u0390, xFE=u03B0 (the old mappings are loosely mapped back)
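The switch from u030D to u0301 matches the canonical decompositions of these Greek characters. As a rough check, assuming Python's `unicodedata.ucd_3_2_0` object (the Unicode 3.2 database) in place of the TEC's own tables, the sketch below decomposes each composed code point from the list above and looks for the two combining marks. The names `GREEK_COMPOSED` and `decomposition` are hypothetical.

```python
import unicodedata

ucd = unicodedata.ucd_3_2_0  # Unicode 3.2 database

# Composed code points from the MacGreek list above, in the same order.
GREEK_COMPOSED = [
    0x0385, 0x03AC, 0x0386, 0x0388, 0x0389, 0x038A, 0x038C, 0x038E,
    0x03AD, 0x03AE, 0x03AF, 0x03CC, 0x038F, 0x03CD, 0x03CE, 0x0390, 0x03B0,
]

def decomposition(code_point):
    """Return the canonical (NFD) decomposition of one code point."""
    return ucd.normalize("NFD", chr(code_point))

# Every decomposition should contain u0301 (combining acute accent)
# and none should contain u030D (combining vertical line above).
uses_acute = all("\u0301" in decomposition(cp) for cp in GREEK_COMPOSED)
uses_vertical_line = any("\u030D" in decomposition(cp) for cp in GREEK_COMPOSED)

print(uses_acute, uses_vertical_line)
```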
MacArabic (all variants), MacFarsi (both variants)
Table A-1 shows the mapping from composed to decomposed Unicode. The items in the table were not previously decomposed.
| Char | Composed | Decomposed |
|---|---|---|
| xC2 | u0622 | u0627+u0653 |
| xC3 | u0623 | u0627+u0654 |
| xC4 | u0624 | u0648+u0654 |
| xC5 | u0625 | u0627+u0655 |
| xC6 | u0626 | u064A+u0654 |
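The composed and decomposed columns of Table A-1 are the canonical (NFC and NFD) forms of the same characters. The sketch below cross-checks the table, assuming Python's `unicodedata.ucd_3_2_0` object (the Unicode 3.2 database) as a reference. It verifies only the Unicode side of the mapping, not the Mac encoding bytes, and `ARABIC_TABLE` and `mismatches` are hypothetical names.

```python
import unicodedata

ucd = unicodedata.ucd_3_2_0  # Unicode 3.2 database

# Rows of Table A-1: Mac byte -> (composed, decomposed).
ARABIC_TABLE = {
    0xC2: ("\u0622", "\u0627\u0653"),
    0xC3: ("\u0623", "\u0627\u0654"),
    0xC4: ("\u0624", "\u0648\u0654"),
    0xC5: ("\u0625", "\u0627\u0655"),
    0xC6: ("\u0626", "\u064A\u0654"),
}

# Collect any row whose columns are not NFD/NFC forms of each other.
mismatches = [
    hex(byte)
    for byte, (composed, decomposed) in ARABIC_TABLE.items()
    if ucd.normalize("NFD", composed) != decomposed
    or ucd.normalize("NFC", decomposed) != composed
]

print(mismatches)
```

An empty list means every row agrees with the Unicode data.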
These encodings are now supported by the Text Encoding Converter:
GB 18030
Full support for the new Chinese standard has been added to the TEC, along with new fonts in the system to support the new characters.
DOS encodings for Samba
kTextEncodingDOSGreek
kTextEncodingDOSBalticRim
kTextEncodingDOSLatin2
kTextEncodingDOSTurkish
kTextEncodingDOSIcelandic
kTextEncodingDOSRussian
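The TEC constants above are only usable through the Mac OS APIs. As a rough illustration of what these encodings do, the sketch below uses Python's built-in codecs instead. The pairing of each constant with a Python codec name (`cp737`, `cp775`, `cp852`, `cp857`, `cp861`, `cp866`) is an assumption based on conventional DOS code page numbering and is not stated in this note. The names `ASSUMED_CODECS` and `round_trips` are hypothetical.

```python
# Python codec names ASSUMED to correspond to the TEC constants listed above,
# each with a short sample in the relevant script.
ASSUMED_CODECS = {
    "kTextEncodingDOSGreek": ("cp737", "\u03B1\u03B2\u03B3"),
    "kTextEncodingDOSBalticRim": ("cp775", "\u0105\u010D\u0119"),
    "kTextEncodingDOSLatin2": ("cp852", "\u010D\u0159\u017E"),
    "kTextEncodingDOSTurkish": ("cp857", "\u011F\u0131\u015F"),
    "kTextEncodingDOSIcelandic": ("cp861", "\u00FE\u00F0\u00E9"),
    "kTextEncodingDOSRussian": ("cp866", "\u0430\u0431\u0432"),
}

def round_trips(codec, text):
    """Return True if text survives encoding and decoding with the codec."""
    return text.encode(codec).decode(codec) == text

dos_ok = {
    name: round_trips(codec, sample)
    for name, (codec, sample) in ASSUMED_CODECS.items()
}

# GB 18030 is a variable-width encoding: a character takes 1, 2, or 4 bytes.
# Samples: ASCII 'A', the CJK ideograph u4E2D, and the supplementary u10000.
gb_lengths = [len(ch.encode("gb18030")) for ch in ("A", "\u4E2D", "\U00010000")]

print(dos_ok, gb_lengths)
```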
© 2005 Apple Computer, Inc. All Rights Reserved. (Last updated: 2005-07-07)