
Important: The information in this document is obsolete and should not be used for new development.

Programming With the Text Encoding Conversion Manager



General Character Set Structure

ISO 2022 and ISO 4873 define a structure for coded character sets using 7-bit or 8-bit values. These coded character sets provide a means of representing both graphic characters and control functions; control functions that can be represented with a single code point are also called control characters.

For character sets using 7-bit values, the range 0x00-0x1F is reserved for a set of 32 control characters, designated C0; another set of 32 control functions, designated C1, may be represented with escape sequences. The range 0x20-0x7F (96 code points) is reserved for up to four sets of graphic characters, designated G0-G3 (in some graphic sets, each code point requires two or three 7-bit values). Most Gn sets use only the 94 code points 0x21-0x7E, in which case 0x20 is reserved for SPACE, and 0x7F is reserved for DELETE. ISO 2022 specifies a protocol for
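The code point counts quoted for the 7-bit structure can be checked with a little arithmetic. The Python sketch below is illustrative only: the constant names and the `capacity` helper are invented for this example and are not part of any standard or Apple API. It assumes that a multi-byte graphic set draws every byte of a code point from the same 94- or 96-position range.

```python
# Ranges in the 7-bit structure (names are hypothetical, for illustration).
C0_RANGE = range(0x00, 0x20)          # 0x00-0x1F: control characters
GRAPHIC_96_RANGE = range(0x20, 0x80)  # 0x20-0x7F: full graphic range
GRAPHIC_94_RANGE = range(0x21, 0x7F)  # 0x21-0x7E: SPACE and DELETE excluded


def capacity(positions: int, bytes_per_code_point: int) -> int:
    """Number of code points in a graphic set, assuming every byte of a
    code point is drawn from the same range of `positions` values."""
    return positions ** bytes_per_code_point


print(len(C0_RANGE))          # 32
print(len(GRAPHIC_96_RANGE))  # 96
print(len(GRAPHIC_94_RANGE))  # 94
print(capacity(94, 2))        # 8836
print(capacity(94, 3))        # 830584
```

Under that assumption, a two-byte 94-position set has room for 8836 graphic characters, which is why multi-byte graphic sets suit large repertoires.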

For 8-bit character sets, the C0 set still uses 0x00-0x1F, but the C1 set occupies 0x80-0x9F rather than being represented with escape sequences. The G0 set uses 0x21-0x7E (with 0x20 reserved for SPACE and 0x7F reserved for DELETE), while the G1, G2, and G3 sets share the range 0xA0-0xFF (96 code points). Figure B-3 shows these differences.

Figure B-3 Comparison of 7-bit and 8-bit character set structures
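In the absence of the figure, the two layouts can be approximated in code. The following Python sketch maps a single code value to the region it falls in under the 7-bit or 8-bit structure. The `classify` function and its region labels are hypothetical names chosen for this example, and the sketch assumes the common 94-code-point convention in which 0x20 is SPACE and 0x7F is DELETE. It does not model escape sequences or the switching between graphic sets.

```python
def classify(value: int, bits: int = 8) -> str:
    """Return the region a code value falls in (illustrative labels only)."""
    if bits not in (7, 8):
        raise ValueError("bits must be 7 or 8")
    if not 0 <= value < (1 << bits):
        raise ValueError(f"0x{value:02X} is outside the {bits}-bit range")
    if value <= 0x1F:
        return "C0"
    if value == 0x20:
        return "SPACE"
    if value <= 0x7E:
        # In a 7-bit environment all four graphic sets share this range;
        # in an 8-bit environment it is described as holding G0.
        return "G0-G3" if bits == 7 else "G0"
    if value == 0x7F:
        return "DELETE"
    if value <= 0x9F:
        return "C1"      # reachable only when bits == 8
    return "G1-G3"       # 0xA0-0xFF, 8-bit only


for code in (0x1B, 0x41, 0x85, 0xE9):
    print(f"0x{code:02X}: {classify(code, bits=8)}")
```

Calling `classify(0x85, bits=7)` raises `ValueError`, reflecting that a 7-bit environment has no code values above 0x7F; there, C1 functions are reached through escape sequences instead.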

The G0 set is typically the ISO 646 international reference version (ASCII). The C0 and C1 control functions are typically from ISO 6429, although other control sets can be used.


© 1999 Apple Computer, Inc. – (Last Updated 13 Dec 99)
