Legacy Document
Important:
The information in this document is obsolete and should not be used for new development.
Text Encoding Format
A text encoding format specifies a way of formatting or algorithmically transforming a particular base encoding. For example, the UTF-7 format is the Unicode standard formatted for transmission through channels that can handle only 7-bit values. Other text encoding formats for Unicode include UTF-8 and 16-bit or 32-bit formats. These transformations are not viewed as different base encodings. Rather, they are different formats for representing the same base encoding.
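To make the idea of one base encoding with several formats concrete, the following minimal sketch transforms a single Unicode scalar value into its UTF-8 byte sequence. EncodeUTF8 is a hypothetical helper written for illustration; it is not part of the Text Encoding Converter, and it uses the standard C fixed-width types rather than the Mac OS type names.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not an Apple API.
   Writes the UTF-8 form of one Unicode scalar value into out and
   returns the number of bytes written, or 0 if cp is a surrogate
   code point or lies beyond U+10FFFF. */
size_t EncodeUTF8(uint32_t cp, uint8_t out[4])
{
    if (cp < 0x80) {                       /* 1 byte: 0xxxxxxx */
        out[0] = (uint8_t)cp;
        return 1;
    }
    if (cp < 0x800) {                      /* 2 bytes: 110xxxxx 10xxxxxx */
        out[0] = (uint8_t)(0xC0 | (cp >> 6));
        out[1] = (uint8_t)(0x80 | (cp & 0x3F));
        return 2;
    }
    if (cp >= 0xD800 && cp <= 0xDFFF) {    /* surrogates are not scalar values */
        return 0;
    }
    if (cp < 0x10000) {                    /* 3 bytes */
        out[0] = (uint8_t)(0xE0 | (cp >> 12));
        out[1] = (uint8_t)(0x80 | ((cp >> 6) & 0x3F));
        out[2] = (uint8_t)(0x80 | (cp & 0x3F));
        return 3;
    }
    if (cp <= 0x10FFFF) {                  /* 4 bytes */
        out[0] = (uint8_t)(0xF0 | (cp >> 18));
        out[1] = (uint8_t)(0x80 | ((cp >> 12) & 0x3F));
        out[2] = (uint8_t)(0x80 | ((cp >> 6) & 0x3F));
        out[3] = (uint8_t)(0x80 | (cp & 0x3F));
        return 4;
    }
    return 0;                              /* beyond the Unicode range */
}
```

For example, U+00E9 is the single 16-bit value 0x00E9 in the 16-bit format, while EncodeUTF8 produces the two bytes 0xC3 0xA9 for it. The character is the same; only its representation differs.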
Similar to text encoding variant values, text encoding format values are specific to a particular text encoding base value or to a small set of text encoding base values. A text encoding format is defined by the TextEncodingFormat data type.
typedef UInt32 TextEncodingFormat;
The function GetTextEncodingFormat returns the text encoding format of a text encoding specification.
The following enumeration defines constants for specifying text encoding formats:
enum {
    /* Default TextEncodingFormat for any TextEncodingBase */
    kTextEncodingDefaultFormat = 0,
    /* Formats for Unicode encodings */
    kUnicode16BitFormat = 0,
    kUnicodeUTF7Format = 1,
    kUnicodeUTF8Format = 2,
    kUnicode32BitFormat = 3
};
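One detail of this enumeration is easy to miss: kTextEncodingDefaultFormat and kUnicode16BitFormat share the value 0, so code that inspects a format value (such as one returned by GetTextEncodingFormat) cannot tell the two apart. The sketch below shows one way this might be handled. UnicodeFormatUnitBits is a hypothetical helper, and the typedef and enum are local stand-ins (using uint32_t in place of UInt32) so the example compiles without the system headers.

```c
#include <stdint.h>

/* Local stand-ins for the declarations above; uint32_t is assumed
   here in place of the Mac OS UInt32 type. */
typedef uint32_t TextEncodingFormat;

enum {
    kTextEncodingDefaultFormat = 0,
    kUnicode16BitFormat = 0,
    kUnicodeUTF7Format = 1,
    kUnicodeUTF8Format = 2,
    kUnicode32BitFormat = 3
};

/* Hypothetical helper, not an Apple API.
   Returns the width in bits of the values that make up a Unicode
   format, or 0 if the format value is not one of the constants above. */
int UnicodeFormatUnitBits(TextEncodingFormat format)
{
    switch (format) {
    /* kTextEncodingDefaultFormat has the same value as
       kUnicode16BitFormat, so it cannot appear as its own case label. */
    case kUnicode16BitFormat:
        return 16;
    case kUnicodeUTF7Format:
        return 7;
    case kUnicodeUTF8Format:
        return 8;
    case kUnicode32BitFormat:
        return 32;
    default:
        return 0;
    }
}
```

Because the helper assumes a Unicode base encoding, passing the default format yields 16. For other base encodings the value 0 means only "the default format for that base", so a real caller would need to check the base encoding first.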
Constant descriptions
kTextEncodingDefaultFormat
The standard default format for any base encoding.
For Unicode and ISO 10646
kUnicode16BitFormat
The 16-bit character encoding format specified by the Unicode standard, equivalent to the UCS-2 format for ISO 10646. This includes support for the UTF-16 method of including non-BMP characters in a stream of 16-bit values.
kUnicodeUTF7Format
The Unicode transformation format in which characters are represented by a sequence of 7-bit values. This format cannot be handled by the Unicode Converter, only by the Text Encoding Converter.
kUnicodeUTF8Format
The Unicode transformation format in which characters are represented by a sequence of 8-bit values.
kUnicode32BitFormat
The UCS-4 32-bit format defined for ISO 10646. This format is not currently supported.
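The UTF-16 method mentioned under kUnicode16BitFormat represents each non-BMP character as a pair of 16-bit surrogate values. The following sketch shows the arithmetic as defined by the Unicode standard. EncodeUTF16 is a hypothetical helper for illustration only and says nothing about how the converters implement this internally.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not an Apple API.
   Writes the 16-bit form of one Unicode scalar value into out and
   returns the number of 16-bit values written (1 or 2), or 0 if cp
   is a surrogate code point or lies beyond U+10FFFF. */
size_t EncodeUTF16(uint32_t cp, uint16_t out[2])
{
    if (cp >= 0xD800 && cp <= 0xDFFF) {
        return 0;                          /* reserved for surrogates */
    }
    if (cp < 0x10000) {
        out[0] = (uint16_t)cp;             /* BMP: one 16-bit value */
        return 1;
    }
    if (cp > 0x10FFFF) {
        return 0;                          /* beyond the Unicode range */
    }
    cp -= 0x10000;                         /* 20 bits remain */
    out[0] = (uint16_t)(0xD800 | (cp >> 10));    /* high surrogate: top 10 bits */
    out[1] = (uint16_t)(0xDC00 | (cp & 0x3FF));  /* low surrogate: bottom 10 bits */
    return 2;
}
```

For example, U+1F600 lies outside the BMP and becomes the pair 0xD83D 0xDE00, whereas U+00E9 stays the single value 0x00E9.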
© 1999 Apple Computer, Inc. (Last Updated 13 Dec 99)