Glossary

character
An atomic unit of content for text data. A character is an abstract entity without any particular appearance; characters include letters, digits, punctuation, and symbols.

character encoding scheme
A text encoding that maps a sequence of characters (from one or more coded character sets) to a sequence of bytes, in order to combine characters from multiple coded character sets or to permit easier handling of some coded character sets. Compare coded character set.

code fragment
See fragment.

Code Fragment Manager (CFM)
In Mac OS 9 and earlier, the part of the Macintosh system software that prepares fragments for execution.

code point
An integer value that represents (or can represent) a character.
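As a brief illustration, the following Python sketch uses the built-in `ord` and `chr` functions, which work with Unicode code points. Python is used here only as a convenient notation; it is not part of the Text Encoding Conversion Manager.

```python
# A code point is an integer; it is independent of how the character
# is eventually stored as bytes.
ch = "å"
code_point = ord(ch)           # integer value of the character in Unicode
assert code_point == 0xE5
assert chr(code_point) == ch   # the same integer maps back to the character

# The same code point yields different byte sequences under different encodings.
latin1_bytes = ch.encode("latin-1")   # b"\xe5"
utf8_bytes = ch.encode("utf-8")       # b"\xc3\xa5"
```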

coded character
A character together with its numeric representation in a particular coded character set.

coded character set
A text encoding that maps each character in a set of characters to a particular integer from a set of integers. Compare character encoding scheme.

code-switching scheme
A character encoding scheme that allows switching between different coded character sets, usually signaled by escape or other special sequences. See also character encoding scheme.

content transfer encoding
See transfer encoding syntax.

converter object
An instance of data that tells a text converter how to convert text from a particular source encoding to a particular destination encoding, and maintains any necessary state information that applies to the conversion of a particular stream of text.

destination encoding
The text encoding that describes the desired encoding of the text after conversion. Compare source encoding.

direct conversion
A text conversion by the Text Encoding Converter that can be handled in one step (that is, by one call to a single plug-in). Compare indirect conversion.

Extended UNIX Code (EUC)
A type of packing scheme that is used as the text encoding for UNIX workstations that handle East Asian languages. See also packing scheme.

fallback mapping
A character or sequence of characters used to replace a character that has no direct equivalent in the destination encoding. For example, if the destination encoding does not contain “å,” a possible fallback mapping would be “aa.”
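The idea can be sketched in Python with a custom codec error handler. This is only an analogy for how a fallback mapping behaves, not the mechanism used by the Unicode Converter; the table `FALLBACKS`, the function `fallback_handler`, and the handler name `glossary_fallback` are hypothetical names chosen for this sketch.

```python
import codecs

# Hypothetical fallback table; the "å" -> "aa" entry mirrors the example above.
FALLBACKS = {"å": "aa"}

def fallback_handler(exc):
    """Replace each unencodable character with its fallback, or "?" if none is listed."""
    if not isinstance(exc, UnicodeEncodeError):
        raise exc
    bad = exc.object[exc.start:exc.end]
    replacement = "".join(FALLBACKS.get(ch, "?") for ch in bad)
    return replacement, exc.end   # resume encoding after the replaced span

# "glossary_fallback" is an arbitrary handler name used only in this sketch.
codecs.register_error("glossary_fallback", fallback_handler)

encoded = "Håkon".encode("ascii", errors="glossary_fallback")
```

Here `encoded` holds the ASCII bytes for "Haakon". The information content is roughly preserved, but converting back does not recover "å", which is why fallback mappings are associated with loose rather than strict mappings.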

fragment
In Mac OS 9 and earlier, a block of executable code or data. Fragments are handled by the Code Fragment Manager. See also Code Fragment Manager.

glyph image
A visual element used to represent one or more characters.

indirect conversion
A text conversion by the Text Encoding Converter that requires stepping through one or more intermediate conversions before reaching the desired destination encoding. Compare direct conversion.
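One way to picture the difference between direct and indirect conversion is as a search over the available direct conversions. The Python sketch below assumes a hypothetical table, `DIRECT`, of the conversions that installed plug-ins can perform in one step, and finds the shortest chain between two encodings. It does not describe how the Text Encoding Converter actually selects a path, and each step is simulated with Python's own codecs.

```python
from collections import deque

# Hypothetical table of direct conversions: source -> reachable destinations.
DIRECT = {
    "shift_jis": ["utf-16"],
    "utf-16": ["shift_jis", "euc_jp", "utf-8"],
    "euc_jp": ["utf-16"],
    "utf-8": ["utf-16"],
}

def conversion_path(source, destination):
    """Return the shortest chain of encodings from source to destination, or None."""
    queue = deque([[source]])
    seen = {source}
    while queue:
        path = queue.popleft()
        if path[-1] == destination:
            return path
        for nxt in DIRECT.get(path[-1], []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None

def convert(data, path):
    """Apply each step of the path in order (simulated with Python codecs)."""
    for src, dst in zip(path, path[1:]):
        data = data.decode(src).encode(dst)
    return data

path = conversion_path("shift_jis", "euc_jp")
result = convert("日本語".encode("shift_jis"), path)
```

A path of two encodings corresponds to a direct conversion; a longer path, such as the three-element path found here, corresponds to an indirect conversion through an intermediate encoding.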

Internet
The name given to the world-wide network of computers.

loose mapping
A mapping between text encodings that preserves the information content of text but does not permit round-trip fidelity.

Multipurpose Internet Mail Extensions (MIME)
Mechanisms for specifying and describing the format of Internet message bodies.

packing scheme
A type of character encoding scheme where characters are encoded using a variable number of bytes. Typically certain bytes signal the beginning of a character and how many additional bytes are used to encode the character. Character sets with a large number of elements are often stored using a packing scheme. See also character encoding scheme.
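EUC-JP is a convenient example of a packing scheme, because the first byte of each character determines how many bytes the character occupies. The following Python sketch splits an EUC-JP byte string into per-character byte sequences using only the lead byte; the function names are hypothetical, and the sketch does not validate trailing bytes.

```python
def euc_jp_length(lead: int) -> int:
    """Number of bytes in the EUC-JP character that starts with this lead byte."""
    if lead < 0x80:
        return 1              # ASCII
    if lead == 0x8E:
        return 2              # single shift 2: half-width katakana
    if lead == 0x8F:
        return 3              # single shift 3: JIS X 0212
    if 0xA1 <= lead <= 0xFE:
        return 2              # JIS X 0208
    raise ValueError(f"invalid EUC-JP lead byte: {lead:#04x}")

def split_euc_jp(data: bytes) -> list:
    """Split EUC-JP data into one bytes object per character."""
    chars = []
    i = 0
    while i < len(data):
        n = euc_jp_length(data[i])
        if i + n > len(data):
            raise ValueError("truncated character at end of data")
        chars.append(data[i:i + n])
        i += n
    return chars

text = "aあｱ"   # ASCII letter, hiragana, half-width katakana
pieces = split_euc_jp(text.encode("euc_jp"))
```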

perfect round-trip conversion
A conversion in which mapping a character from a particular source encoding to a particular destination encoding (usually Unicode) and then back to the source encoding yields the original character.
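A round-trip check is straightforward to express. The Python sketch below treats Python's `str` type as the Unicode destination encoding and tests whether a byte sequence survives the trip there and back; `round_trips` is a hypothetical helper, not a Text Encoding Conversion Manager function.

```python
def round_trips(data: bytes, source_encoding: str) -> bool:
    """Return True if source -> Unicode -> source reproduces the original bytes."""
    try:
        text = data.decode(source_encoding)            # source encoding -> Unicode
        return text.encode(source_encoding) == data    # Unicode -> source encoding
    except UnicodeError:
        # The data is not valid in the source encoding, or cannot be mapped back.
        return False

latin1_ok = round_trips(b"\xe5", "latin-1")   # "å" in ISO 8859-1
ascii_ok = round_trips(b"\xe5", "ascii")      # 0xE5 is not valid ASCII
```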

plug-in
See text encoding conversion plug-in.

presentation form
An abstraction of a range of glyph images, which represents a standard way to display a particular character or group of characters in a particular context as specified by a particular writing system. See also glyph image.

script
A collection of related characters, subsets of which are required to write a particular language. Some examples of scripts are Latin, Greek, Hiragana, Katakana, and Han.

sniffer
A function included with a text conversion plug-in that scans text for features that identify a particular text encoding.
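The Python sketch below shows the general shape of such a function, under simple assumptions: it looks for a byte order mark first and otherwise reports every candidate encoding in which the data decodes without error. The function name `sniff` and the candidate list are hypothetical; real sniffers typically weigh more features than this.

```python
import codecs

def sniff(data: bytes) -> list:
    """Return plausible encoding names for data, in the order they were tested."""
    # A byte order mark is a strong identifying feature.
    if data.startswith(codecs.BOM_UTF8):
        return ["utf-8-sig"]
    if data.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        return ["utf-16"]
    # Otherwise keep every candidate in which the bytes form valid text.
    candidates = []
    for name in ("ascii", "utf-8", "shift_jis", "euc_jp"):
        try:
            data.decode(name)
        except UnicodeDecodeError:
            continue
        candidates.append(name)
    return candidates

with_bom = sniff(codecs.BOM_UTF8 + b"hi")
plain = sniff(b"hello")
japanese = sniff("日本語".encode("utf-8"))
```

As the `plain` case suggests, a sniffer often cannot narrow the result to one encoding, because pure ASCII text is valid in all of the candidates listed.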

source encoding
The text encoding that describes the encoding of the text before conversion. Compare destination encoding.

strict mapping
A mapping between text encodings that preserves the information content of text and permits round-trip fidelity.

text element
A group of one or more characters that is treated as a single entity for a particular process such as collation, display, or transcoding.

text encoding
The coded character set or character encoding scheme used to represent a particular piece of text. See also coded character set, character encoding scheme.

text encoding base
The primary specification of a text encoding, and one component of a text encoding specification. See also text encoding specification, text encoding variant, text encoding format.

text encoding conversion plug-in
A code fragment that provides conversion services between pairs of encodings. A text encoding conversion plug-in informs the Text Encoding Conversion Manager about its conversion and encoding analysis capabilities.

text encoding format
A subset of the text encoding specification that specifies the byte format of the encoding. For example, a format might specify that the encoding use only 7 bits per byte for transmission over 7-bit channels. See also text encoding specification, universal transformation format.

text encoding specification
A scalar value that defines a text encoding to be used in a conversion. It includes information about the text encoding base, the text encoding variant, and the text encoding format.

text encoding variant
A specification of one among possibly several minor variants or subsets of a particular text encoding base. See also text encoding specification, text encoding base.

Text Encoding Conversion Manager
A pair of shared library extensions, namely the Text Encoding Converter and the Unicode Converter, that facilitate text encoding conversion on Mac OS–based computers.

Text Encoding Converter
A shared library extension that provides the services for general and algorithmic encoding conversions or multi-encoding streams. The Text Encoding Converter sometimes uses the Unicode Converter.

transfer encoding syntax
A transformation applied to text encoded using a character encoding scheme to allow it to be transmitted by a specific protocol or set of protocols. This is normally used to permit 8-bit data to be sent through channels that can only handle 7-bit values. Also called content transfer encoding. Compare character encoding scheme, universal transformation format.
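Base64, one of the content transfer encodings defined for MIME, is a familiar example. The Python sketch below uses the standard `base64` module to turn an 8-bit byte into a form that uses only 7-bit values and then restores it.

```python
import base64

payload = "å".encode("latin-1")       # one 8-bit byte, 0xE5
wire = base64.b64encode(payload)      # b"5Q==", only 7-bit ASCII values
restored = base64.b64decode(wire)     # the original 8-bit data

assert all(byte < 0x80 for byte in wire)
assert restored == payload
```

Note that the transformation is applied to bytes that have already been produced by a character encoding scheme; it does not change which characters the text contains.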

Unicode
A universal character set that includes tens of thousands of characters covering the world’s major written languages along with many symbols.

Unicode Converter
A shared library extension that provides table-based conversion between no-subset variants of Unicode, in either UTF-16 or UTF-8, and many other encodings.

universal transformation format
A special format that allows transmission of Unicode characters over a particular kind of channel, such as 7-bit channels (UTF-7) or 8-bit channels (UTF-8). See also transfer encoding syntax.
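The difference between the two formats can be seen by encoding the same character both ways. The Python sketch below relies on the `utf-7` and `utf-8` codecs in Python's standard library.

```python
text = "å"                      # U+00E5

utf7 = text.encode("utf-7")     # uses only 7-bit values
utf8 = text.encode("utf-8")     # b"\xc3\xa5", needs 8-bit values

assert all(byte < 0x80 for byte in utf7)
assert any(byte >= 0x80 for byte in utf8)

# Both formats decode back to the same Unicode character.
assert utf7.decode("utf-7") == text
assert utf8.decode("utf-8") == text
```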

writing system
A set of characters from one or more scripts that are used to write a particular language and the rules that govern the presentation of those characters.

© 2005 Apple Computer, Inc. All Rights Reserved. (Last updated: 2005-07-07)