
Important: The information in this document is obsolete and should not be used for new development.




ConvertFromUnicodeToTextRun

Converts a string from Unicode to one or more encodings.

pascal OSStatus ConvertFromUnicodeToTextRun (
                     UnicodeToTextRunInfo iUnicodeToTextInfo,
                     ByteCount iUnicodeLen,
                     ConstUniCharArrayPtr iUnicodeStr,
                     OptionBits iControlFlags,
                     ItemCount iOffsetCount,
                     ByteOffset iOffsetArray[],
                     ItemCount *oOffsetCount,
                     ByteOffset oOffsetArray[],
                     ByteCount iOutputBufLen,
                     ByteCount *oInputRead,
                     ByteCount *oOutputLen,
                     LogicalAddress oOutputStr,
                     ItemCount iEncodingRunBufLen,
                     ItemCount *oEncodingRunOutLen,
                     TextEncodingRun oEncodingRuns[]);
iUnicodeToTextInfo
You use the function CreateUnicodeToTextRunInfo, CreateUnicodeToTextRunInfoByEncoding, or CreateUnicodeToTextRunInfoByScriptCode to obtain a Unicode converter object to specify for this parameter.

iUnicodeLen
The length in bytes of the Unicode string to be converted.

iUnicodeStr
A pointer to the Unicode string to be converted.
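Because iUnicodeLen is a byte count rather than a character count, a caller that tracks its string as a number of 16-bit code units has to scale that number before the call. The following is a minimal sketch of that arithmetic. StandInUniChar and UnicodeByteLength are hypothetical names introduced here for illustration; StandInUniChar stands in for the Toolbox UniChar type, which is assumed to be a 16-bit unsigned integer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the Toolbox UniChar type (assumed to be a
   16-bit code unit). */
typedef uint16_t StandInUniChar;

/* Returns the byte length to pass as iUnicodeLen for a buffer that holds
   unitCount 16-bit code units. */
size_t UnicodeByteLength(size_t unitCount)
{
    /* Guard against overflow when scaling the count to bytes. */
    assert(unitCount <= SIZE_MAX / sizeof(StandInUniChar));
    return unitCount * sizeof(StandInUniChar);
}
```

For example, a buffer of five code units would be described to the converter as ten bytes.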

iControlFlags
Conversion control flags. The following constants define the masks for control flags valid for this parameter. You can use these masks to set the iControlFlags parameter:

kUnicodeUseFallbacksMask
kUnicodeKeepInfoMask
kUnicodeVerticalFormMask
kUnicodeLooseMappingsMask
kUnicodeStringUnterminatedMask
kUnicodeTextRunMask
kUnicodeKeepSameEncodingMask
kUnicodeForceASCIIRangeMask
kUnicodeNoHalfwidthCharsMask
kUnicodeTextRunHeuristicsMask

You can also set one of the following directionality masks:

kUnicodeDefaultDirectionMask
kUnicodeLeftToRightMask
kUnicodeRightToLeftMask

For a description of these control flags, see Conversion Control Flags.

If the text-run control flag is clear, ConvertFromUnicodeToTextRun attempts to convert the Unicode text to a single encoding, which it chooses from the list of encodings in the Unicode mapping structures array that you provided when you created the Unicode converter object. It chooses the encoding that produces the best result, that is, the one that allows the greatest amount of the source text to be converted. If the complete source text can be converted into more than one of the encodings specified in the Unicode mapping structures array, the converter chooses among them based on their order in the array. If this flag is clear, the oEncodingRunOutLen parameter always points to a value equal to 1.

If you set the use-fallbacks control flag, the converter uses the default fallback characters for the current encoding. If the converter cannot handle a character using the current encoding, even using fallbacks, the converter attempts to convert the character using the other encodings, beginning with the first encoding specified in the list and skipping the encoding where it failed.

If you set the text-run control flag (kUnicodeTextRunMask), the converter attempts to convert the complete Unicode text string into the first encoding specified in the Unicode mapping structures array you passed to CreateUnicodeToTextRunInfo, CreateUnicodeToTextRunInfoByEncoding, or CreateUnicodeToTextRunInfoByScriptCode when you created the Unicode converter object for this conversion. If it cannot do this, the converter attempts to convert the first text element that failed using the remaining encodings, in the order they are specified in the array. What the converter does with the next text element depends on the setting of the keep-same-encoding control flag.

If the keep-same-encoding control flag is clear and the text-run heuristics control flag is clear, the converter returns to the original encoding and attempts to continue conversion with that encoding; this is equivalent to converting each text element to the first encoding that works, in the order specified. If the text-run heuristics control flag is set, the converter does not return to the original encoding for common characters such as space and punctuation that are present in most encodings and shared by many writing systems.

If the keep-same-encoding control flag is set, the converter continues with the new destination encoding until it encounters a text element that cannot be converted using the new encoding. This behavior minimizes the number of encoding changes in the output text. When the converter cannot convert a text element using any of the encodings in the list and the keep-same-encoding control flag is set, the converter uses the default fallback characters for the current encoding.
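The interaction between the text-run and keep-same-encoding flags can be easier to follow as a small simulation. The sketch below is not the converter's implementation; it models only the case where the text-run flag is set, and it ignores fallbacks and the text-run heuristics flag. Every name in it (ElementCoverage, FirstCoveringEncoding, AssignEncodings) is hypothetical. Each text element is represented by a bitmask saying which encodings in the list can represent it. The retry order after a failure in keep-same-encoding mode (list order, skipping the encoding that just failed) is an assumption borrowed from the use-fallbacks description above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of a text element: bit k is set when the k-th
   encoding in the converter's list can represent the element. */
typedef uint32_t ElementCoverage;

/* Returns the index of the first encoding, in list order, that covers the
   element, skipping index `skip`; returns -1 if no other encoding does. */
int FirstCoveringEncoding(ElementCoverage cover, int encodingCount, int skip)
{
    for (int k = 0; k < encodingCount; k++) {
        if (k != skip && (cover & ((uint32_t)1 << k)) != 0) {
            return k;
        }
    }
    return -1;
}

/* Models the text-run flag being set. keepSameEncoding models the
   keep-same-encoding flag. Writes one encoding index per element into
   assigned[] and returns the number of elements assigned; it stops at the
   first element that no encoding in the list covers. */
size_t AssignEncodings(const ElementCoverage *elements, size_t elementCount,
                       int encodingCount, int keepSameEncoding,
                       int *assigned)
{
    assert(encodingCount > 0 && encodingCount <= 32);
    int current = 0; /* conversion starts with the first encoding */
    for (size_t i = 0; i < elementCount; i++) {
        if (!keepSameEncoding) {
            current = 0; /* flag clear: return to the original encoding */
        }
        if ((elements[i] & ((uint32_t)1 << current)) == 0) {
            /* Assumed retry order: list order, skipping the failed one. */
            int next = FirstCoveringEncoding(elements[i], encodingCount,
                                             current);
            if (next < 0) {
                return i; /* unmappable without fallbacks */
            }
            current = next;
        }
        assigned[i] = current;
    }
    return elementCount;
}
```

Under this model, four elements covered by {first}, {second}, {both}, {second} are assigned encodings 0, 1, 0, 1 with the keep-same-encoding flag clear (four runs) and 0, 1, 1, 1 with it set (two runs), which illustrates why setting the flag reduces the number of encoding changes.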

iOffsetCount
The number of offsets in the array pointed to by the iOffsetArray parameter. Your application supplies this value. If you don't want offsets returned to you, specify 0 (zero) for this parameter.

iOffsetArray
An array of type ByteOffset. On input, you specify the array that contains an ordered list of significant byte offsets pertaining to the source Unicode string. These offsets may identify font or style changes, for example, in the Unicode string. If you don't want offsets returned to your application, specify NULL for this parameter and 0 (zero) for iOffsetCount. All offsets must be less than iUnicodeLen.

oOffsetCount
A pointer to a value of type ItemCount. On output, this value contains the number of offsets that were mapped in the output stream.

oOffsetArray
An array of type ByteOffset. On output, this array contains the corresponding new offsets for the resulting converted string.
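The constraints on the input offsets (an ordered list, every offset less than iUnicodeLen) can be checked before the call. The following is a minimal sketch of such a check; OffsetsAreValid is a hypothetical helper, unsigned long stands in for the Toolbox ByteOffset and ByteCount types, and "ordered" is assumed here to mean non-decreasing.

```c
#include <assert.h>
#include <stddef.h>

/* Returns 1 when the offsets are in non-decreasing order (an assumed
   reading of "ordered") and every offset is less than unicodeLen, both
   measured in bytes; returns 0 otherwise. An empty list is valid. */
int OffsetsAreValid(const unsigned long *offsets, size_t count,
                    unsigned long unicodeLen)
{
    /* A NULL array is acceptable only together with a count of zero. */
    assert(offsets != NULL || count == 0);
    for (size_t i = 0; i < count; i++) {
        if (offsets[i] >= unicodeLen) {
            return 0; /* offset points at or past the end of the source */
        }
        if (i > 0 && offsets[i] < offsets[i - 1]) {
            return 0; /* list is not ordered */
        }
    }
    return 1;
}
```

A caller might run this check in debug builds only, since the converter itself is the authority on what it accepts.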

iOutputBufLen
The length in bytes of the output buffer pointed to by the oOutputStr parameter. Your application supplies this buffer to hold the returned converted string. The oOutputLen parameter may return a byte count that is less than this value if the converted byte string is smaller than the buffer size you allocated.

oInputRead
A pointer to a value of type ByteCount. On output, this value contains the number of bytes of the Unicode source string that were converted. If the function returns a result code other than noErr, then this parameter returns the number of bytes that were converted before the error occurred.

oOutputLen
A pointer to a value of type ByteCount. On output, this value contains the length in bytes of the converted string.

oOutputStr
A value of type LogicalAddress. On input, this value points to the start of the buffer for the converted string. On output, this buffer contains the converted string in one or more encodings. When an error occurs, the ConvertFromUnicodeToTextRun function returns the converted string up to the character that caused the error. (For guidelines on estimating the size of the buffer needed, see the discussion following the parameter descriptions.)

iEncodingRunBufLen
The number of text encoding run elements you allocated for the encoding run array pointed to by the oEncodingRuns parameter. The converter returns the number of valid encoding runs in the location pointed to by oEncodingRunOutLen. Each entry in the encoding runs array specifies the beginning offset in the converted text and its associated text encoding.

oEncodingRunOutLen
A pointer to a value of type ItemCount. On output, this value contains the number of valid encoding runs returned in the oEncodingRuns parameter.

oEncodingRuns
On input, an array of structures of type TextEncodingRun. Your application should allocate an array with the number of elements you specify in the iEncodingRunBufLen parameter. On output, this array contains the encoding runs for the converted text string. Each entry in the encoding run array specifies the beginning offset in the converted text string and the associated encoding specification.

function result
A result code. The result codes are the same as those for the function ConvertFromUnicodeToText, with the following additional possibility: If the function returns kTECArrayFullErr, then the oEncodingRuns array was too small for all of the encoding runs in the output text, and the input was not completely converted. As you would if kTECOutputBufferFullErr were returned, you can call the function again with another output buffer, or with the same output buffer after copying its contents, to convert the remainder of the Unicode string.
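The "call the function again" pattern for a full encoding run array can be sketched as a loop that advances through the source by the amount each call reports as read. The code below does not call the real converter. MockConvert is a hypothetical stand-in with a simplified parameter list (it copies one byte per element and takes a precomputed encoding tag per element), StandInRun stands in for TextEncodingRun, and the result code values are placeholders rather than the values in the Toolbox headers.

```c
#include <assert.h>
#include <stddef.h>

/* Placeholder result codes; not the values from the Toolbox headers. */
enum { kStandInNoErr = 0, kStandInArrayFullErr = -1 };

typedef struct {
    size_t offset;   /* start of the run in this call's output */
    int    encoding; /* stand-in for the run's text encoding */
} StandInRun;

/* Hypothetical mock converter. Element i is "converted" by copying one
   byte, and tags[i] names the encoding it lands in. Stops with
   kStandInArrayFullErr when a new run does not fit in runs[]. */
int MockConvert(const char *in, const int *tags, size_t inLen, char *out,
                size_t *inputRead, size_t *outputLen,
                StandInRun *runs, size_t runCapacity, size_t *runCount)
{
    size_t n = 0;
    *runCount = 0;
    for (; n < inLen; n++) {
        if (n == 0 || tags[n] != tags[n - 1]) {
            if (*runCount == runCapacity) break; /* run array is full */
            runs[*runCount].offset = n;
            runs[*runCount].encoding = tags[n];
            (*runCount)++;
        }
        out[n] = in[n];
    }
    *inputRead = n;
    *outputLen = n;
    return (n < inLen) ? kStandInArrayFullErr : kStandInNoErr;
}

/* Converts the whole input, calling the converter again after each
   array-full result. Returns the total number of runs reported across
   all calls, or -1 if a call makes no progress. */
long ConvertAll(const char *in, const int *tags, size_t inLen,
                char *dest, StandInRun *runScratch, size_t runCapacity)
{
    size_t inDone = 0, outDone = 0;
    long totalRuns = 0;
    for (;;) {
        size_t read = 0, written = 0, runCount = 0;
        int status = MockConvert(in + inDone, tags + inDone, inLen - inDone,
                                 dest + outDone, &read, &written,
                                 runScratch, runCapacity, &runCount);
        /* A real caller would save runScratch[0..runCount) here, since
           the next call overwrites it. */
        totalRuns += (long)runCount;
        inDone += read;
        outDone += written;
        if (status == kStandInNoErr) return totalRuns;
        if (read == 0) return -1; /* avoid looping forever */
    }
}
```

The no-progress check matters in practice: if the run array (or output buffer) is too small to hold even one run, repeating the call cannot succeed.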

DISCUSSION

To use the ConvertFromUnicodeToTextRun function, you must first set up an array of structures of type UnicodeMapping containing, in order of precedence, the mapping information for the conversion. To create a Unicode converter object, you call the CreateUnicodeToTextRunInfo function, passing it the Unicode mapping array, or you can call the CreateUnicodeToTextRunInfoByEncoding or CreateUnicodeToTextRunInfoByScriptCode functions, which take arrays of text encodings or script codes instead of an array of Unicode mappings. You pass the returned Unicode converter object as the iUnicodeToTextInfo parameter when you call the ConvertFromUnicodeToTextRun function.

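Two of the control flags that you can set for the iControlFlags parameter, the text-run flag and the keep-same-encoding flag, allow you to control how the Unicode Converter uses the multiple encodings in converting the text string. These flags are explained in the description of the iControlFlags parameter. In summary: if the text-run flag is clear, the converter converts the text to the single encoding from the list that gives the best result. If the text-run flag is set and the keep-same-encoding flag is clear, the converter returns to the first encoding after each text element that required a different one. If both flags are set, the converter stays with the new encoding until it reaches a text element that the new encoding cannot represent, which minimizes the number of encoding changes in the output.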

The ConvertFromUnicodeToTextRun function returns the converted string in the array pointed to by the oOutputStr parameter. Beginning with the first text element in the oOutputStr array, the elements of the array pointed to by the oEncodingRuns parameter identify the encodings of the converted string. The number of elements in the oEncodingRuns array may not correspond to the number of elements in the oOutputStr array. This is because the oEncodingRuns array includes only elements for the beginning of each new encoding run in the converted string.
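Since each entry records only where a run begins, the length of a run has to be derived from the next entry's offset, or from the output length for the last run. The following is a minimal sketch of that calculation. StandInEncodingRun is a hypothetical stand-in for TextEncodingRun (assumed to hold a byte offset and a text encoding value), and RunLength is a helper name introduced for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for TextEncodingRun: the starting byte offset of
   the run in the converted text, plus the encoding of that run. */
typedef struct {
    unsigned long offset;
    unsigned long textEncoding;
} StandInEncodingRun;

/* Returns the byte length of run i: the distance to the next run's
   offset, or to the end of the output for the last run. outputLen is the
   value returned through oOutputLen. */
unsigned long RunLength(const StandInEncodingRun *runs, size_t runCount,
                        unsigned long outputLen, size_t i)
{
    assert(i < runCount);
    unsigned long end = (i + 1 < runCount) ? runs[i + 1].offset : outputLen;
    assert(end >= runs[i].offset); /* runs are in ascending offset order */
    return end - runs[i].offset;
}
```

With two runs starting at offsets 0 and 6 in a 10-byte output, the runs are 6 and 4 bytes long; a caller could hand each slice to text-drawing code together with its encoding.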

SEE ALSO

The function ConvertFromUnicodeToScriptCodeRun


© 1999 Apple Computer, Inc. – (Last Updated 13 Dec 99)
