Q: When using AddUserDataText, I keep getting strange results when adding UTF-8 text that is intended for a Japanese OS. This API is documented as taking a text string with a language code (itlRegionTag). How do I map UTF-8 strings to the correct language? Ideally, I would just work with CFStrings.
A: AddUserDataText, GetUserDataText, and many other QuickTime APIs that take or return text strings assume that the string is in one of the Traditional Mac OS language-specific encodings, for example kTextEncodingMacJapanese. The value of the itlRegionTag parameter passed to these APIs should therefore be the corresponding language code, for example langJapanese.
If the string you have is UTF-8 (or UTF-16), you will have to convert it to the appropriate Traditional Mac OS language-specific encoding before passing it to AddUserDataText.
CFString can perform this conversion: call CFStringGetBytes and pass in the appropriate CFStringEncoding.
Note: CFStringEncoding is an integer type for the constants that specify supported string encodings in various CFString functions. The values are exactly the same as the Text Encoding Converter's TextEncoding type and can be found in TextCommon.h.
You can map a Traditional Mac OS language code to the appropriate TextEncoding for CFStringGetBytes by calling GetTextEncodingFromScriptInfo, which converts any combination of Traditional Mac OS script code, language code, and region code to a TextEncoding.
Listing 1 demonstrates how to add a user data text item from a CFString using this technique, and Listing 2 demonstrates how to retrieve a user data text item as a CFString. Because QuickTime requires language-tagged text, you will always need to use the Traditional Mac OS language codes found in Script.h with these UserData APIs.
Note: A conversion isn't always possible. For example, a CFString containing a mixture of Japanese and Arabic can't be converted to any single Traditional Mac OS encoding. The conversion will fail unless you allow lossy conversion. CFStringGetBytes supports lossy conversion when you pass it a non-zero "loss byte": if a character cannot be converted, CFStringGetBytes substitutes the loss byte and the conversion proceeds.
See Converting Between String Encodings for more information.
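The difference between strict and lossy conversion described in the note can be modeled in portable C. The sketch below is only an illustration under stated assumptions: convert_with_loss_byte is a hypothetical helper, not a CoreFoundation or QuickTime function, and 7-bit ASCII stands in for the Traditional Mac OS target encoding. It mirrors the calling pattern of CFStringGetBytes: a loss byte of 0 stops at the first character that cannot be converted, a non-zero loss byte is substituted for such characters, and a NULL buffer only measures the output.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not a CoreFoundation API): models loss-byte semantics.
 * Assumption: 7-bit ASCII stands in for the target encoding, so any UTF-16
 * code unit >= 0x80 counts as unconvertible.
 * Returns the number of input characters converted. */
size_t convert_with_loss_byte(const uint16_t *chars, size_t count,
                              uint8_t lossByte, uint8_t *buffer,
                              size_t maxBufLen, size_t *usedBufLen)
{
    size_t converted = 0, used = 0;

    for (; converted < count; converted++) {
        uint8_t out;

        if (chars[converted] < 0x80) {
            out = (uint8_t)chars[converted];   /* representable as-is */
        } else if (lossByte != 0) {
            out = lossByte;                    /* lossy: substitute */
        } else {
            break;                             /* strict: stop here */
        }

        if (buffer != NULL) {
            if (used >= maxBufLen) break;      /* destination is full */
            buffer[used] = out;
        }
        used++;                                /* counted even when only measuring */
    }

    if (usedBufLen != NULL) *usedBufLen = used;
    return converted;
}
```

With the three characters 'A', U+65E5, 'B' as input, a loss byte of 0 converts only the first character, so a caller comparing the return value with the string length detects the failure, which is the check Listing 1 performs. A loss byte of '?' converts all three and produces the bytes "A?B". With the real API, the equivalent is to pass a non-zero value such as '?' as the fourth argument of CFStringGetBytes, where Listing 1 passes 0.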
Listing 1: Adding a UserData item as text using a language code.
#include <CoreFoundation/CoreFoundation.h>
#include <QuickTime/QuickTime.h>

/* AddUserDataTextFromCFString
 *
 * Description:
 *   Adds a user data item as text to a user data list from a CFString,
 *   converting the characters to the Traditional Mac OS encoding that
 *   implements the specified language, if possible.
 * Parameters:
 *   inUserData     - the user data list for this operation
 *   inUDType       - the type that is to be assigned to the new item
 *   inIndex        - the item to which the text is to be added
 *   inLanguageCode - a language code implemented using a particular Mac OS
 *                    encoding (e.g. langEnglish, langJapanese)
 *   inCFString     - a CFString containing the user data text to be added
 * Returns:
 *   noErr, or an appropriate error code on failure
 */
OSStatus AddUserDataTextFromCFString(UserData inUserData, SInt32 inUDType, SInt32 inIndex,
                                     SInt16 inLanguageCode, CFStringRef inCFString)
{
    // the target encoding for the converted characters; CFStringEncoding values
    // are the same as the Text Encoding Converter's TextEncoding values
    CFStringEncoding encoding = 0;
    CFIndex numberOfCharsConverted = 0, usedBufferLength = 0;
    CFRange range;
    OSStatus status;

    // CFStringGetLength must not be called with a NULL string
    if (NULL == inCFString) return paramErr;
    range = CFRangeMake(0, CFStringGetLength(inCFString));

    // map the language code to a text encoding, leaving the script code and
    // region code unspecified; the CFString is converted to this encoding
    status = GetTextEncodingFromScriptInfo(kTextScriptDontCare, inLanguageCode,
                                           kTextRegionDontCare, &encoding);
    if (noErr == status) {
        // first pass: pass NULL for the destination buffer to get the required
        // byte count without copying; because the loss byte is 0 (no lossy
        // conversion), check that the entire string can be converted
        numberOfCharsConverted = CFStringGetBytes(inCFString, range, encoding, 0, false,
                                                  NULL, 0, &usedBufferLength);
        if ((numberOfCharsConverted == range.length) && (usedBufferLength > 0)) {
            // conversion will work, so do it for real this time
            Handle hData = NewHandleClear(usedBufferLength);
            if (NULL != hData) {
                HLock(hData);
                numberOfCharsConverted = CFStringGetBytes(inCFString, range, encoding, 0,
                                                          false, (UInt8 *)*hData,
                                                          usedBufferLength,
                                                          &usedBufferLength);
                status = AddUserDataText(inUserData, hData, inUDType, inIndex,
                                         inLanguageCode);
                DisposeHandle(hData);
            } else {
                status = MemError();
            }
        } else {
            // the string is empty or cannot be fully converted to this encoding
            status = kTextUnsupportedEncodingErr;
        }
    }
    return status;
}
Listing 2: Retrieving language-tagged UserData text as a CFString.
/* GetUserDataTextAsCFString
 *
 * Description:
 *   Retrieves language-code-tagged text from an item in a user data list
 *   as a CFString, converting the characters from the appropriate text
 *   encoding if possible.
 * Parameters:
 *   inUserData     - the user data list for this operation
 *   inUDType       - the type of the item to retrieve
 *   inIndex        - the index of the item to retrieve
 *   inLanguageCode - a language code implemented using a particular
 *                    Mac OS encoding (e.g. langEnglish, langJapanese)
 * Returns:
 *   a CFString containing the text, or NULL on failure
 * Note:
 *   the caller is responsible for releasing the returned CFString
 */
CFStringRef GetUserDataTextAsCFString(UserData inUserData, SInt32 inUDType, SInt32 inIndex,
                                      SInt16 inLanguageCode)
{
    TextEncoding encoding = 0; // the encoding of the characters in the buffer
    CFStringRef string = NULL;
    Handle hData = NULL;
    OSStatus status;

    // GetUserDataText resizes this handle to hold the retrieved text
    hData = NewHandle(0);
    if (NULL == hData || noErr != MemError()) return NULL;

    status = GetUserDataText(inUserData, hData, inUDType, inIndex, inLanguageCode);
    if (noErr == status && (GetHandleSize(hData) > 0)) {
        // map the language code to a text encoding, leaving the script code
        // and region code unspecified
        status = GetTextEncodingFromScriptInfo(kTextScriptDontCare, inLanguageCode,
                                               kTextRegionDontCare, &encoding);
        if (noErr == status) {
            // create a CFString object from a buffer containing characters in
            // the specified encoding
            HLock(hData);
            string = CFStringCreateWithBytes(kCFAllocatorDefault, (const UInt8 *)*hData,
                                             GetHandleSize(hData), encoding, false);
        }
    }
    DisposeHandle(hData);
    return string;
}
Document Revision History
Date | Notes
2005-02-11 | First Version
Posted: 2005-02-11