Mac OS X supports existing and forthcoming standards for the identification of languages and locales. All versions of Mac OS X support the International Organization for Standardization (ISO) naming conventions for identifying language and locale information. In addition, Mac OS X v10.4 and later supports the BCP 47 specification for identifying languages.
Important: If your software runs in versions of Mac OS X prior to version 10.4, you must continue to use the existing ISO language and locale ID conventions. Use of the tags found in the BCP 47 specification will not work on versions of Mac OS X prior to 10.4.
Using the available conventions, you can distinguish between different languages and between different regional dialects of a single language. The following sections show you how to specify this information in your code.
Language Designations
Regional Designations
Language and Locale IDs
Language-Specific Project Directories
Getting Language Names from Designators
Using Custom Designators
Legacy Language Designators
For language designations, Mac OS X supports both ISO 639-1 and ISO 639-2 conventions. The ISO 639-1 specification uses a two-letter code to identify a language and is the preferred way to identify languages in Mac OS X. However, if an ISO 639-1 code is not available for a particular language, you may also use the three-letter designators defined by the ISO 639-2 specification. Table 1 lists ISO designators for a subset of languages. Note that there is no ISO 639-1 designator for Hawaiian and so you must use the ISO 639-2 designator.
Language | ISO 639-1 | ISO 639-2 |
---|---|---|
English | en | eng |
French | fr | fre |
German | de | ger |
Japanese | ja | jpn |
Hawaiian | no designator | haw |
Note: For a complete list of ISO 639-1 and ISO 639-2 codes, go to http://www.loc.gov/standards/iso639-2/php/English_list.php.
For regional designations, Mac OS X supports the ISO 3166-1 conventions. This specification uses a two-letter, capitalized code to identify a specific country. By concatenating a language designator, an underscore character, and a regional designator, you get a designator that identifies the locale for a specific language and country. Table 2 lists the locale designators for a subset of languages and countries.
Regional dialect | Locale designator |
---|---|
English (United States) | en_US |
English (Great Britain) | en_GB |
English (Australia) | en_AU |
French (France) | fr_FR |
French (Canada) | fr_CA |
Note: For a complete list of ISO 3166-1 codes, go to http://www.iso.ch.
A language ID designates a written language (or orthography) and can reflect either the generic language or a specific dialect of that language. To specify a language ID, you use a language designator by itself. To specify a specific dialect of a language, you use a hyphen to combine a language designator with a region designator. Thus, the English language as it is spoken in Great Britain would yield a language ID of en-GB, while the English language spoken in the United States would have a language ID of en-US.
A locale ID identifies a specific location where a given language is spoken. To specify a locale ID, use an underscore character to combine a language designator with a region designator. The locale ID for English-language speakers in Great Britain is en_GB, while the locale for English-speaking residents of the United States is en_US. Although locale IDs and language IDs might seem nearly identical, there is a subtle difference. A language ID identifies a written and spoken language only. A locale identifies a region and its conventions and has a more cultural context.
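The two conventions differ only in the separator, which a small helper can make explicit. The following is a minimal sketch in plain C, not part of any Mac OS X API; the function names make_language_id and make_locale_id are hypothetical and exist only for illustration.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Joins a language designator and a region designator with `sep`.
 * Pass region == NULL (or "") to get the generic language by itself.
 * Returns 0 on success, -1 if the result does not fit in `out`. */
static int join_designators(const char *language, const char *region,
                            char sep, char *out, size_t out_size)
{
    assert(language != NULL && out != NULL);
    int n;
    if (region == NULL || strlen(region) == 0)
        n = snprintf(out, out_size, "%s", language);
    else
        n = snprintf(out, out_size, "%s%c%s", language, sep, region);
    return (n >= 0 && (size_t)n < out_size) ? 0 : -1;
}

/* Language ID: hyphen between language and region (hypothetical helper). */
int make_language_id(const char *language, const char *region,
                     char *out, size_t out_size)
{
    return join_designators(language, region, '-', out, out_size);
}

/* Locale ID: underscore between language and region (hypothetical helper). */
int make_locale_id(const char *language, const char *region,
                   char *out, size_t out_size)
{
    return join_designators(language, region, '_', out, out_size);
}
```

Calling make_language_id("en", "GB", buf, sizeof buf) fills buf with en-GB, and make_locale_id with the same arguments fills it with en_GB.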
To illustrate the difference between language IDs and locale IDs, consider the following example. The dialect for a resident of Great Britain is specified by the code en-GB. The commonly used locale for that same person is en_GB. If you wanted to be very precise when specifying the locale, you could specify the locale code as en-GB_GB. This specifies a person who speaks the British dialect of English and who resides in Great Britain. If that same person moved to the United States, the appropriate locale would be en-GB_US, which would identify a person who speaks British English but uses the regional settings associated with the United States.
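To make the structure of such a precise locale code concrete, the sketch below splits a locale ID at its underscore into the language ID part and the region part. It is plain C written for illustration; split_locale_id is a hypothetical name, not a system routine, and the code assumes the locale ID contains at most one underscore.

```c
#include <assert.h>
#include <string.h>

/* Splits a locale ID at its underscore into a language ID part and a
 * region part. A locale ID without an underscore yields an empty region.
 * Returns 0 on success, -1 if either output buffer is too small. */
int split_locale_id(const char *locale_id,
                    char *language_id, size_t language_size,
                    char *region, size_t region_size)
{
    assert(locale_id != NULL && language_id != NULL && region != NULL);

    const char *underscore = strchr(locale_id, '_');
    size_t language_len = underscore ? (size_t)(underscore - locale_id)
                                     : strlen(locale_id);
    const char *region_start = underscore ? underscore + 1 : "";
    size_t region_len = strlen(region_start);

    if (language_len >= language_size || region_len >= region_size)
        return -1;

    memcpy(language_id, locale_id, language_len);
    language_id[language_len] = '\0';
    memcpy(region, region_start, region_len + 1); /* copies the NUL too */
    return 0;
}
```

Given en-GB_US, the function stores en-GB as the language ID and US as the region, which mirrors the reading of that code in the example above.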
Mac OS X v10.4 and later supports the language ID tags defined in the BCP 47 specification. In addition to the ISO 3166-1 region codes, the draft of this standard (available at http://www.rfc-editor.org/) adds support for tags ranging in length from 3 to 8 characters. The use of these tags makes it possible to separate dialect or script information from a specific region or country.
Particularly in Chinese dialects, a region code is not always the best way to specify the proper dialect or script. For example, traditional Chinese (Han) is the default language spoken in Taiwan and is identified by the code zh_TW in Mac OS X v10.3.9 and earlier. However, traditional Chinese is also commonly spoken in Hong Kong and Macao, which means the zh_TW designator is not entirely accurate in those locations. The new standard defines new tags for the traditional Chinese (Hant) and simplified Chinese (Hans) scripts. Thus, traditional Chinese spoken in any country uses the code zh-Hant. Traditional Chinese, as it is spoken in Taiwan, now uses the locale code zh-Hant_TW.
Table 3 lists some of the other custom tags that identify a particular dialect or script.
Language ID | Description |
---|---|
az-Arab | Azerbaijani in the Arabic script. |
az-Cyrl | Azerbaijani in the Cyrillic script. |
az-Latn | Azerbaijani in the Latin script. |
sr-Cyrl | Serbian in the Cyrillic script. |
sr-Latn | Serbian in the Latin script. |
uz-Cyrl | Uzbek in the Cyrillic script. |
uz-Latn | Uzbek in the Latin script. |
zh-Hans | Chinese in the simplified script. |
zh-Hant | Chinese in the traditional script. |
Important: In Mac OS X v10.4 and later, you can use the new Hant and Hans tags instead of the older zh_TW and zh_CN tags for .lproj directory names if you choose. You must not use these tags (or any of the newer tags) in Mac OS X v10.3.9 and earlier, however. For these older applications, you should use the Core Foundation and Cocoa routines to obtain the canonical form of a given language or locale tag before using that tag in your code. For information on how to get the canonical tags, see “Getting Canonical Language and Locale IDs.”
The more general you make your localized resources, the more regions you can support with a single set of resources. This can save a lot of space in your bundle and helps reduce translation costs. For example, if you did not need to distinguish between different regions of the English language, you could include a single en.lproj directory to support users in the United States, Great Britain, and Australia. Of course, the decision to use common resources over region-specific versions depends entirely on your product and the needs of your users.
Important: Even if you support region-specific localizations, you should always provide a complete set of common resources that are not region-specific.
When searching for resources, the system bundle routines try to find the best match between the .lproj directories in your bundle and the user’s language and region preferences. The bundle routines look for the requested resource in region-specific directories first, followed by the more generalized language directory. Thus, if you had localizations for users in the United States, Great Britain, and Australia, the bundle routines would search the appropriate region directory (en_US.lproj, en_GB.lproj, or en_AU.lproj) first, followed by the en.lproj directory. For more information about how bundles search for resources, see “Searching for Bundle Resources” in Bundle Programming Guide.
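The region-first, language-second ordering described above can be sketched as follows. This is an illustrative approximation in plain C, not the algorithm the bundle routines actually use, which also weighs the user's full list of language preferences; lproj_search_order and LPROJ_NAME_MAX are hypothetical names.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define LPROJ_NAME_MAX 64 /* hypothetical limit chosen for this sketch */

/* Fills `names` with the .lproj directory names to try for `locale_id`,
 * most specific first: the region-specific directory, then the generic
 * language directory. Returns the number of names written (1 or 2),
 * or 0 if a name does not fit. */
size_t lproj_search_order(const char *locale_id,
                          char names[2][LPROJ_NAME_MAX])
{
    assert(locale_id != NULL && names != NULL);

    size_t count = 0;
    int n = snprintf(names[count], LPROJ_NAME_MAX, "%s.lproj", locale_id);
    if (n < 0 || n >= LPROJ_NAME_MAX)
        return 0;
    count++;

    /* If a region is present, fall back to the language-only directory. */
    const char *underscore = strchr(locale_id, '_');
    if (underscore != NULL) {
        int language_len = (int)(underscore - locale_id);
        n = snprintf(names[count], LPROJ_NAME_MAX, "%.*s.lproj",
                     language_len, locale_id);
        if (n < 0 || n >= LPROJ_NAME_MAX)
            return 0;
        count++;
    }
    return count;
}
```

For en_GB the function produces en_GB.lproj followed by en.lproj, which is why a complete en.lproj directory acts as the safety net for every English-speaking region.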
Prior to Mac OS X v10.4, the ISO language and region designators were the recommended way of identifying languages in the system. However, few users can recognize languages by their ISO designators. If you need to display the actual name of a language to a user, you can use the Carbon functions defined in MacLocales.h to convert the designators into localized language names. For more information, see Locale Utilities Reference.
If your software runs in Mac OS X v10.4 and later, you should not use the functions in MacLocales.h. Instead, use the CFLocaleCopyDisplayNameForPropertyValue function to get the correct display name for the language or locale ID.
It is possible to use a language or locale abbreviation that is not known to the CFBundle functions or NSBundle class. For example, you could create your own language designations for a language that is not yet listed in the ISO conventions. Use of custom designators is discouraged, however.
If you choose to create a new designator, be sure to follow the rules found in sections 2.2.1 and 4.5 of BCP 47. Tags that do not follow these conventions are not guaranteed to work. When using custom tags, you must ensure that the abbreviation stored by the user’s language preferences matches the designator used by your .lproj directory exactly.
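As a rough pre-flight check for a custom designator, the sketch below tests two things: that the tag has the general shape BCP 47 requires of every subtag (one to eight ASCII letters or digits, separated by hyphens), and that an .lproj directory name matches the designator exactly, including case. It does not implement the specific rules in the sections cited above, so passing it is no guarantee that a tag is valid. Both function names are hypothetical, and x-klingon in the usage note is a made-up tag.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <string.h>

/* Returns true if `tag` consists of hyphen-separated subtags that are
 * each 1 to 8 ASCII alphanumeric characters. This checks shape only;
 * it does not validate the tag against the BCP 47 registry or grammar. */
bool has_wellformed_subtags(const char *tag)
{
    assert(tag != NULL);
    size_t run = 0; /* length of the subtag currently being scanned */
    for (const char *p = tag; ; p++) {
        if (*p == '-' || *p == '\0') {
            if (run == 0 || run > 8)
                return false;
            if (*p == '\0')
                return true;
            run = 0;
        } else if ((unsigned char)*p < 128 && isalnum((unsigned char)*p)) {
            run++;
        } else {
            return false;
        }
    }
}

/* Returns true if `dir_name` is exactly `designator` followed by
 * ".lproj". The comparison is case-sensitive on purpose. */
bool lproj_matches_designator(const char *dir_name, const char *designator)
{
    assert(dir_name != NULL && designator != NULL);
    size_t len = strlen(designator);
    return strncmp(dir_name, designator, len) == 0 &&
           strcmp(dir_name + len, ".lproj") == 0;
}
```

With these helpers, x-klingon passes the shape check and matches x-klingon.lproj, while X-Klingon.lproj is rejected because the match must be exact.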
In addition to the ISO language designators, previous versions of Mac OS X also supported a set of legacy designators. These designators let you specify a language by name, instead of by a two- or three-letter code. Designators included names such as English, French, German, Japanese, Chinese, Spanish, Italian, Swedish, and Portuguese, among others. Although these names are still recognized and processed by the CFBundle functions or NSBundle class, their use is deprecated and support for them in future versions of Mac OS X is not guaranteed. Use the codes described in “Language Designations” and “Regional Designations” instead.
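When migrating a project away from legacy names, a small lookup table can suggest the replacement code. The sketch below is plain C and covers only the names listed above; the pairing of each name with an ISO 639-1 code is an assumption made for illustration, not the mapping the system applies internally, and iso_code_for_legacy_name is a hypothetical function.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct legacy_entry {
    const char *name;     /* legacy designator, e.g. "English" */
    const char *iso_code; /* assumed ISO 639-1 replacement */
};

/* Assumed mapping for the legacy names mentioned in the text. */
static const struct legacy_entry legacy_table[] = {
    { "English",    "en" },
    { "French",     "fr" },
    { "German",     "de" },
    { "Japanese",   "ja" },
    { "Chinese",    "zh" },
    { "Spanish",    "es" },
    { "Italian",    "it" },
    { "Swedish",    "sv" },
    { "Portuguese", "pt" },
};

/* Returns the ISO 639-1 code for a legacy language name, or NULL if
 * the name is not in the table. The lookup is case-sensitive. */
const char *iso_code_for_legacy_name(const char *name)
{
    assert(name != NULL);
    size_t count = sizeof legacy_table / sizeof legacy_table[0];
    for (size_t i = 0; i < count; i++) {
        if (strcmp(legacy_table[i].name, name) == 0)
            return legacy_table[i].iso_code;
    }
    return NULL;
}
```

A project that still ships French.lproj could use the result, fr, to decide that the directory should be renamed fr.lproj.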
© 2003, 2009 Apple Inc. All Rights Reserved. (Last updated: 2009-01-06)