Important: The information in this document is obsolete and should not be used for new development.
The 'itl2' Sorting Hooks
The string-manipulation resource contains five sorting hooks, each of which can modify the functioning of its equivalent default sorting routine that is built into Text Utilities. If the sorting hooks are all empty, the default U.S. Roman sorting behavior results. For example, the 'itl2' resource in the version of the Roman script system that has been localized for the United States contains the built-in sorting behavior and empty hooks. For other script systems, one or more of the hooks are replaced with actual routines, to handle characters that need to be sorted differently from the default--for example, the Spanish character combination "rr" or the Norwegian "ñ". Most of the sorting routines are called in turn for each character in each string of the pair being compared.
For information on providing custom versions of the sorting hooks, see "Supplying Custom Sorting Routines" on page B-43.
Here is what each of the routines does:
- Init routine. The init routine prepares two strings for comparison. The Text Utilities sorting routines compare a pair of strings byte for byte, and pass control to the init routine as soon as a pair of unequal byte values occurs. All the init routine does is check whether either of the byte values is the second byte of a 2-byte character (or other sorting unit, such as "rr" in Spanish). If it is, the init routine backs up one byte in the string and passes control to the fetch routine.
- Fetch routine. The fetch routine fetches the next sorting unit from each string, taking into account whether the unit is composed of one or two bytes. Many, though not all, characters in 2-byte script systems are 2 bytes long. Character combinations in 1-byte scripts can also be considered as single sorting units during sorting--such as "ch" in Spanish and "dz" in Croatian. For example, consider the second characters in these two strings:
b c h a
b c a d
In analyzing the second sorting unit of each string, English versions of the fetch routine would return "c" in each case. Spanish versions, which combine "c" and "h" into a singular sorting unit "ch", would return "ch" for the upper string and "c" for the lower string.
- Project routine. The project routine defines the primary sorting position for the individual sorting unit passed to it. In the example just presented, the English version of the project routine would give the same result for the second sorting unit of each string, whereas the Spanish version would give them different values.
Where secondary sorting exists, the project routine "projects" each character into the sorting position of its equivalent primary character (perhaps uppercase with diacritical marks stripped). For example, consider the following two strings:
b C a d f
B c å d g
The project routine would give identical results for all the character pairs until passed the "f" and "g". In terms of the project routine, the strings would be sorted as if they were
B C A D F
B C A D G
The Text Utilities use the project routine to establish decision characters to be used later if a primary difference is not available. The first pair of sorting units that have the same projected position but are not byte-for-byte identical is saved for this purpose. Those decision characters are acted upon by the vernier routine.
- Vernier routine. The vernier routine is the tie breaker that determines the sorting order for strings that are equivalent in terms of primary sorting. It defines the secondary sorting position for the sorting unit passed to it. Suppose, in the previous example, that the strings were
b c a d f
b c å d f
Primary equivalence exists between the two strings. The decision characters "a" and "å" are passed in turn to the vernier routine; the vernier routine passes back a sorting position for each one. The return values determine whether "a" sorts before or after "å", and thus establish the sorting order for the strings.
- Exit routine. This sorting hook exists to allow for any needed post-processing after the sorting order for a pair of strings has been determined. It is called just before the Text Utilities string-comparison routine returns to the caller.
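The way the five hooks cooperate can be illustrated with a small model. The following Python sketch is not the 'itl2' interface or the Text Utilities implementation. Every name in it (compare, init, fetch, project, vernier, exit_hook, SPANISH_DIGRAPHS) is hypothetical, and it works on Python characters rather than bytes. It also assumes that projection means "strip diacritical marks and uppercase", that a two-letter sorting unit sorts immediately after its first letter, that the vernier routine orders units by code point, and that a string sorts before any longer string it is a primary prefix of. None of these assumptions comes from the text above.

```python
import unicodedata

# Hypothetical table of two-letter sorting units for a Spanish-style ordering.
SPANISH_DIGRAPHS = ("ch", "ll", "rr")

def init(a, b, i, digraphs):
    """Back up one position if the mismatch at i is the second half of a unit."""
    if i > 0 and (a[i - 1:i + 1].lower() in digraphs
                  or b[i - 1:i + 1].lower() in digraphs):
        return i - 1
    return i

def fetch(s, i, digraphs):
    """Return the next sorting unit in s and the index that follows it."""
    pair = s[i:i + 2]
    if pair.lower() in digraphs:
        return pair, i + 2
    return s[i], i + 1

def project(unit):
    """Primary position: diacritical marks stripped, uppercased (an assumption)."""
    decomposed = unicodedata.normalize("NFD", unit)
    base = "".join(c for c in decomposed if not unicodedata.combining(c)).upper()
    base = base or unit
    # A two-letter unit such as "CH" lands just after its first letter.
    return (base[0], len(base) - 1)

def vernier(unit):
    """Secondary position: plain code point order (an assumption)."""
    return tuple(ord(c) for c in unit)

def exit_hook(a, b, result):
    """Post-processing hook; this sketch passes the result through unchanged."""
    return result

def compare(a, b, digraphs=()):
    """Return -1, 0, or 1 for a < b, a == b, a > b."""
    # Position-by-position scan up to the first unequal pair.
    i = 0
    while i < len(a) and i < len(b) and a[i] == b[i]:
        i += 1
    ia = ib = init(a, b, i, digraphs)

    decision = None  # first pair that is primary-equal but not identical
    result = 0
    while result == 0 and ia < len(a) and ib < len(b):
        ua, ia = fetch(a, ia, digraphs)
        ub, ib = fetch(b, ib, digraphs)
        pa, pb = project(ua), project(ub)
        if pa != pb:
            result = -1 if pa < pb else 1
        elif decision is None and ua != ub:
            decision = (ua, ub)

    if result == 0:
        if (ia < len(a)) != (ib < len(b)):
            # Assumption: a primary prefix sorts before the longer string.
            result = 1 if ia < len(a) else -1
        elif decision is not None:
            va, vb = vernier(decision[0]), vernier(decision[1])
            result = (va > vb) - (va < vb)
    return exit_hook(a, b, result)

# "\u00e5" is the precomposed character "å".
english = compare("bcha", "bcza")                    # "h" is compared with "z"
spanish = compare("bcha", "bcza", SPANISH_DIGRAPHS)  # "ch" is compared with "c"
primary = compare("bCadf", "Bc\u00e5dg")             # decided by "f" versus "g"
tiebreak = compare("bcadf", "bc\u00e5df")            # decided by the vernier step
```

Under these assumptions, the first call orders "bcha" before "bcza", while the second, which treats "ch" as one unit, orders it after. The third call is settled by the primary difference between "f" and "g", and the fourth falls back to the saved decision characters "a" and "å".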