Symbol Variants: Why Those Dollar Signs?

Since Mac OS X v10.4, the system library, libSystem.dylib, has included new symbols that contain dollar signs ($). This release note explains why they are there and how a developer might take advantage of them.

Software Evolution vs. Backward Compatibility

Software is always changing. New demands require new features to be implemented. Bugs are discovered and fixed. Hardware and lower layers of the OS change, sometimes requiring upper layers to adapt.

For a commercial operating system like Mac OS X, third-party software expects that the system software will always act the same way. This backward compatibility prolongs the users’ investment in software and fosters the notion of greater stability in the system.

The need to evolve the software often conflicts with the desire to provide backward compatibility, but innovative solutions can allow both. For example, the Classic environment was created specifically to run applications written for the classic Mac OS (the system that preceded Mac OS X) on Mac OS X. Similarly, the transition to Intel-based hardware spurred the creation of the Rosetta dynamic translation software to provide compatibility with PowerPC-based applications.

Early in the Mac OS X v10.4 timeframe, two major software initiatives required an equally innovative solution to maintain backward compatibility. First, to be certified for UNIX™ conformance, hundreds of system routines needed to be modified. Some changes were considered just bug fixes, but many of the changes required significant change in behavior to the largely BSD-style routines. These changes would surely break existing applications.

Note: With regard to Mac OS X, the general term, UNIX™ conformance, refers to the Single UNIX Specification, Version 3 (SUSv3), a combination of IEEE, ISO and Open Group standards, including that referred to as POSIX. Furthermore, UNIX 03 is the Open Group certified brand for SUSv3 conformance, for which Mac OS X 10.5 is registered. Previous brands include UNIX 95 and UNIX 98, and so we sometimes refer to UNIX 2003 just to avoid date confusion.

Secondly, up until v10.4, there was no support for real (PowerPC) long double floating-point numbers (or more correctly, the long double type was the same size as the regular 64-bit double floating-point type). Support for a real, 128-bit long double type would mean incompatible API changes. Code compiled for one type of long double would crash or produce incorrect results when using routines for the other long double type.

Symbol Variants

One way to allow routines to behave in new ways for new code but maintain the legacy behavior for previously compiled code is to use symbol versioning. In symbol versioning, different code can have the same symbol name, but have different version numbers. Unfortunately, this would require lots of changes to the compiler, linker and binary file format; a major undertaking that would have to happen before the real work could even be started.

So as an alternative, a feature of the gcc compiler was used: the __asm keyword can rename a symbol. A special suffix is added to the symbol name, and by using different suffixes, we can generate families of symbol variants.

That is where the dollar sign comes in. It is used as a separator between the real symbol name and the variant name. Because the dollar sign is not a valid identifier character in standard C, this avoids the possibility of symbol name collision.

This is what it looks like:

% nm /usr/lib/libSystem.dylib
...
00086e9f T _fputs
0003a14f T _fputs$UNIX2003
...

By Mac OS X convention, all variables and routine names are automatically prefixed with an underscore. The _fputs symbol is the legacy variant, while the _fputs$UNIX2003 symbol is the new, UNIX™ conforming one. All programs previously built will only know about the _fputs symbol, and will continue to use it and get legacy behavior, while new code can link to the new _fputs$UNIX2003 symbol and get UNIX™ conforming behavior.

A symbol may have more than one suffix. For instance:

000a6905 T _ftw
000a6603 T _ftw$INODE64$UNIX2003
000a6dc7 T _ftw$UNIX2003

Note: Multiple suffixes are always ordered alphabetically so that there is only one legal arrangement of the suffixes.
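The ordering rule can be illustrated with a small sketch that assembles a variant symbol name from a base name and a set of suffixes. The helper name variant_symbol_name is hypothetical (it is not a system routine), and the sketch assumes that "alphabetically" means plain byte-wise strcmp ordering, which matches the examples in this note but is not confirmed for every suffix.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* qsort comparator: byte-wise ordering of two suffix strings. */
static int compare_suffixes(const void *a, const void *b)
{
    const char *const *lhs = a;
    const char *const *rhs = b;
    return strcmp(*lhs, *rhs);
}

/*
 * Hypothetical helper, not a system API. Writes "_<base>$<s1>$<s2>..." into
 * out, with the suffixes in sorted order. The suffixes array is sorted in
 * place. Returns 0 on success, or -1 if the buffer is too small.
 */
int variant_symbol_name(char *out, size_t out_size, const char *base,
                        const char **suffixes, size_t count)
{
    qsort(suffixes, count, sizeof *suffixes, compare_suffixes);

    /* Leading underscore follows the Mac OS X symbol convention. */
    int n = snprintf(out, out_size, "_%s", base);
    if (n < 0 || (size_t)n >= out_size)
        return -1;
    size_t used = (size_t)n;

    for (size_t i = 0; i < count; i++) {
        n = snprintf(out + used, out_size - used, "$%s", suffixes[i]);
        if (n < 0 || (size_t)n >= out_size - used)
            return -1;
        used += (size_t)n;
    }
    return 0;
}
```

Passing the base name ftw with the suffixes UNIX2003 and INODE64 (in either order) yields _ftw$INODE64$UNIX2003, the arrangement shown in the nm listing above.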

Prototypes in Header Files

It is in the header files that the real magic occurs. For instance, to generate the _fputs$UNIX2003 symbol, we would need something like:

int fputs(const char * __restrict, FILE * __restrict) __asm("_fputs$UNIX2003");

In <stdio.h>, the actual prototype looks like:

int fputs(const char * __restrict, FILE * __restrict) __DARWIN_ALIAS(fputs);

The __DARWIN_ALIAS macro resolves to the necessary __asm command, as appropriate for the legacy or UNIX™ conforming variant.
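To make the mechanism concrete, here is a minimal sketch of how such a macro might be built. The names MY_ALIAS, MY_LEGACY_BUILD, and add_checked are hypothetical and exist only for this illustration; the real definitions in the system headers are private and may differ. The sketch assumes GCC or Clang, which predefine __USER_LABEL_PREFIX__ ("_" on Mac OS X, empty on most ELF systems).

```c
/* Two-step stringification so __USER_LABEL_PREFIX__ is expanded first. */
#define MY_STR2(x) #x
#define MY_STR(x) MY_STR2(x)

/*
 * MY_ALIAS and MY_LEGACY_BUILD are illustrative names, not system macros.
 * In a legacy build the prototype keeps its unadorned symbol; otherwise the
 * symbol is renamed to "<prefix><name>$UNIX2003" through an __asm label.
 */
#ifdef MY_LEGACY_BUILD
#define MY_ALIAS(sym) /* legacy build: keep the unadorned symbol name */
#else
#define MY_ALIAS(sym) __asm(MY_STR(__USER_LABEL_PREFIX__) #sym "$UNIX2003")
#endif

/* Callers write add_checked(); the object file carries the renamed symbol. */
int add_checked(int a, int b) MY_ALIAS(add_checked);

/* The definition inherits the assembler name from the prototype above. */
int add_checked(int a, int b)
{
    return a + b;
}
```

Source code calls add_checked() as usual. Running nm on the resulting object file should list the suffixed symbol instead of the plain one, which is why every caller must see the prototype from the header.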

Important: Including the appropriate header file for each system variable and routine used is now more important than ever. Use the -Wmissing-prototypes or equivalent option to the compiler to detect a global function without a prototype, and then see the manual page to find out what header file or files to include. (See the documentation for man(1) and manpages(5) for more information about manual pages.)

Important: As per convention, preprocessor macros beginning with two underscores are private to the implementation and should not be used by the user because they may change at any time. Preprocessor macros beginning with a single underscore are owned by the implementation, but can be used by the user when appropriate.

Preprocessor Macros Controlling the Variants

The UNIX™ conformance variants use the $UNIX2003 suffix.

Note: When the UNIX™ conformance variants are used, the feature test macro _DARWIN_FEATURE_UNIX_CONFORMANCE is defined. Its value is 3, corresponding to SUSv3. The macro will be undefined otherwise (legacy behavior).
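A program can test this macro at compile time to find out which behavior it was built against. The following is a minimal sketch; the function name unix_conformance_level is hypothetical, and the sketch assumes that including any system header (here <stdio.h>) is enough for the feature test macro to be defined when it applies.

```c
#include <stdio.h> /* any system header; the headers define the macro */

/*
 * Hypothetical helper: returns the value of the feature test macro
 * (3 for SUSv3 per this note), or 0 when the macro is undefined, as with
 * legacy behavior or on a platform that does not provide it.
 */
int unix_conformance_level(void)
{
#ifdef _DARWIN_FEATURE_UNIX_CONFORMANCE
    return _DARWIN_FEATURE_UNIX_CONFORMANCE;
#else
    return 0;
#endif
}
```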

Important: The work for UNIX™ conformance started in Mac OS X 10.4, but was not completed until 10.5. Thus, many of the conforming variant symbols (with the $UNIX2003 suffix) already exist in the 10.4 versions of libSystem.dylib. However, the list is not complete, and the conforming behavior of those symbols may not be complete either, so they should be avoided on 10.4.

Because the 64-bit environment has no legacy to maintain, it was created to be UNIX™ conforming from the start, without the use of the $UNIX2003 suffix. So, for example, _fputs$UNIX2003 in 32-bit and _fputs in 64-bit will have the same conforming behavior.

As of Mac OS X 10.5, UNIX™ conformance is on by default, and newly compiled code will link against the UNIX™ conformance variants, unless overridden with the following five macros.

_POSIX_C_SOURCE and _XOPEN_SOURCE

The _POSIX_C_SOURCE and _XOPEN_SOURCE macros are often set to specify the various levels of standards support. On Mac OS X, only SUSv3 is supported, so the actual value of these macros is not used (but they are reset to appropriate values when necessary).

When either or both of these macros are set, the UNIX™ conforming variants will be used. In addition, unless _DARWIN_C_SOURCE is also set (see below), these macros will cause the hiding of any variable, routine, structure, etc., in covered header files that are not specified in the standards. (These extra definitions are referred to as extensions to the standards.) Thus, only SUSv3 definitions will be visible in those header files.

_DARWIN_C_SOURCE

The _DARWIN_C_SOURCE macro (defined to any value), causes the UNIX™ conforming variants to be used, but does not hide the extensions to the standards, as _POSIX_C_SOURCE and _XOPEN_SOURCE do. The _DARWIN_C_SOURCE macro can be used in conjunction with the _POSIX_C_SOURCE and _XOPEN_SOURCE macros, with the _DARWIN_C_SOURCE behavior overriding the other two, allowing the extensions to the standards to be visible.

In addition, the _DARWIN_C_SOURCE macro will enable a few other extensions to the standards. These extensions occur where the SUSv3 standard puts additional limitations on the functionality beyond that of legacy (and, typically, BSD) behavior. The extension variants use the $DARWIN_EXTSN suffix, and can also be enabled with separate macros. (See the macro descriptions below.)

Note: The difference between setting the _DARWIN_C_SOURCE macro and not setting any macro, is that the variants with the $DARWIN_EXTSN suffix will also be used.

_NONSTD_SOURCE

The _NONSTD_SOURCE macro can be used to turn off the default UNIX™ conformance, and allow code to be built with legacy behavior. However, this macro will produce a compiler error when any of the above macros are set.

Important: The _NONSTD_SOURCE macro is not recommended, and should not be used in new code. Its only use should be to allow temporary functionality of old code until it can properly be updated for UNIX™ compliance.

Note: The use of the _NONSTD_SOURCE macro in the 64-bit environment is illegal.

MACOSX_DEPLOYMENT_TARGET

When none of the previous four macros are set, the variants chosen are affected by the environment variable MACOSX_DEPLOYMENT_TARGET or the -mmacosx-version-min=... argument passed to the compiler. For example, you might pass -mmacosx-version-min=10.5 to the compiler or set MACOSX_DEPLOYMENT_TARGET=10.5 to target Mac OS X v10.5.

If you target version 10.5 or later, the UNIX™ conforming variants are used automatically. If you target version 10.4 or earlier, the legacy variants are used. (There are other side effects as well, such as disabling newer linker features.)

In Mac OS X 10.5 or later, if you do not use this or the previous four macros, the UNIX™ conforming variants are used by default.

In addition, if you target version 10.5 or later (or by default if you do not target a specific version), the Mac OS X 10.5 variants (those with the $1050 suffix) are used. These variants have significant new behavior that might cause previously compiled programs to misbehave. For example, the (legacy) _select routine imposed a minimum timeout value of 10 milliseconds; the new _select$1050 routine has no such minimum.
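As an illustration of code whose timing could differ between the two variants, the following sketch uses select() with no descriptors as a short sleep. The function name sleep_with_select is hypothetical. Based on the description above, a request shorter than 10 milliseconds would be stretched to that minimum by the legacy routine but not by the $1050 variant; the actual elapsed time also depends on the scheduler, so the sketch does not measure it.

```c
#include <stddef.h>
#include <sys/select.h>
#include <sys/time.h>

/*
 * Hypothetical helper: waits for the given number of microseconds by
 * calling select() with no descriptor sets. Returns select()'s result:
 * 0 when the timeout expires, -1 on error (for example, if interrupted
 * by a signal).
 */
int sleep_with_select(long microseconds)
{
    struct timeval timeout;

    timeout.tv_sec = microseconds / 1000000L;
    timeout.tv_usec = microseconds % 1000000L;
    return select(0, NULL, NULL, NULL, &timeout);
}
```

Calling sleep_with_select(1000) asks for a 1 millisecond wait. Code that relied on the old 10 millisecond floor (for example, a polling loop) may spin noticeably faster once it is rebuilt for 10.5.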

MACOSX_DEPLOYMENT_TARGET (32-bit PowerPC only)

Setting MACOSX_DEPLOYMENT_TARGET to 10.4 or later (or passing -mmacosx-version-min=10.4 or later to the compiler) enables 128-bit long double support. Routines that pass long doubles either directly (like strtold) or indirectly (like printf and family) have 128-bit long double variants, which use the $LDBL128 suffix, while the legacy 64-bit long double variants are unadorned.

Important: There are also variants with the $LDBL64 suffix. These symbols are currently unused; they might be used in the future, or even removed. Do not use them.

What happens when MACOSX_DEPLOYMENT_TARGET is set to 10.3 or earlier (or is not set at all) is more complicated. During the development of Mac OS X v10.4, it was desired to have 128-bit long double support be the default. However, when MACOSX_DEPLOYMENT_TARGET is not set, the build would default to the behavior of three releases earlier, that of 10.1. While it was possible to back-port the standard C library routines that used long double support to 10.3 (though not the math routines), they couldn’t be back-ported to 10.1 or 10.2.

Note: Only the latest update to 10.3 (10.3.9) contains the back-ported routines.

As a compromise, when MACOSX_DEPLOYMENT_TARGET is not set, or set to 10.3 or earlier, the header files use a different variant, with the $LDBLStub suffix. The compiler instructs the loader to link against /usr/lib/libSystemStubs.a, where the $LDBLStub suffixed symbols are defined. At runtime, these assembly language stubs try to look up the symbol with the same base name and a $LDBL128 suffix. If a stub finds that symbol, it uses it; otherwise, it calls the unadorned symbol name. This allows the code to adapt to whichever symbols are actually available in the current system library.
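The fallback decision made by a stub can be modeled in a few lines of C. This is only a sketch of the logic described above: the real stubs are written in assembly language and consult the running system library, whereas this model searches a hypothetical in-memory list of exported names. The struct symbol_table type and both function names are assumptions made for illustration.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical model of the symbols exported by a system library. */
struct symbol_table {
    const char *const *names;
    size_t count;
};

/* Returns 1 if the table exports the given name, 0 otherwise. */
static int has_symbol(const struct symbol_table *table, const char *name)
{
    for (size_t i = 0; i < table->count; i++) {
        if (strcmp(table->names[i], name) == 0)
            return 1;
    }
    return 0;
}

/*
 * Writes into out the symbol that a stub for base would end up calling:
 * "<base>$LDBL128" when the table exports it, otherwise the unadorned base.
 * Returns 0 on success, or -1 if out is too small.
 */
int resolve_ldbl_stub(const struct symbol_table *table, const char *base,
                      char *out, size_t out_size)
{
    int n = snprintf(out, out_size, "%s$LDBL128", base);
    if (n < 0 || (size_t)n >= out_size)
        return -1;
    if (has_symbol(table, out))
        return 0;

    /* No 128-bit variant available: fall back to the legacy symbol. */
    n = snprintf(out, out_size, "%s", base);
    return (n < 0 || (size_t)n >= out_size) ? -1 : 0;
}
```

With a table that exports both _strtold and _strtold$LDBL128, the model resolves to the suffixed name; with a table that exports only _strtold, it resolves to the unadorned one.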

Note: This is not without risk. Code compiled using 128-bit long doubles may still crash or give incorrect results on systems without 128-bit long double support. However, given that the actual use of long doubles is minimal, this scheme allows 128-bit long doubles to be the default, and at the same time, if long doubles are not actually used (or if the code adapts to using 64-bit long doubles), the code will safely run on previous versions of Mac OS X without 128-bit long double support.

Note: While the compiler will instruct the loader to link against /usr/lib/libSystemStubs.a, the loader itself does not know to do this. Projects that call the loader directly may fail to resolve the $LDBLStub variants. In these cases, calls to the loader should be replaced with calls to the compiler to do the linking.

On Mac OS X 10.5, not setting MACOSX_DEPLOYMENT_TARGET and not using -mmacosx-version-min will result in 10.5 behavior by default, so the $LDBL128 variants are used instead of the $LDBLStub variants. However, as before, if MACOSX_DEPLOYMENT_TARGET is set to 10.3 or earlier, the $LDBLStub variants are used.

Note: Passing the -mlong-double-64 option to the compiler also forces legacy 64-bit long double support.

Note: When legacy 64-bit long double support is used, the feature test macro _DARWIN_FEATURE_LONG_DOUBLE_IS_DOUBLE is defined and set to 1. When 128-bit long double support is used, this macro is undefined. It is also undefined for architectures other than 32-bit PowerPC.
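Code that needs to know which long double it was compiled with can check the feature test macro. The sketch below is one possible approach, and the function name long_double_is_plain_double is hypothetical. Because the macro is undefined on architectures other than 32-bit PowerPC, the sketch falls back to comparing the mantissa widths from <float.h>, which is an assumption about what "the same as double" should mean elsewhere.

```c
#include <float.h>
#include <stdio.h> /* any system header; the headers define the macro */

/*
 * Hypothetical helper: returns 1 when long double carries no more precision
 * than double, 0 otherwise.
 */
int long_double_is_plain_double(void)
{
#ifdef _DARWIN_FEATURE_LONG_DOUBLE_IS_DOUBLE
    /* 32-bit PowerPC built with legacy 64-bit long double support. */
    return 1;
#else
    /* Macro undefined: compare the mantissa widths directly. */
    return LDBL_MANT_DIG == DBL_MANT_DIG;
#endif
}
```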

_DARWIN_UNLIMITED_SELECT

Setting the _DARWIN_UNLIMITED_SELECT macro will select the extension variants of select() and pselect(), which use the $DARWIN_EXTSN suffix. The extended versions do not fail if the first argument is greater than FD_SETSIZE. This was the original BSD behavior.

Note: Setting _DARWIN_C_SOURCE will also enable the extension variants.

_DARWIN_BETTER_REALPATH

Setting the _DARWIN_BETTER_REALPATH macro selects the extension variant of realpath(), which uses the $DARWIN_EXTSN suffix. The extended version uses a fast shortcut to determine the current working directory. Unlike the behavior dictated by the standards, this shortcut does not fail if the parent directories are not readable.

Note: Setting _DARWIN_C_SOURCE will also enable the extension variant.

_DARWIN_USE_64_BIT_INODE

Setting the _DARWIN_USE_64_BIT_INODE macro selects the 64-bit inode variants (those with a $INODE64 suffix) of routines like stat() and readdir(), which return an ino_t value. The current default is to return legacy 32-bit ino_t values, but 64-bit ino_t values will become the default in the future. With 64-bit inodes, structures containing ino_t fields, like struct stat, are larger than their 32-bit ino_t counterparts, may order their fields differently (to improve packing efficiency), and may have entirely new fields.

Note: When 64-bit inode support is used, the feature test macro _DARWIN_FEATURE_64_BIT_INODE is defined and set to 1. It is undefined otherwise (for 32-bit inodes).

Important: Some of the routines returning ino_t values also have symbols ending in “64”, like stat64() and getmntinfo64(). These routines always return 64-bit ino_t values, and were used for early adoption before the 64-bit inode variants were available. These “early adoption” symbols will likely be deprecated in the future, so the use of the _DARWIN_USE_64_BIT_INODE macro for 64-bit inode variants is the recommended way to get 64-bit inodes.
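The following sketch shows one way a source file might opt in to the 64-bit inode variants and then confirm the choice through the feature test macro. The function names using_64_bit_inodes and inode_number are hypothetical. The sketch assumes the controlling macro must be defined before the first system header is included, and it widens the inode to unsigned long long so the same code works with either ino_t size.

```c
/* Assumed to be required before the first system header is included. */
#define _DARWIN_USE_64_BIT_INODE 1

#include <sys/types.h>
#include <sys/stat.h>

/* Hypothetical helper: 1 if the headers selected the 64-bit inode variants. */
int using_64_bit_inodes(void)
{
#ifdef _DARWIN_FEATURE_64_BIT_INODE
    return 1;
#else
    return 0;
#endif
}

/*
 * Hypothetical helper: stores the inode number of path in *out, widened to
 * unsigned long long. Returns 0 on success, or -1 if stat() fails.
 */
int inode_number(const char *path, unsigned long long *out)
{
    struct stat info;

    if (stat(path, &info) != 0)
        return -1;
    *out = (unsigned long long)info.st_ino;
    return 0;
}
```

Because struct stat changes size and layout between the two variants, every source file that shares a struct stat with another should be compiled with the same setting.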

Summary of Variants

The table below summarizes the variant suffixes and their corresponding controlling macros and feature test macros.

Variant Suffix	Controlling Preprocessor Macro	Feature Test Macro	Description
`$1050`	`MACOSX_DEPLOYMENT_TARGET`		Mac OS X 10.5 and later behavior change
`$DARWIN_EXTSN`	`_DARWIN_C_SOURCE` or specific extension macro		Extended behavior beyond standards
`$INODE64`	`_DARWIN_USE_64_BIT_INODE`	`_DARWIN_FEATURE_64_BIT_INODE`	64-bit `ino_t` values
`$LDBL128`	`MACOSX_DEPLOYMENT_TARGET`	`_DARWIN_FEATURE_LONG_DOUBLE_IS_DOUBLE` (negation)	128-bit long double support (32-bit PowerPC only)
`$UNIX2003`	`_POSIX_C_SOURCE` `_XOPEN_SOURCE` `_DARWIN_C_SOURCE` `_NONSTD_SOURCE` `MACOSX_DEPLOYMENT_TARGET`	`_DARWIN_FEATURE_UNIX_CONFORMANCE`	UNIX™ conformance
`$NOCANCEL` `$fenv_access_off`			(used internally)
`$LDBL64`			(unused; may be removed)

gdb Support

When setting a breakpoint at a routine that has multiple variants, gdb will set a breakpoint at each variant. The delete command can be used to remove any unwanted breakpoint at any of the variants. For example:

(gdb) b select
Breakpoint 1 at 0x85edd
Breakpoint 2 at 0x54592
Breakpoint 3 at 0x4ff50
Breakpoint 4 at 0x37b44
Breakpoint 5 at 0x37b09
warning: Multiple breakpoints were set.
Use the "delete" command to delete unwanted breakpoints.
(gdb) info breakpoints
Num Type           Disp Enb Address    What
1   breakpoint     keep y   0x00085edd <select+6>
2   breakpoint     keep y   0x00054592 <select$UNIX2003+6>
3   breakpoint     keep y   0x0004ff50 <select$DARWIN_EXTSN>
4   breakpoint     keep y   0x00037b44 <select$DARWIN_EXTSN$NOCANCEL>
5   breakpoint     keep y   0x00037b09 <select$NOCANCEL$UNIX2003+6>
