
Miscellaneous Kernel Services

This chapter contains information about miscellaneous services provided by the Mac OS X kernel. For most projects, you will probably never need to use most of these services, but if you do, you will find it hard to do without them.

This chapter contains these sections: “Using Kernel Time Abstractions,” “Boot Option Handling,” “Queues,” and “Installing Shutdown Hooks.”

Using Kernel Time Abstractions

There are two basic groups of time abstractions in the kernel. One group includes functions that provide delays and timed wake-ups. The other group includes functions and variables that provide the current wall clock time, the time used by a given process, and other similar information. This section describes both aspects of time from the perspective of the kernel.

Obtaining Time Information

There are a number of ways to get basic time information from within the kernel. The officially approved methods are those that Mach exports in kern/clock.h. These include the following:

void clock_get_uptime(uint64_t *result);
 
void clock_get_system_microtime(            uint32_t *secs,
                                            uint32_t *microsecs);
 
void clock_get_system_nanotime(             uint32_t *secs,
                                            uint32_t *nanosecs);
void clock_get_calendar_microtime(          uint32_t *secs,
                                            uint32_t *microsecs);
 
void clock_get_calendar_nanotime(           uint32_t *secs,
                                            uint32_t *nanosecs);
 

The function clock_get_uptime returns a value in AbsoluteTime units. For more information on using AbsoluteTime, see “Using Mach Absolute Time Functions.”

The functions clock_get_system_microtime and clock_get_system_nanotime return 32-bit integers containing seconds and microseconds or nanoseconds, respectively, representing the system uptime.

The functions clock_get_calendar_microtime and clock_get_calendar_nanotime return 32-bit integers containing seconds and microseconds or nanoseconds, respectively, representing the current calendar date and time since the epoch (January 1, 1970).
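Because these functions hand back the seconds and the sub-second remainder as two separate 32-bit values, code that compares or stores timestamps often folds them into one 64-bit count. The following is a minimal sketch of that arithmetic in portable C. The helper name sketch_combine_secs_usecs is hypothetical and is not part of kern/clock.h; the kernel call that would supply the inputs appears only in a comment.

```c
#include <stdint.h>

#define SKETCH_USEC_PER_SEC 1000000ULL  /* microseconds in one second */

/*
 * Hypothetical helper (not part of kern/clock.h): combine the two
 * 32-bit halves into one 64-bit microsecond count.
 *
 * In kernel code the inputs would come from a call such as:
 *     uint32_t secs, usecs;
 *     clock_get_system_microtime(&secs, &usecs);
 */
uint64_t sketch_combine_secs_usecs(uint32_t secs, uint32_t usecs)
{
    /* Widen before multiplying so the product cannot wrap at 32 bits. */
    return (uint64_t)secs * SKETCH_USEC_PER_SEC + usecs;
}
```

The same pattern applies to the nanosecond variants with a multiplier of 1000000000.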

In some parts of the kernel, you may find other functions that return type mach_timespec_t. This type is similar to the traditional BSD struct timeval, except that fractions of a second are measured in nanoseconds instead of microseconds:

struct mach_timespec {
    unsigned int tv_sec;
    clock_res_t tv_nsec;
};
typedef struct mach_timespec mach_timespec_t;
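When you do arithmetic on a value of this shape, the nanosecond field has to be carried into the seconds field once it reaches one billion. The sketch below illustrates that carry using a local stand-in structure, so that it compiles outside the kernel. The structure name, the helper name, and the assumption that clock_res_t is an int are all illustrative choices, not taken from the Mach headers.

```c
#include <stdint.h>

/* Local stand-in mirroring the two fields shown above. Real kernel
 * code would use the definition from the Mach headers instead. */
struct sketch_mach_timespec {
    unsigned int tv_sec;   /* whole seconds */
    int          tv_nsec;  /* nanoseconds; clock_res_t assumed to be int */
};

#define SKETCH_NSEC_PER_SEC 1000000000

/*
 * Hypothetical helper: return ts advanced by add_nsec nanoseconds,
 * keeping tv_nsec below one second. Assumes ts is already normalized
 * (0 <= tv_nsec < SKETCH_NSEC_PER_SEC).
 */
struct sketch_mach_timespec
sketch_timespec_add_ns(struct sketch_mach_timespec ts, uint64_t add_nsec)
{
    uint64_t total = (uint64_t)ts.tv_nsec + add_nsec;

    ts.tv_sec += (unsigned int)(total / SKETCH_NSEC_PER_SEC); /* carry */
    ts.tv_nsec = (int)(total % SKETCH_NSEC_PER_SEC);          /* remainder */
    return ts;
}
```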

In addition to the traditional Mach functions, if you are writing code in BSD portions of the kernel you can also get the current calendar (wall clock) time as a BSD timeval, as well as find out the calendar time when the system was booted by doing the following:

#include <sys/kernel.h>
struct timeval tv=time; /* calendar time */
struct timeval tv_boot=boottime; /* calendar time when booting occurred  */

For other information, you should use the Mach functions listed previously.

Event and Timer Waits

Each part of the Mac OS X kernel has a distinct API for waiting a certain period of time. In most cases, you can call these functions from other parts of the kernel. The I/O Kit provides IODelay and IOSleep. Mach provides functions based on AbsoluteTime, as well as a few based on microseconds. BSD provides msleep.

Using IODelay and IOSleep

IODelay, provided by the I/O Kit, abstracts a timed spin. If you are delaying for a short period of time, and if you need to be guaranteed that your wait will not be stopped prematurely by delivery of asynchronous events, this is probably the best choice. If you need to delay for several seconds, however, this is a bad choice, because the CPU that executes the wait will spin until the time has elapsed, unable to handle any other processing.

IOSleep puts the currently executing thread to sleep for a certain period of time. There is no guarantee that your thread will execute after that period of time, nor is there a guarantee that your thread will not be awakened by some other event before the time has expired. It is roughly equivalent to the sleep call from user space in this regard.

The use of IODelay and IOSleep is straightforward. Their prototypes are:

void IODelay(unsigned microseconds);
void IOSleep(unsigned milliseconds);

Note the differing units. It is not practical to put a thread to sleep for periods measured in microseconds, and spinning for several milliseconds is also inappropriate.
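One way to keep the two units straight is to decide in a single place whether a given wait should spin or sleep. The following sketch shows that decision in portable C. The 1000-microsecond cutoff, the structure, and the function name are assumptions made for illustration; the right cutoff depends on your driver and hardware.

```c
#include <stdint.h>

/* Assumed cutoff for this sketch only: spin for waits under 1 ms. */
#define SKETCH_SPIN_LIMIT_US 1000u

struct sketch_wait_plan {
    int      use_sleep;  /* 0: spin with IODelay, 1: block with IOSleep */
    unsigned amount;     /* microseconds for IODelay, milliseconds for IOSleep */
};

/* Hypothetical helper: pick the call and convert to its unit. */
struct sketch_wait_plan sketch_plan_wait(unsigned microseconds)
{
    struct sketch_wait_plan plan;

    if (microseconds < SKETCH_SPIN_LIMIT_US) {
        plan.use_sleep = 0;
        plan.amount = microseconds;
    } else {
        plan.use_sleep = 1;
        /* Round up so the requested sleep is not shorter than the delay. */
        plan.amount = microseconds / 1000u + (microseconds % 1000u != 0);
    }
    return plan;
}
```

A driver would then call IODelay(plan.amount) when use_sleep is 0 and IOSleep(plan.amount) otherwise.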

Using Mach Absolute Time Functions

The following Mach time functions are commonly used. Several others are described in osfmk/kern/clock.h.

Note: These are not the same functions as those listed in kern/clock.h in the Kernel framework. These functions are not exposed to kernel extensions, and are only for use within the kernel itself.

void delay(uint64_t microseconds);
void clock_delay_until(uint64_t deadline);
void clock_absolutetime_interval_to_deadline(uint64_t abstime,
            uint64_t *result);
void nanoseconds_to_absolutetime(uint64_t nanoseconds, uint64_t  *result);
void absolutetime_to_nanoseconds(uint64_t abstime, uint64_t *result);

These functions are generally straightforward. However, a few points deserve explanation. Unless specifically stated, all times, deadlines, and so on, are measured in abstime units. The abstime unit is equal to the length of one bus cycle, so the duration is dependent on the bus speed of the computer. For this reason, Mach provides conversion routines between abstime units and nanoseconds.

Many time functions, however, provide time in seconds with a nanosecond remainder. In this case, some conversion is necessary. For example, to obtain the current time as a Mach abstime value, you might do the following:

uint32_t secpart;
uint32_t nsecpart;
uint64_t nsec, abstime;
 
clock_get_calendar_nanotime(&secpart, &nsecpart);
nsec = nsecpart + (1000000000ULL * secpart); /* convert seconds to nanoseconds */
nanoseconds_to_absolutetime(nsec, &abstime);

The abstime value is now stored in the variable abstime.
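Conceptually, converting between abstime units and nanoseconds is a multiplication by a fixed ratio that depends on the hardware. The sketch below models that idea in portable C so the arithmetic can be seen in isolation. The sketch_timebase structure and both helper names are hypothetical; this is an illustration of the scaling, not the kernel's implementation of nanoseconds_to_absolutetime or absolutetime_to_nanoseconds.

```c
#include <stdint.h>

/* Hypothetical ratio: one abstime unit lasts numer/denom nanoseconds. */
struct sketch_timebase {
    uint32_t numer;
    uint32_t denom;
};

/* Scale abstime units to nanoseconds (can overflow for very large inputs). */
uint64_t sketch_abstime_to_ns(uint64_t abstime, struct sketch_timebase tb)
{
    return abstime * tb.numer / tb.denom;
}

/* Scale nanoseconds to abstime units, rounding down. */
uint64_t sketch_ns_to_abstime(uint64_t ns, struct sketch_timebase tb)
{
    return ns * tb.denom / tb.numer;
}
```

With a ratio of 125/3, for example, three abstime units correspond to 125 nanoseconds. Because the ratio differs between machines, kernel code should always go through the Mach conversion routines rather than hard-coding one.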

Using msleep

In addition to Mach and I/O Kit routines, BSD provides msleep, which is the recommended way to delay in the BSD portions of the kernel. In other parts of the kernel, you should either use wait_queue functions or use assert_wait and thread_wakeup functions, both of which are closely tied to the Mach scheduler, and are described in “Kernel Thread APIs.”

The msleep call is similar to a condition variable. It puts a thread to sleep until wakeup or wakeup_one is called on that channel. Unlike a condition variable, however, you can also set a timeout. This means that it is both a synchronization call and a delay. The prototypes follow:

int msleep(void *channel, lck_mtx_t *mtx, int priority, const char *wmesg, struct timespec *timeout);
int msleep0(void *channel, lck_mtx_t *mtx, int priority, const char *wmesg, uint64_t deadline);
void wakeup(void *channel);
void wakeup_one(void *channel);

The two sleep calls are similar except in the mechanism used for timeouts. The function msleep0 is not recommended for general use.

In these functions, channel is a unique identifier representing a single condition upon which you are waiting. Normally, when msleep is used, you are waiting for a change to occur in a data structure. In such cases, it is common to use the address of that data structure as the value for channel, as this ensures that no code elsewhere in the system will be using the same value.

The priority argument has two effects. First, when wakeup is called, threads are inserted in the scheduling queue at this priority. Second, the value of priority modifies signal delivery behavior. If the value of priority is negative, signal delivery cannot wake the thread early. If the bit (priority & PCATCH) is set, msleep0 does not call the continuation function upon waking up from sleep and returns a value of 1.

The wmesg argument is a short text string that represents the subsystem that is waiting on this channel. This is used solely for debugging purposes.

The timeout argument is used to set a maximum wait time. The thread may wake sooner, however, if wakeup or wakeup_one is called on the appropriate channel. It may also wake sooner if a signal is received, depending on the value of priority. In the case of msleep0, this is given as a Mach abstime deadline. In the case of msleep, this is given in relative time (seconds and nanoseconds).
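Since msleep takes its timeout as a relative struct timespec, a caller that thinks in milliseconds needs a small conversion first. The following sketch shows one way to build that value in portable C. The helper name is hypothetical, and the kernel-side call in the trailing comment uses made-up names (sc, sc->ready, sc->mtx, and the "mywait" string) purely for illustration.

```c
#include <stdint.h>
#include <time.h>

/* Hypothetical helper: build a relative timeout from milliseconds. */
struct timespec sketch_ms_to_timespec(uint32_t milliseconds)
{
    struct timespec ts;

    ts.tv_sec  = milliseconds / 1000u;                     /* whole seconds */
    ts.tv_nsec = (long)(milliseconds % 1000u) * 1000000L;  /* remainder in ns */
    return ts;
}

/*
 * Intended kernel-side use (not compiled here; all names hypothetical):
 *
 *     struct timespec ts = sketch_ms_to_timespec(1500);
 *     lck_mtx_lock(sc->mtx);
 *     error = msleep(&sc->ready, sc->mtx, PZERO, "mywait", &ts);
 *     lck_mtx_unlock(sc->mtx);
 *
 * Another thread changes sc->ready and then calls wakeup(&sc->ready),
 * using the same address as the channel.
 */
```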

Handling Version Dependencies

Many time-related functions such as clock_get_uptime changed as a result of the transition to KPIs in Mac OS X v.10.4. While these changes result in a cleaner interface, they can prove challenging if you need to obtain time information across multiple versions of Mac OS X in a kernel extension that would otherwise have no version dependencies (such as an I/O Kit KEXT).

Here is a list of time-related functions that are available in both pre-KPI and KPI versions of Mac OS X:

uint64_t mach_absolute_time(void);

Declared In: <mach/mach_time.h>

Dependency: com.apple.kernel.mach

This function returns the current Mach absolute time as a uint64_t value. The value is a count of abstime units, not a calendar (wall clock) time.

void microtime(struct timeval *tv);

Declared In: <sys/time.h>

Dependency: com.apple.kernel.bsd

This function returns a timeval struct containing the current wall clock time.

void microuptime(struct timeval *tv);

Declared In: <sys/time.h>

Dependency: com.apple.kernel.bsd

This function returns a timeval struct containing the current uptime.

void nanotime(struct timespec *ts);

Declared In: <sys/time.h>

Dependency: com.apple.kernel.bsd

This function returns a timespec struct containing the current wall clock time.

void nanouptime(struct timespec *ts);

Declared In: <sys/time.h>

Dependency: com.apple.kernel.bsd

This function returns a timespec struct containing the current uptime.

Note: The structure declarations for struct timeval and struct timespec differ between 10.3 and 10.4 in their use of int, int32_t, and long data types. However, because the structure packing for the underlying data types is identical in the 32-bit world, these structures are assignment compatible.

In addition to these APIs, the functionality marked __APPLE_API_UNSTABLE in <mach/time_value.h> was adopted as-is in Mac OS X v.10.4 and is no longer marked unstable.
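A common use of the uptime functions listed above is measuring how long an operation took. The sketch below shows the subtraction on two struct timeval samples in portable C. The helper name is hypothetical; in a KEXT, both samples would be filled in by microuptime, which is preferable to microtime for intervals because uptime is not affected by changes to the calendar clock.

```c
#include <stdint.h>
#include <sys/time.h>

/*
 * Hypothetical helper: microseconds elapsed between two samples.
 * The result is negative if end is earlier than start.
 *
 * Kernel-side use would look like:
 *     struct timeval before, after;
 *     microuptime(&before);
 *     ... work being timed ...
 *     microuptime(&after);
 */
int64_t sketch_elapsed_usecs(const struct timeval *start,
                             const struct timeval *end)
{
    int64_t secs  = (int64_t)end->tv_sec - (int64_t)start->tv_sec;
    int64_t usecs = (int64_t)end->tv_usec - (int64_t)start->tv_usec;

    /* A negative usecs term is absorbed by the seconds term. */
    return secs * 1000000LL + usecs;
}
```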

Boot Option Handling

Mac OS X provides a simple parse routine, PE_parse_boot_arg, for basic boot argument passing. It supports both flags and numerical value assignment. For obtaining values, you write code similar to the following:

unsigned int argval;
 
if (PE_parse_boot_arg("argflag", &argval)) {
    /* check for reasonable value */
    if (argval < 10 || argval > 37)
        argval = 37;
} else {
    /* use default value */
    argval = 37;
}

Since PE_parse_boot_arg returns a nonzero value if the flag exists, you can check for the presence of a flag by using a flag that starts with a dash (-) and ignoring the value stored in argval.

The PE_parse_boot_arg function can also be used to get a string argument. To do this, you must pass in the address of an array of type char as the second argument. The behavior of PE_parse_boot_arg is undefined if a string is passed in for a numeric variable or vice versa. Its behavior is also undefined if a string exceeds the storage space allocated. Be sure to allow enough space for the largest reasonable string including a null delimiter. No attempt is made at bounds checking, since an overflow is generally a fatal error and should reasonably prevent booting.
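To make the buffer-size concern concrete, the following user-space sketch looks up a name in a space-separated argument string and refuses to copy a value that does not fit. It is a simplified stand-in written for illustration, not the kernel's PE_parse_boot_arg, and the function name and return convention are assumptions. It shows the kind of bounds check that the real routine leaves to the caller.

```c
#include <stddef.h>
#include <string.h>

/*
 * User-space stand-in, NOT the kernel's PE_parse_boot_arg: look up
 * `name` in a space-separated argument string. Returns 1 if the
 * argument is present and its value fits, 0 otherwise. On success,
 * out holds the text after '=' (an empty string for a bare flag).
 */
int sketch_find_boot_arg(const char *args, const char *name,
                         char *out, size_t outlen)
{
    size_t namelen = strlen(name);
    const char *p = args;

    while (*p != '\0') {
        while (*p == ' ')
            p++;                              /* skip separators */
        size_t toklen = strcspn(p, " ");      /* length of this token */

        if (toklen >= namelen && strncmp(p, name, namelen) == 0 &&
            (toklen == namelen || p[namelen] == '=')) {
            const char *val = p + namelen;
            size_t vallen = toklen - namelen;

            if (vallen > 0) {                 /* step over the '=' */
                val++;
                vallen--;
            }
            if (vallen + 1 > outlen)
                return 0;                     /* value plus NUL would not fit */
            memcpy(out, val, vallen);
            out[vallen] = '\0';
            return 1;
        }
        p += toklen;                          /* move on to the next token */
    }
    return 0;
}
```

With the argument string "-v debug=0x144", looking up "debug" yields the string "0x144", and looking up "-v" succeeds with an empty value.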

Queues

As part of its BSD infrastructure, the Mac OS X kernel provides a number of basic support macros to simplify handling of linked lists and queues. These are implemented as C macros, and assume a standard C struct. As such, they are probably not suited for writing code in C++.

The basic types of lists and queues included are:

SLIST is ideal for creating stacks or for handling large sets of data with few or no removals. Arbitrary removal, however, requires an O(n) traversal of the list.

STAILQ is similar to SLIST except that it maintains pointers to both ends of the queue. This makes it ideal for simple FIFO queues by adding entries at the tail and fetching entries from the head. Like SLIST, it is inefficient to remove arbitrary elements.

LIST is a doubly linked version of SLIST. The extra pointers require additional space, but allow O(1) (constant time) removal of arbitrary elements and bidirectional traversal.

TAILQ is a doubly linked version of STAILQ. Like LIST, the extra pointers require additional space, but allow O(1) (constant time) removal of arbitrary elements and bidirectional traversal.

Because their functionality is relatively simple, their use is equally straightforward. These macros can be found in xnu/bsd/sys/queue.h.
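As an illustration of the macros, here is a minimal sketch of a FIFO built on TAILQ. It is written against the user-space <sys/queue.h> so that it can be compiled outside the kernel; the element type and function names are hypothetical, and kernel code would use a kernel allocator in place of malloc and free.

```c
#include <stdlib.h>
#include <sys/queue.h>

/* Hypothetical element type; the TAILQ_ENTRY field embeds the links. */
struct sketch_job {
    int id;
    TAILQ_ENTRY(sketch_job) link;
};

/* Declares struct sketch_job_queue, the head of the queue. */
TAILQ_HEAD(sketch_job_queue, sketch_job);

/* Append a job at the tail. Returns 0 on success, -1 if allocation fails. */
int sketch_enqueue(struct sketch_job_queue *q, int id)
{
    struct sketch_job *job = malloc(sizeof(*job));

    if (job == NULL)
        return -1;
    job->id = id;
    TAILQ_INSERT_TAIL(q, job, link);
    return 0;
}

/* Remove the job at the head and return its id, or -1 if the queue is empty. */
int sketch_dequeue(struct sketch_job_queue *q)
{
    struct sketch_job *job = TAILQ_FIRST(q);
    int id;

    if (job == NULL)
        return -1;
    id = job->id;
    TAILQ_REMOVE(q, job, link);   /* O(1) because the list is doubly linked */
    free(job);
    return id;
}
```

The queue head must be initialized with TAILQ_INIT before the first insertion. Entries then come out in the order they went in.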

Installing Shutdown Hooks

Although Mac OS X does not have traditional BSD-style shutdown hooks, the I/O Kit provides equivalent functionality in recent versions. Since the I/O Kit provides this functionality, you must call it from C++ code.

To register for notification, you call registerSleepWakeInterest (described in IOKit/pwr_mgt/RootDomain.h) and register for sleep notification. If the system is about to be shut down, your handler is called with the message type kIOMessageSystemWillPowerOff. If the system is about to reboot, your handler gets the message type kIOMessageSystemWillRestart. If the system is about to sleep, your handler gets the message type kIOMessageSystemWillSleep.

If you no longer need to receive notification (for example, if your KEXT gets unloaded), be certain to release the notifier with IONotifier::release to avoid a kernel panic on shutdown.

For example, the following sample KEXT registers for sleep notifications, then logs a message with IOLog when a sleep notification occurs:

#include <IOKit/IOLib.h>
#include <IOKit/pwr_mgt/RootDomain.h>
#include <IOKit/pwr_mgt/IOPM.h>
#include <IOKit/IOService.h>
#include <IOKit/IONotifier.h>
 
#define ALLOW_SLEEP 1
 
IONotifier *notifier;
 
extern "C" {
 
IOReturn mySleepHandler( void * target, void * refCon,
    UInt32 messageType, IOService * provider,
    void * messageArgument, vm_size_t argSize )
{
    IOLog("Got sleep/wake notice.  Message type was %u\n", (unsigned int)messageType);
#if ALLOW_SLEEP
    acknowledgeSleepWakeNotification(refCon);
#else
    vetoSleepWakeNotification(refCon);
#endif
    return 0;
}
 
kern_return_t sleepkext_start (kmod_info_t * ki, void * d) {
    void *myself = NULL; // Would pass the self pointer here if in a class instance

    notifier = registerPrioritySleepWakeInterest(
        &mySleepHandler, myself, NULL);
    return KERN_SUCCESS;
}
 
 
kern_return_t sleepkext_stop (kmod_info_t * ki, void * d) {
    if (notifier != NULL)   // registration may have failed
        notifier->remove();
    return KERN_SUCCESS;
}
 
} // extern "C"
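A handler that acts as a shutdown hook usually needs to tell the shutdown and restart messages apart from an ordinary sleep. The following sketch shows that dispatch in plain C. The enum values are placeholders invented for this example so that it compiles on its own, and the function name is hypothetical. A real KEXT would switch on kIOMessageSystemWillPowerOff, kIOMessageSystemWillRestart, and kIOMessageSystemWillSleep from the I/O Kit headers instead.

```c
/* Placeholder values for this sketch only; they are NOT the real
 * I/O Kit message constants. */
enum sketch_message {
    SKETCH_WILL_POWER_OFF = 1,  /* stands in for kIOMessageSystemWillPowerOff */
    SKETCH_WILL_RESTART   = 2,  /* stands in for kIOMessageSystemWillRestart */
    SKETCH_WILL_SLEEP     = 3   /* stands in for kIOMessageSystemWillSleep */
};

/*
 * Hypothetical helper: return 1 if the message means the system is
 * going down (shutdown or restart), 0 for sleep or any other message.
 */
int sketch_is_shutdown_message(unsigned int message_type)
{
    switch (message_type) {
    case SKETCH_WILL_POWER_OFF:
    case SKETCH_WILL_RESTART:
        return 1;
    case SKETCH_WILL_SLEEP:
    default:
        return 0;
    }
}
```

In the sample above, a check of this kind would go at the top of mySleepHandler, before the notification is acknowledged.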




© 2002, 2006 Apple Computer, Inc. All Rights Reserved. (Last updated: 2006-11-07)

