This chapter provides background information important for understanding and using Apple’s Core Audio Format (CAF) files.
CAF File Advantages
CAF File Structure
Types of Chunks
Apple’s Core Audio Format is a flexible, state-of-the-art file format for storing and manipulating digital audio data. It is fully supported by Core Audio APIs on Mac OS X v10.4 and later and on Mac OS X v10.3 with QuickTime 7 or later. CAF provides high performance and flexibility, and is scalable to future ultra-high resolution audio recording, editing, and playback.
CAF files have several advantages over other standard audio file formats:
Unrestricted file size
Whereas AIFF, AIFF-C, and WAV files are limited in size to 4 gigabytes, which might represent as little as 15 minutes of audio, CAF files use 64-bit file offsets, eliminating practical limits. A standard CAF file can hold audio data with a playback duration of hundreds of years.
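The 15-minute figure can be checked with a little arithmetic. The following is a minimal sketch, assuming uncompressed PCM and a hypothetical high-resolution session (8 channels, 24-bit samples, 192 kHz); the function name and the session parameters are illustrative and do not come from the CAF specification.

```python
def max_duration_seconds(size_limit_bytes, sample_rate, channels, bytes_per_sample):
    """Seconds of uncompressed PCM that fit within size_limit_bytes."""
    bytes_per_second = sample_rate * channels * bytes_per_sample
    return size_limit_bytes / bytes_per_second

# Ceiling imposed by a 32-bit size field (4 gigabytes).
LIMIT_32_BIT = 2**32

# Hypothetical session: 8 channels, 3 bytes (24 bits) per sample, 192 kHz.
minutes = max_duration_seconds(LIMIT_32_BIT, 192_000, 8, 3) / 60
print(f"{minutes:.1f} minutes")  # prints "15.5 minutes"
```

At lower resolutions the same limit allows far more audio, which is why the 4-gigabyte ceiling matters mainly for multichannel, high-sample-rate work.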
Safe and efficient recording
Applications writing AIFF and WAV files must either update the data header’s size field at the end of recording—which can result in an unusable file if recording is interrupted before the header is finalized—or they must update the size field after recording each packet of data, which is inefficient. With CAF files, in contrast, an application can append new audio data to the end of the file in a manner that allows it to determine the amount of data even if the size field in the header has not been finalized.
Support for many data formats
CAF files serve as wrappers for a wide variety of audio data formats. The flexibility of the CAF file structure and the many types of metadata that can be recorded enable CAF files to be used with practically any type of audio data. Furthermore, CAF files can store any number of audio channels.
Support for many types of auxiliary data
In addition to audio data, CAF files can store text annotations, markers, channel layouts, and many other types of information that can help in the interpretation, analysis, or editing of the audio.
Support for data dependencies
Certain metadata in CAF files is linked to the audio data by an edit count value. You can use this value to determine when metadata has a dependency on the audio data and, furthermore, when the audio data has changed since the metadata was written.
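One way to picture the edit count is as a version stamp copied from the audio data into each dependent piece of metadata. The sketch below assumes a simple equality rule and uses hypothetical class and field names; the exact rules for each chunk type are given in the specification, not here.

```python
from dataclasses import dataclass

@dataclass
class AudioData:
    # Assumed to be incremented each time the audio data is modified.
    edit_count: int

@dataclass
class DependentMetadata:
    name: str
    # Edit count of the audio data at the time this metadata was written.
    edit_count: int

def is_stale(metadata, audio):
    """True if the audio data has changed since the metadata was written."""
    return metadata.edit_count != audio.edit_count

audio = AudioData(edit_count=3)
overview = DependentMetadata("overview", edit_count=2)
peaks = DependentMetadata("peaks", edit_count=3)
print(is_stale(overview, audio), is_stale(peaks, audio))  # prints "True False"
```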
CAF files begin with a file header, which identifies the file type and the CAF version, followed by a series of chunks. A chunk consists of a header, which defines the type of the chunk and indicates the size of its data section, followed by the chunk data. The nature and format of the data is specific to each type of chunk.
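As an illustration, the file header can be read with a few lines of Python. This is a sketch, not a reference implementation: the layout assumed here (a four-character file type of "caff", a 16-bit version, and a 16-bit flags field, all big-endian) is taken from the full CAF specification rather than from this chapter, and the function name is hypothetical.

```python
import struct

# Assumed layout: 4-byte file type, 16-bit version, 16-bit flags, big-endian.
FILE_HEADER = struct.Struct(">4sHH")

def parse_file_header(raw):
    """Return (version, flags), or raise ValueError if this is not a CAF file."""
    file_type, version, flags = FILE_HEADER.unpack_from(raw)
    if file_type != b"caff":
        raise ValueError("not a CAF file")
    return version, flags

# Build a header in memory and read it back.
header = FILE_HEADER.pack(b"caff", 1, 0)
print(parse_file_header(header))  # prints "(1, 0)"
```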
The only two chunk types required for every CAF file are the Audio Data chunk (which, as you might have guessed, contains the audio data) and the Audio Description chunk, which specifies the audio data format.
The Audio Description chunk must be the first chunk following the file header. The Audio Data chunk can appear anywhere else in the file, unless the size of its data section has not been determined. In that case, the size field in the Audio Data chunk header is set to -1 and the Audio Data chunk must come last in the file so that the end of the Audio Data chunk is the same as the end of the file. This placement allows you to determine the data section size when that information is not available in the size field.
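The placement rule translates into a simple calculation. The sketch below uses a hypothetical helper that takes the size field from the Audio Data chunk header, the byte offset at which the chunk's data section begins, and the total file size; all three parameter names are illustrative.

```python
def audio_data_section_size(size_field, data_section_offset, file_size):
    """Return the size in bytes of the Audio Data chunk's data section."""
    if size_field == -1:
        # Size not finalized: the chunk is last, so it runs to the end of the file.
        return file_size - data_section_offset
    return size_field

# Recording interrupted before the header was finalized:
print(audio_data_section_size(-1, 64, 10_000))     # prints "9936"
# Header finalized normally:
print(audio_data_section_size(4_096, 64, 10_000))  # prints "4096"
```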
Audio is stored in the Audio Data chunk as a sequential series of packets. An audio packet in a CAF file contains one or more frames of audio data.
CAF supports a wide range of other chunk types, which can be placed in any order in the file except first (reserved for the Audio Description chunk) or last (when the Audio Data chunk size field is set to -1). Some chunk types can be used more than once in a file. Some refer to—or are referred to by—chunks of other types.
Every chunk consists of a chunk header followed by a data section. Chunk headers contain two fields:
A four-character code indicating the chunk’s type
A number indicating the chunk size in bytes
The format of the data in a chunk depends on the chunk type. It consists of a series of sections, typically called fields. The format of the audio data depends on the data type. All of the other fields in a CAF file are in big-endian (network) byte order.
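The two-field chunk header makes it possible to walk a file without understanding every chunk type. The following sketch assumes that the size field is a signed 64-bit integer (consistent with the 64-bit offsets and the -1 convention described earlier) and that the stream is already positioned just past the file header. The helper names are hypothetical.

```python
import io
import struct

# Assumed layout: four-character code, then a signed 64-bit size, big-endian.
CHUNK_HEADER = struct.Struct(">4sq")

def iter_chunks(stream):
    """Yield (type, size, data_offset) for each chunk in the stream."""
    while True:
        raw = stream.read(CHUNK_HEADER.size)
        if len(raw) < CHUNK_HEADER.size:
            return  # end of file
        chunk_type, size = CHUNK_HEADER.unpack(raw)
        data_offset = stream.tell()
        yield chunk_type.decode("ascii"), size, data_offset
        if size < 0:
            return  # size unknown: this chunk runs to the end of the file
        stream.seek(size, io.SEEK_CUR)  # skip the data section

def make_chunk(code, payload):
    """Build a chunk in memory for demonstration purposes."""
    return CHUNK_HEADER.pack(code, len(payload)) + payload

body = make_chunk(b"desc", bytes(32)) + make_chunk(b"free", bytes(8))
chunks = list(iter_chunks(io.BytesIO(body)))
print(chunks)  # prints "[('desc', 32, 12), ('free', 8, 56)]"
```

Because the walker relies only on the header, it skips chunk types it does not recognize, which is what lets new chunk types be added without breaking existing readers.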
In order to understand this specification, it is important to understand the definitions of the following four terms:
Sample
One number for one channel of digitized audio data.
Frame
A set of samples representing one sample for each channel. The samples in a frame are intended to be played together (that is, simultaneously). Note that this definition might be different from the use of the term “frame” by codecs, video files, and audio or video processing applications.
Packet
The smallest, indivisible block of data. For linear PCM (pulse-code modulated) data, each packet contains exactly one frame. For compressed audio data formats, the number of frames in a packet depends on the encoding. For example, a packet of AAC represents 1024 frames of PCM. In some formats, the number of frames per packet varies.
Sample rate
The number of complete frames of samples per second of noncompressed or decompressed data.
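These four terms are related by simple arithmetic. The sketch below assumes a constant number of frames per packet and ignores any priming or trailing frames that a codec might add; the function names are illustrative.

```python
def pcm_bytes_per_frame(channels, bytes_per_sample):
    """A frame holds one sample for each channel."""
    return channels * bytes_per_sample

def duration_seconds(packet_count, frames_per_packet, sample_rate):
    """Playback time for a run of packets with a fixed frame count."""
    return packet_count * frames_per_packet / sample_rate

# 16-bit stereo linear PCM: 2 channels x 2 bytes, one frame per packet.
print(pcm_bytes_per_frame(2, 2))                      # prints "4"

# 431 AAC packets of 1024 frames each at 44.1 kHz: about 10 seconds.
print(round(duration_seconds(431, 1024, 44_100), 2))  # prints "10.01"
```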
This section briefly introduces the types of chunks defined in the CAF specification. All CAF chunk types are fully described in “Core Audio Format Specification.”
Every CAF file must include the following chunks:
Audio Description chunk, which describes the audio data format for the file. This chunk must follow immediately after the CAF file header. See “Audio Description Chunk.”
Audio Data chunk, containing the audio data for the file. If the data chunk’s size isn’t known, it must be the final chunk in the file. If this chunk’s header specifies the size, the chunk can appear anywhere after the Audio Description chunk. See “Audio Data Chunk.”
If the audio packets vary in size, the file must have a Packet Table chunk, which records the size of each packet. See “Packet Table Chunk.”
There is one chunk that is required for all CAF files with more than two channels:
Channel Layout chunk, which describes the role of each channel in the file. This chunk is optional for one- and two-channel files. See “Channel Layout Chunk.”
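The requirements listed so far can be expressed as a small validation routine. This is a sketch under stated assumptions: the four-character codes ("desc", "data", "pakt", "chan") come from the full CAF specification rather than from this chapter, and the function and its parameters are hypothetical.

```python
def find_chunk_problems(chunk_types, channels, variable_packet_size):
    """Check an ordered list of chunk type codes against the basic CAF rules."""
    problems = []
    if not chunk_types or chunk_types[0] != "desc":
        problems.append("Audio Description chunk must come first")
    if "data" not in chunk_types:
        problems.append("Audio Data chunk is missing")
    if variable_packet_size and "pakt" not in chunk_types:
        problems.append("Packet Table chunk is required for variable-size packets")
    if channels > 2 and "chan" not in chunk_types:
        problems.append("Channel Layout chunk is required for more than two channels")
    return problems

# A stereo, constant-packet-size file needs only the two mandatory chunks.
print(find_chunk_problems(["desc", "data"], channels=2, variable_packet_size=False))
# A six-channel file without a Channel Layout chunk is reported.
print(find_chunk_problems(["desc", "data"], channels=6, variable_packet_size=False))
```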
Some chunks refer to data in other, supporting chunks:
Some compressed audio data formats require additional codec-specific data in order to decode the audio data. If the audio format requires this data, the file must have a Magic Cookie chunk. See “Magic Cookie Chunk.”
Some chunks refer to text strings held in the Strings chunk. See “Strings Chunk.”
There are two chunks that you can use to place markers in the data file. These chunks share data types, described in “Marker Data Types”:
Marker chunks hold individual markers. See “Marker Chunk.”
Region chunks delineate segments of the audio data. See “Region Chunk.”
There are two chunk types that store musical information:
Instrument chunks describe aspects of the audio data needed when the audio is used by a sampler or played as an instrument. See “Instrument Chunk.”
MIDI chunks store all of the information in a standard MIDI file. See “MIDI Chunk.”
Two chunks contain data for use by audio editors:
Overview chunks contain samples of the data useful for displaying the audio at a particular resolution. A CAF file can have any number of these, one for each resolution to be displayed. See “Overview Chunk.”
Peak chunks list the peak amplitude in each channel and specify the frame in which that amplitude occurs. See “Peak Chunk.”
There are two chunk types that hold annotations to the data:
Edit Comments chunks hold time-stamped comments added when the data is edited. See “Edit Comments Chunk.”
The Information chunk contains text strings that provide information about the audio data, such as key signature, artist, and title. See “Information Chunk.”
One chunk type can be used to uniquely identify the data:
The optional Unique Material Identifier (UMID) chunk provides a unique identifier for the audio data in a CAF file. There can be at most one UMID chunk in a file. See “Unique Material Identifier Chunk.”
You can define your own chunk type to extend the CAF file specification. There is a chunk type defined for this purpose:
The User-Defined chunk provides a universally unique ID (UUID) for a new chunk type. See “User-Defined Chunk.”
Many chunk types allow you to specify a larger chunk size than is currently needed for data in order to reserve additional space. There is also a special chunk you can use to reserve extra space in the CAF file as a whole:
The Free chunk contains no data, but reserves space that you can use later. See “Free Chunk.”
© 2005, 2006 Apple Computer, Inc. All Rights Reserved. (Last updated: 2006-03-08)