Basics of Using QTKit Capture

The QuickTime Kit framework was developed by Apple to support the most common media-related tasks of Cocoa and QuickTime developers. It does so by using abstractions and data types familiar to Cocoa programmers and by defining new abstractions and data types only where necessary. The goal was to provide high-level Cocoa interfaces for playing, editing, importing, and exporting various types of media. In response to the growing needs of the developer community, new methods have been added with each release to the five base classes that make up the framework.

Now, with the introduction of Mac OS X v10.5 and the latest iteration of QuickTime 7, the QuickTime Kit framework has made a major leap forward, providing support for capturing media from external sources, such as cameras and microphones, and outputting that media to QuickTime movies. Fifteen new classes have been added to the five classes from the first iteration of the framework. The goal is to provide Cocoa and QuickTime developers with a viable and robust alternative to the procedural C sequence grabber API, which allowed applications to obtain digitized data from external sources, such as video boards. Using the QTKit capture API is now the preferred way of developing applications that support capture and recording of media.

This chapter describes at a basic level the QTKit capture architecture and implementation available in Mac OS X v10.5. You’ll gain an understanding of how you can capture, record, and output to various destinations by reading this chapter. Recording from an iSight camera or another DV device to a QuickTime file, for example, is one of the most common uses for the QTKit capture API.

To take advantage of this new API, read this chapter before moving on to the following chapters, which describe how to build a QTKit capture player application.

In this section:

Tasks Supported by QTKit Capture Classes
How QTKit Capture Works
Using the QTKit Capture API
Base Classes
Input and Output Classes
Utility Classes
Device Access and User Interface Classes

Tasks Supported by QTKit Capture Classes

As a high-level Objective-C framework, the QuickTime Kit is built on top of a number of other Mac OS X graphics and imaging technologies, including QuickTime, Core Image, Core Audio, Core Animation, Quartz 2D, and OpenGL. This means much of the work involved in dealing with the processing of video, audio, and image media in your application is already provided for you by the underlying Mac OS X graphics and imaging engines, thereby reducing the code you need to write, as well as the code overhead required in your Xcode projects.

The new capture classes and methods available in the QuickTime Kit provide frame-accurate audio/video synchronization and frame-accurate capture, meaning you can specify precisely, with timecodes, when you want capturing to occur. You also have access to the transport controls of your camcorder, so you can fast forward and rewind the tape.

Using these classes and methods, you can capture media from one or more external sources, including:

Cameras
Microphones
Other external media devices, such as capture cards and tape decks

The devices supported in Mac OS X v10.5 include:

VDC over USB, including the built-in iSight camera
IIDC over FireWire, including the external iSight camera
HDV devices, with Final Cut Pro (and the appropriate codecs installed)
Core Audio HAL devices

Note that DV devices and Pro DV formats, such as DVCPro HD, require Final Cut Pro.

Important: With the introduction of the new, robust QTKit capture API, QuickTime developers are encouraged to move their development efforts away from usage of the component-based Sequence Grabber API.

After you’ve captured this media, you can record it to one or more output destinations, including but not necessarily limited to the following:

A QuickTime movie (.mov) file
A Cocoa view that previews video media captured from the input sources

You can also send captured media to other destinations for use in custom-built applications and custom processing. This functionality is provided by the QTCaptureDecompressedVideoOutput and QTCaptureVideoPreviewOutput classes.

The next section discusses the types of capture objects you’ll use in working with the QuickTime Kit framework. A basic understanding of these objects is important in building your QTKit capture player application.

How QTKit Capture Works

All QTKit capture applications make use of three basic types of objects: capture inputs, capture outputs, and a capture session. Capture inputs, which are subclasses of QTCaptureInput, provide the necessary interfaces to different sources of captured media.

A capture device input is a QTCaptureDeviceInput object (a subclass of QTCaptureInput) that provides an interface for capturing from various audio/video hardware, such as cameras and microphones. Capture outputs, which are subclasses of QTCaptureOutput, provide the necessary interfaces to various destinations for media, such as QuickTime movie files, or video and audio previews.

A capture session, which is a QTCaptureSession object, manages how media that is captured from connected input sources is distributed to connected output destinations. Each input and output has one or more connections, each of which represents a media stream of a certain QuickTime media type, such as video or audio media. A capture session will attempt to connect all input connections to each of its outputs.

As shown in Figure 1-1, a capture session works by connecting inputs to outputs in order to record and preview video from a camera.

Figure 1-1 Connecting inputs to outputs in a capture session

A capture session works by distributing the video from its single video input connection to a connection owned by each output. In addition to distributing separate media streams to each output, the capture session is also responsible for mixing the audio from multiple inputs down to a single stream.

Figure 1-2 shows how the capture session handles multiple audio inputs.

Figure 1-2 Handling multiple audio inputs in a capture session

As illustrated in Figure 1-2, a capture session sends all of its input video to each output that accepts video and all of its input audio to each output that accepts audio. However, before sending the separate audio streams to its outputs, it mixes them down to one stream that can be sent to a single capture connection.

A capture session is also responsible for ensuring that all media are synchronized to a single time base in order to guarantee that all output video and audio are synchronized.

The connections belonging to each input and output are QTCaptureConnection objects. These describe the media type and format of each stream taken from an input or sent to an output. By referencing a specific connection, your application can have finer-grained control over which media enters and leaves a session. Thus, you can enable and disable specific connections, and control specific attributes of the media entering (for example, the volumes of specific audio channels).
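
As a rough sketch of working with connections, the following function walks the connections of an input, logs each stream's media type and format, and disables any connection that carries sound. The function name `DisableAudioConnections` is hypothetical, and the sketch assumes the input's device has already been opened.

```objc
#import <QTKit/QTKit.h>

// Hypothetical helper (not part of QTKit): disables every sound connection
// on an input so that only its other streams enter the session.
static void DisableAudioConnections(QTCaptureInput *input)
{
    NSEnumerator *enumerator = [[input connections] objectEnumerator];
    QTCaptureConnection *connection;

    while ((connection = [enumerator nextObject])) {
        // Each connection reports the QuickTime media type of its stream.
        if ([[connection mediaType] isEqualToString:QTMediaTypeSound]) {
            [connection setEnabled:NO];
        }
        NSLog(@"%@ connection, format: %@, enabled: %d",
              [connection mediaType],
              [[connection formatDescription] localizedFormatSummary],
              (int)[connection isEnabled]);
    }
}
```

Note that some devices, such as DV cameras, may expose a single multiplexed connection rather than separate video and sound connections, in which case this sketch leaves that connection enabled.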

Using the QTKit Capture API

The QTKit capture API comprises fifteen new classes and several hundred new methods, notifications, and attributes. To better understand how you can take advantage of this API in your Cocoa or QuickTime application, read this section, which describes the various groupings of the API. The complete description of all these classes and their associated methods is available in the QTKit Framework Reference.

There are four base classes, six classes devoted to input and output, another three utility classes, one class that deals with device input, another with the user interface, and finally, a class containing constants and error messages.

Base Classes

There are four classes in this group that can best be described as base classes. Understanding how these work is essential to using the QTKit capture API.

The QTCaptureSession class provides an interface for connecting input sources to output destinations. The most commonly used method in this class is startRunning, which tells the receiver to start capturing data from its inputs and to send that data to its outputs. Because capturing carries overhead, your capture session should not be running when data does not need to be sent to file outputs, previews, or other outputs; that way, capturing does not affect the performance of your application.

The QTCaptureInput and QTCaptureOutput classes, which are both abstract classes, provide interfaces for connecting inputs and outputs. An input source can have multiple connections, which is common for cameras that have both audio and video output streams. A QTCaptureOutput object does not need a fixed number of connections, but your capture session does need a destination for all of its input data.

Class	Group	Tasks	Most commonly used methods
`QTCaptureSession`	Base	Primary interface for capturing media streams; manages connections between inputs and outputs; also manages when a capture is running.	`startRunning`, `addInput:error:`, `addOutput:error:`
`QTCaptureInput`	Base	Provides input source connections for a `QTCaptureSession`. Use subclasses of this class for inputs of a session.	`connections`
`QTCaptureOutput`	Base	Provides an interface for connecting capture output destinations, such as QuickTime files and video previews, to a `QTCaptureSession`.	`connections`
`QTCaptureConnection`	Base	Represents a connection over which a single stream of media data is sent from a `QTCaptureInput` to a `QTCaptureSession` and from a `QTCaptureSession` to a `QTCaptureOutput`.	`formatDescription`, `mediaType`, `setEnabled:`, `isEnabled`
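
The base classes in the table above might be combined as in the following minimal sketch, which opens the default video device, wraps it in a device input, adds it and a caller-supplied output to a new session, and starts the session. The function name `StartSession` is hypothetical, and the code assumes manual reference counting, as was standard on Mac OS X v10.5.

```objc
#import <QTKit/QTKit.h>

// Hypothetical helper (not part of QTKit): returns a running session that
// captures from the default video device into the given output, or nil on
// failure. The caller owns the returned session and must release it.
// Note: if no video device is present, nil is returned without setting
// *outError.
static QTCaptureSession *StartSession(QTCaptureOutput *output,
                                      NSError **outError)
{
    QTCaptureDevice *device =
        [QTCaptureDevice defaultInputDeviceWithMediaType:QTMediaTypeVideo];
    if (device == nil || ![device open:outError]) {
        return nil;
    }

    QTCaptureSession *session = [[QTCaptureSession alloc] init];
    QTCaptureDeviceInput *input =
        [[[QTCaptureDeviceInput alloc] initWithDevice:device] autorelease];

    // The session connects the input's connections to the output.
    if (![session addInput:input error:outError] ||
        ![session addOutput:output error:outError]) {
        [session release];
        [device close];
        return nil;
    }

    [session startRunning];
    return session;
}
```

When the captured data is no longer needed, sending `stopRunning` to the session avoids the capture overhead described above.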

Input and Output Classes

There are five output classes and only one input class belonging to this group.

You can use the methods available in the QTCaptureDeviceInput class to handle input sources for various media devices, such as cameras and microphones. The five output classes provide output destinations for QTCaptureSession objects that can be used to write captured media to QuickTime movies, for example, or to preview video or audio that is being captured. QTCaptureFileOutput, an abstract superclass, provides an output destination for a capture session to write captured media to files.

Class	Group	Tasks	Most commonly used methods
`QTCaptureAudioPreviewOutput`	Input/Output	Represents an output destination for a `QTCaptureSession` that can be used to preview the audio being captured.	`volume`, `setVolume:`, `setOutputDeviceUniqueID:`
`QTCaptureDecompressedVideoOutput`	Input/Output	Represents an output destination for a `QTCaptureSession` object that can be used to process decompressed frames from the video being captured.	`setDelegate:`, `captureOutput:didOutputVideoFrame:withSampleBuffer:fromConnection:`
`QTCaptureDeviceInput`	Input/Output	Represents the input source for media devices, such as cameras and microphones, to a `QTCaptureSession`.	`initWithDevice:`; returns an instance of `QTCaptureDeviceInput` associated with the given device.
`QTCaptureFileOutput`	Input/Output	Writes captured media to files and defines the interface for outputs that record media samples to files.	`recordToOutputFileURL:`, `setDelegate:`, `captureOutput:didFinishRecordingToOutputFileAtURL:forConnections:dueToError:`
`QTCaptureMovieFileOutput`	Input/Output	Represents an output destination for a `QTCaptureSession` that writes captured media to QuickTime movie files.	`recordToOutputFileURL:`, `setDelegate:`, `captureOutput:didFinishRecordingToOutputFileAtURL:forConnections:dueToError:`
`QTCaptureVideoPreviewOutput`	Input/Output	Represents an output destination for a `QTCaptureSession` that can be used to preview the video being captured.	`visualContextForConnection:`, `setDelegate:`, `captureOutput:didOutputVideoFrame:withSampleBuffer:fromConnection:`
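
To illustrate the file output methods listed above, here is a minimal sketch of a controller that starts and stops recording and receives the completion callback. The class name `RecorderController` and its methods are hypothetical; the sketch assumes the movie file output has already been added to a running session and that manual reference counting is in use.

```objc
#import <QTKit/QTKit.h>

// Hypothetical controller (not part of QTKit) that drives a movie file
// output already attached to a running QTCaptureSession.
@interface RecorderController : NSObject {
    QTCaptureMovieFileOutput *movieFileOutput;
}
- (id)initWithMovieFileOutput:(QTCaptureMovieFileOutput *)output;
- (void)startRecordingToPath:(NSString *)path;
- (void)stopRecording;
@end

@implementation RecorderController

- (id)initWithMovieFileOutput:(QTCaptureMovieFileOutput *)output
{
    self = [super init];
    if (self != nil) {
        movieFileOutput = [output retain];
        [movieFileOutput setDelegate:self];
    }
    return self;
}

- (void)dealloc
{
    [movieFileOutput setDelegate:nil];
    [movieFileOutput release];
    [super dealloc];
}

- (void)startRecordingToPath:(NSString *)path
{
    [movieFileOutput recordToOutputFileURL:[NSURL fileURLWithPath:path]];
}

- (void)stopRecording
{
    // Passing nil stops recording to the current file.
    [movieFileOutput recordToOutputFileURL:nil];
}

// Delegate callback sent when the output has finished writing the file.
- (void)captureOutput:(QTCaptureFileOutput *)captureOutput
    didFinishRecordingToOutputFileAtURL:(NSURL *)outputFileURL
    forConnections:(NSArray *)connections
    dueToError:(NSError *)error
{
    if (error != nil) {
        NSLog(@"Recording to %@ ended with error: %@", outputFileURL, error);
    } else {
        NSLog(@"Finished recording to %@", outputFileURL);
    }
}

@end
```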

Utility Classes

There are three classes belonging to this group: QTCompressionOptions, QTFormatDescription, and QTSampleBuffer. These are best characterized as utility classes, in that they perform tasks such as representing the compression options for particular media or describing the formats of various media samples.

You can use QTCompressionOptions to describe compression options for all kinds of different media, using the compressionOptionsIdentifiersForMediaType: and mediaType methods. Compression options are created from presets keyed by a named identifier. These preset identifiers are listed in the QTKit Framework Reference in the chapter describing this class.

Using QTSampleBuffer objects, you can get information about sample buffer data that you may need to output or process the media samples in the buffer.
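
As one possible illustration, a delegate of a QTCaptureDecompressedVideoOutput receives a QTSampleBuffer along with each decompressed frame and can query it for format and size information. The class name `FrameLogger` is hypothetical; an instance would be registered with the output's `setDelegate:` method.

```objc
#import <QTKit/QTKit.h>
#import <QuartzCore/QuartzCore.h>  // Core Video image buffer functions

// Hypothetical delegate class (not part of QTKit) that logs information
// about each decompressed video frame it receives.
@interface FrameLogger : NSObject
@end

@implementation FrameLogger

- (void)captureOutput:(QTCaptureOutput *)captureOutput
    didOutputVideoFrame:(CVImageBufferRef)videoFrame
    withSampleBuffer:(QTSampleBuffer *)sampleBuffer
    fromConnection:(QTCaptureConnection *)connection
{
    // The image buffer carries the pixel dimensions; the sample buffer
    // carries the format description and the raw data length.
    CGSize size = CVImageBufferGetEncodedSize(videoFrame);
    NSLog(@"%@ frame, %.0f x %.0f, %lu bytes",
          [[sampleBuffer formatDescription] localizedFormatSummary],
          size.width, size.height,
          (unsigned long)[sampleBuffer lengthForAllSamples]);
}

@end
```

Logging every frame is only useful for demonstration; a real delegate would typically hand the frame to its own processing code instead.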

Class	Group	Tasks	Most commonly used methods
`QTCompressionOptions`	Utility	Represents a set of compression options for a particular type of media.	`compressionOptionsIdentifiersForMediaType:`, `mediaType`
`QTFormatDescription`	Utility	Describes the media format of media samples and of media sources, such as devices and capture connections.	`localizedFormatSummary`; the constant `QTFormatDescriptionVideoCleanApertureDisplaySizeAttribute`
`QTSampleBuffer`	Utility	Provides format information, timing information, and metadata on media sample buffers.	`formatDescription`
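
The following sketch shows one way the compression presets might be inspected and applied. The first function lists the preset identifiers available for a media type; the second applies a preset to every connection of a file output whose media type matches. Both function names are hypothetical, and the identifier passed to `ApplyPreset` is assumed to be one of those returned by `compressionOptionsIdentifiersForMediaType:`.

```objc
#import <QTKit/QTKit.h>

// Hypothetical helper (not part of QTKit): logs each preset identifier
// available for the given media type, with its display name.
static void LogCompressionPresets(NSString *mediaType)
{
    NSArray *identifiers = [QTCompressionOptions
        compressionOptionsIdentifiersForMediaType:mediaType];
    NSEnumerator *enumerator = [identifiers objectEnumerator];
    NSString *identifier;

    while ((identifier = [enumerator nextObject])) {
        QTCompressionOptions *options = [QTCompressionOptions
            compressionOptionsWithIdentifier:identifier];
        NSLog(@"%@: %@", identifier, [options localizedDisplayName]);
    }
}

// Hypothetical helper (not part of QTKit): applies the preset to every
// connection of the output that carries the preset's media type.
static void ApplyPreset(QTCaptureFileOutput *output, NSString *identifier)
{
    QTCompressionOptions *options = [QTCompressionOptions
        compressionOptionsWithIdentifier:identifier];
    if (options == nil) {
        return;  // Unknown identifier; leave the output unchanged.
    }

    NSEnumerator *enumerator = [[output connections] objectEnumerator];
    QTCaptureConnection *connection;

    while ((connection = [enumerator nextObject])) {
        if ([[connection mediaType] isEqualToString:[options mediaType]]) {
            [output setCompressionOptions:options forConnection:connection];
        }
    }
}
```

For example, `LogCompressionPresets(QTMediaTypeVideo)` would list the video presets available on the current system.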

Device Access and User Interface Classes

There are two classes in this particular group: QTCaptureDevice and QTCaptureView.

If you’re working with QTCaptureDevice objects, your application can read any number of extended attributes available to this class, using the deviceAttributes and attributeForKey: methods. Beyond that, you can use key-value coding to get and set attributes. If you wish to observe changes for a given attribute, you can add a key-value observer where the key path is the attribute key. Note that you cannot create instances of QTCaptureDevice directly.

You can use the methods available in the QTCaptureView class, which is a subclass of NSView, to preview video that is being processed by an instance of QTCaptureSession. The class creates and maintains its own QTCaptureVideoPreviewOutput to gather the preview video you need from the capture session.

Class	Group	Tasks	Most commonly used methods
`QTCaptureDevice`	Device Access and UI	Represents an available capture device.	`inputDevices`, `open:`, `isOpen`, `close`, `localizedDisplayName`
`QTCaptureView`	Device Access and UI	Displays a video preview of a capture session.	`setCaptureSession:`
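
As a brief sketch of device access, the function below enumerates the available input devices and logs each one's display name, the media types it supplies, and whether it is currently open. The function name `LogInputDevices` is hypothetical.

```objc
#import <QTKit/QTKit.h>

// Hypothetical helper (not part of QTKit): logs every capture device
// currently available to the application.
static void LogInputDevices(void)
{
    // Devices are obtained from the class; they are never created directly.
    NSEnumerator *enumerator =
        [[QTCaptureDevice inputDevices] objectEnumerator];
    QTCaptureDevice *device;

    while ((device = [enumerator nextObject])) {
        NSLog(@"%@ (video: %d, sound: %d, open: %d)",
              [device localizedDisplayName],
              (int)[device hasMediaType:QTMediaTypeVideo],
              (int)[device hasMediaType:QTMediaTypeSound],
              (int)[device isOpen]);
    }
}
```

To preview a session, a QTCaptureView placed in a window can be connected with a single call such as `[captureView setCaptureSession:session]`, where `captureView` and `session` are assumed to be an existing view and session.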
