
Basic Performance Tips

This chapter offers practical advice for tuning your programs. It identifies common areas you should monitor with the performance tools and provides a list of fundamental tips for improving performance.

In this section:

Common Areas to Monitor
Fundamental Optimization Tips


Common Areas to Monitor

Many performance problems can be traced to specific parts of your program. As you design and implement your code, you should monitor those areas to make sure they meet the performance targets you set.

Code for Your Program’s Key Tasks

As you design your program, consider the tasks or workflows that users will encounter the most. During your implementation phase, be sure to monitor the code for those tasks and make sure their performance does not drop below acceptable levels. If it does, you should take immediate actions to correct the problems.

The key tasks performed by a program vary from program to program. For example, a word processor might need to be fast during text input and display, while a file utility program would need to be fast at scanning the files and directories on a hard disk. It is up to you to decide which tasks your users are most likely to perform.
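
One straightforward way to keep an eye on such targets during development is to time the operation directly. The following sketch uses the Mach timing routines to measure a hypothetical ProcessDocument function, a stand-in for whatever your key task happens to be:

    #include <stdio.h>
    #include <stdint.h>
    #include <mach/mach_time.h>

    /* Stand-in for one of your program's key tasks; replace with real work. */
    static void ProcessDocument(void)
    {
        volatile double x = 0;
        int i;
        for (i = 0; i < 1000000; i++)
            x += i * 0.5;
    }

    int main(void)
    {
        mach_timebase_info_data_t timebase;
        uint64_t start, end;
        double milliseconds;

        mach_timebase_info(&timebase);

        start = mach_absolute_time();
        ProcessDocument();
        end = mach_absolute_time();

        /* Convert Mach time units to nanoseconds, then to milliseconds. */
        milliseconds = (double)(end - start) * timebase.numer / timebase.denom / 1.0e6;
        printf("ProcessDocument took %.2f ms\n", milliseconds);
        return 0;
    }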

For information on how to identify and fix slow operations in your program, see Code Speed Performance Guidelines.

Drawing Code

Most programs do some amount of drawing. If your program uses only standard windows and controls, you probably do not need to worry too much about drawing performance. However, if you do any custom drawing, you need to monitor your drawing code, make sure it is performing at acceptable levels, and investigate ways to optimize it when it is not.
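
One common technique for custom drawing code is to confine your work to the rectangle that actually needs updating. The sketch below assumes a hypothetical DrawContent function that receives a Quartz context and the dirty rectangle from your view's drawing callback:

    #include <ApplicationServices/ApplicationServices.h>

    /* Redraw only the portion of the view that actually changed. The context
       and dirty rectangle are assumed to come from your window system's
       drawing callback; the items array is hypothetical application content. */
    static void DrawContent(CGContextRef context, CGRect dirtyRect,
                            const CGRect *items, size_t itemCount)
    {
        size_t i;

        /* Restrict all drawing to the area that needs updating. */
        CGContextClipToRect(context, dirtyRect);

        for (i = 0; i < itemCount; i++) {
            /* Skip items that do not intersect the dirty area at all. */
            if (!CGRectIntersectsRect(items[i], dirtyRect))
                continue;
            CGContextSetRGBFillColor(context, 0.2, 0.4, 0.8, 1.0);
            CGContextFillRect(context, items[i]);
        }
    }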

For information on how to optimize drawing performance, see Drawing Performance Guidelines.

Launch Time Initialization Code

Launch time is the time when you initialize your program’s data structures and prepare to receive user input. However, many programs do much more work at launch time than is necessary. In many cases, tasks performed at launch time can be deferred until after the application has started processing user events. This deferral gives the user the perception that your application is fast, which is a good first impression to make.
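One possible way to implement this kind of deferral in a Core Foundation-based application is to schedule a one-shot run-loop timer at launch; the DeferredStartupTasks and ScheduleDeferredStartup functions below are hypothetical placeholders for your own setup code:

    #include <CoreFoundation/CoreFoundation.h>

    /* Noncritical setup you have chosen to postpone (hypothetical):
       building caches, scanning optional resources, and so on. */
    static void DeferredStartupTasks(CFRunLoopTimerRef timer, void *info)
    {
        /* ... perform deferred initialization here ... */
    }

    /* Call this from the main thread during launch. The deferred work runs
       shortly after the application starts processing user events. */
    static void ScheduleDeferredStartup(void)
    {
        CFRunLoopTimerRef timer = CFRunLoopTimerCreate(
            kCFAllocatorDefault,
            CFAbsoluteTimeGetCurrent() + 2.0,   /* fire date: 2 seconds from now */
            0,                                  /* interval: 0 means one-shot */
            0, 0,                               /* flags, order */
            DeferredStartupTasks, NULL);
        CFRunLoopAddTimer(CFRunLoopGetCurrent(), timer, kCFRunLoopDefaultMode);
        CFRelease(timer);
    }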

For applications that need to run in Mac OS X version 10.3.3 and earlier, another way to improve launch times is to prebind your application. Prebinding involves precalculating library address ranges and storing those values in your application binary. This step eliminates the need for the dynamic loader (dyld) to calculate those address ranges at launch time. Improvements in dyld for Mac OS X version 10.3.4 make prebinding largely unnecessary in that and later releases.

For information on how to improve launch-time performance, see Launch Time Performance Guidelines.

File Access Code

The file system is a bottleneck for getting information into memory and the CPU. In the time it takes to access a file, tens of millions of instructions may be executed. It is therefore imperative that you examine the way your program uses files to be sure that the files you use are needed and are used properly.

Minimizing the number of files you use is one way to improve file-related performance. When you must access files, do so judiciously, reading the data you need in as few calls as possible.
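
As a simple illustration, the following sketch reads an entire file with one large read call rather than many small ones; ReadWholeFile is a hypothetical helper and assumes the file fits comfortably in memory:

    #include <stdio.h>
    #include <stdlib.h>

    /* Read an entire file into memory with a single large read rather than
       many small ones. The caller frees the returned buffer. */
    static char *ReadWholeFile(const char *path, size_t *outLength)
    {
        FILE *fp = fopen(path, "rb");
        char *buffer = NULL;
        long  length;

        if (fp == NULL)
            return NULL;

        fseek(fp, 0, SEEK_END);
        length = ftell(fp);
        rewind(fp);

        if (length > 0) {
            buffer = malloc((size_t)length);
            if (buffer != NULL && fread(buffer, 1, (size_t)length, fp) != (size_t)length) {
                free(buffer);
                buffer = NULL;
            }
        }
        fclose(fp);

        if (buffer != NULL && outLength != NULL)
            *outLength = (size_t)length;
        return buffer;
    }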

For information on how to identify and fix file-related performance problems, see File-System Performance Guidelines.

Application Footprint

The size of your code can have a tremendous effect on system performance. The more memory pages used by your program, the fewer there are available for the system and other programs. This memory pressure can eventually lead to paging and an overall system slowdown.

Managing your code footprint is all about organizing your code and data structures. You need to make sure you have the right pieces in memory and that you are not causing any memory pages to be read or written unnecessarily.

For information on how to find and fix code footprint problems, see Code Size Performance Guidelines.

Memory Allocation Code

Programs allocate memory for storing both permanent and temporary data structures. Each memory allocation has a cost associated with it, both in CPU time and in memory consumption. Understanding when your program allocates memory and how that memory is used can help you reduce both of those costs.

Understanding your program’s memory usage can help determine ways to reduce that usage. You can find out if autoreleased Objective-C objects are being deallocated before they cause too much paging. You can find memory leaks caused by bugs in your code. You can also watch the number of times you call malloc, which might point out places where you can reuse existing memory blocks rather than create new ones.

One important rule to follow when allocating memory is to be lazy. Defer memory allocations until you actually need the memory. For some additional ways you can be lazy with memory allocations, see “Be Lazy.”
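
One way to combine laziness with block reuse is a scratch buffer that is not allocated until the first time it is needed and is then reused, growing only when necessary. The GetScratchBuffer function below is a hypothetical sketch (and is not thread safe as written; protect it with a lock if multiple threads use it, as described later in “Thread Your Program”):

    #include <stdlib.h>

    /* A lazily allocated, reusable scratch buffer. No memory is allocated
       until the first call, and subsequent calls reuse the same block,
       growing it only when a larger size is requested. */
    static void *GetScratchBuffer(size_t neededSize)
    {
        static void  *buffer = NULL;
        static size_t bufferSize = 0;

        if (neededSize > bufferSize) {
            void *newBuffer = realloc(buffer, neededSize);
            if (newBuffer == NULL)
                return NULL;          /* the original buffer is still valid */
            buffer = newBuffer;
            bufferSize = neededSize;
        }
        return buffer;
    }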

For information about optimizing your memory allocation patterns, see Memory Usage Performance Guidelines.

Fundamental Optimization Tips

Before you begin implementing a new program, there are several performance enhancements you should consider adding. Although you might not be able to take advantage of all of these enhancements in every case, you should at least consider them during your design phase.

Use Event-Based Handlers

All modern Mac OS X applications should use the Carbon Event Manager or another event-based model for responding to system events. The old way of retrieving events by polling the system is highly inefficient; in fact, when there are no events to process, polling code is a 100 percent waste of CPU time. A modern event-based API lets the system deliver events to your application only when they occur, so your application uses no CPU time while it waits.

The Cocoa framework incorporates Carbon Event Manager calls into its classes and methods to implement an event-driven model for you. Applications written in Cocoa automatically take advantage of this behavior and require no additional modifications. Carbon applications must support the Carbon Event Manager calls explicitly.
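
For a Carbon application, adopting the event-based model amounts to installing handlers and letting the system call you. The following is a rough sketch with a hypothetical command handler; a real application would install handlers for whichever event classes it cares about:

    #include <Carbon/Carbon.h>

    /* Hypothetical handler: called only when a command event actually arrives. */
    static OSStatus MyCommandHandler(EventHandlerCallRef nextHandler,
                                     EventRef event, void *userData)
    {
        HICommand command;
        GetEventParameter(event, kEventParamDirectObject, typeHICommand,
                          NULL, sizeof(command), NULL, &command);
        /* Dispatch on command.commandID here. */
        return eventNotHandledErr;   /* let the standard handler run too */
    }

    int main(void)
    {
        EventTypeSpec commandSpec = { kEventClassCommand, kEventCommandProcess };

        InstallApplicationEventHandler(NewEventHandlerUPP(MyCommandHandler),
                                       1, &commandSpec, NULL, NULL);

        /* Blocks until the application quits; uses no CPU while waiting. */
        RunApplicationEventLoop();
        return 0;
    }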

Event-based handlers are not limited to supporting user events, such as mouse and keyboard events. Each thread has its own run loop to provide on-demand responses to timers, network events, and other incoming data. Applications support run loops using either the Core Foundation (CFRunLoop) or Cocoa (NSRunLoop) interfaces.

Thread Your Program

Supporting multiple threads is a good way to improve both the perceived and actual performance of your program. On hardware containing multiple processors, a multithreaded program often has significantly better performance than a single-threaded program. By distributing tasks across all available processors, an application can perform multiple operations simultaneously. Even on a single-processor machine, the use of additional threads can provide a perceived speed boost by leaving your main thread free to handle user events.

Before you begin adding support for multiple threads, though, be sure to put some thought into how your program might use those threads effectively. Because threads require a fair amount of overhead to create, you should carefully choose which tasks you want to assign to separate threads. If all of your program’s tasks are small and performed at different times, you would probably not want to create separate threads for each one. Instead, creating a single long-lived worker thread might be more appropriate.

Another consideration with threading is how to protect your data structures. Problems can occur when multiple threads modify the same data without first checking to see if it is safe to do so. Your code needs to use locks rigorously to protect its data structures. You might also need to synchronize specific blocks of code to prevent them from being executed by multiple threads at once.
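
As a minimal illustration of lock-based protection, the following sketch uses POSIX threads, one of the threading APIs available on Mac OS X, to guard a shared counter with a mutex:

    #include <pthread.h>
    #include <stdio.h>

    /* Shared data protected by a mutex. Every thread that touches
       gTotal must hold gTotalLock while doing so. */
    static pthread_mutex_t gTotalLock = PTHREAD_MUTEX_INITIALIZER;
    static long gTotal = 0;

    static void *WorkerThread(void *arg)
    {
        long i;
        for (i = 0; i < 100000; i++) {
            pthread_mutex_lock(&gTotalLock);
            gTotal += 1;                      /* critical section */
            pthread_mutex_unlock(&gTotalLock);
        }
        return NULL;
    }

    int main(void)
    {
        pthread_t threads[4];
        int i;

        for (i = 0; i < 4; i++)
            pthread_create(&threads[i], NULL, WorkerThread, NULL);
        for (i = 0; i < 4; i++)
            pthread_join(threads[i], NULL);

        printf("total = %ld\n", gTotal);      /* always 400000 */
        return 0;
    }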

For information on how to support additional threads in your program, see Threading Programming Guide.

Use the Accelerate Framework

If your application performs a lot of mathematical computations on scalar data, you should consider using the Accelerate framework (Accelerate.framework) to accelerate those calculations. The Accelerate framework takes advantage of any available vector processing units (such as the PowerPC AltiVec extensions, also known as Velocity Engine, or the Intel x86 SSE extensions) to perform multiple calculations in parallel. By coding to the framework, you can avoid having to create separate code paths for each platform architecture. The Accelerate framework is highly tuned for all of the architectures Mac OS X supports.
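
For example, an element-by-element addition of two float arrays can be handed to the framework's vDSP routines, as in this minimal sketch:

    #include <stdio.h>
    #include <Accelerate/Accelerate.h>

    int main(void)
    {
        float a[8] = { 1, 2, 3, 4, 5, 6, 7, 8 };
        float b[8] = { 8, 7, 6, 5, 4, 3, 2, 1 };
        float result[8];
        int   i;

        /* result[i] = a[i] + b[i]; the framework uses whatever vector unit
           is available without requiring architecture-specific code paths. */
        vDSP_vadd(a, 1, b, 1, result, 1, 8);

        for (i = 0; i < 8; i++)
            printf("%.0f ", result[i]);
        printf("\n");
        return 0;
    }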

Tools such as Shark can help point out portions of your program that might benefit from using the Accelerate framework. For more information about Shark and other tools, see “Performance Tools.”

Be Lazy

A very simple way to improve performance is to make sure your application does not perform any unnecessary work. Each moment of an application’s time should be spent responding to the user’s current request, not predicting future requests. If you do not need a resource right away, such as a nib file containing a preferences window, don’t load it. Such an action takes time to execute because it accesses the file system, and if the user never opens that preferences window, the process of loading its nib file is a waste of time.

The basic rule is to wait until the user requests something from your application, and then use the necessary resources to fulfill that request. You should cache data only in situations where there is a measurable performance benefit. Preloading caches on the assumption that the rest of the application will run faster can actually degrade performance in low-memory situations. In such a situation, your cached data may be paged to disk before it can be used. Thus, any savings you gained by caching the data turn into a loss, because that data must now be read from disk twice before it is ever used. If you really want to cache data, wait until a given operation has been performed once before you cache any data from it.
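
A hypothetical sketch of this approach in C: build an expensive table only when some code path actually asks for it, and cache the result only after that first use (BuildTable and GetTable are placeholders for your own code, and the sketch is not thread safe as written):

    #include <stdlib.h>

    /* Hypothetical expensive setup, such as parsing a large data file. */
    static double *BuildTable(void)
    {
        double *table = malloc(1024 * sizeof(double));
        if (table != NULL) {
            int i;
            for (i = 0; i < 1024; i++)
                table[i] = i * i * 0.001;   /* stand-in for real work */
        }
        return table;
    }

    /* The table is built only if some code path actually asks for it,
       and the result is cached only after that first use. */
    static const double *GetTable(void)
    {
        static double *cachedTable = NULL;
        if (cachedTable == NULL)
            cachedTable = BuildTable();
        return cachedTable;
    }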

The same lazy approach applies in many other areas of your program as well, such as deferring memory allocations in the manner described in “Memory Allocation Code.”

Take Advantage of Perceived Performance

The perception of performance is just as effective as actual performance in many cases. Many program tasks can be performed in the background, on a separate thread, or at idle time. Doing this makes the program interface feel more responsive to the user. Of course, creating the perception of performance does not work in every case. For example, the perception may be lost if the data being processed in the background is needed by the user immediately.

As you design your program, think about which tasks can be moved to the background effectively. For example, if your program needs to scan a number of files, do so on a background thread. Similarly, if you need to perform lengthy calculations, do them in the background so that the user can continue to manipulate your program’s user interface.

Another way to improve perceived performance is to make sure your application launches quickly. At launch time, defer any tasks that do not contribute to the immediate presentation of your application interface. For example, defer the creation of large data structures you do not need immediately until after your application has finished launching. You should also avoid loading plug-ins until the moment their code is actually needed.
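
One possible way to defer plug-in loading is to open the plug-in with dlopen only when its feature is first invoked; the path and entry-point name in this sketch are hypothetical:

    #include <dlfcn.h>
    #include <stdio.h>

    typedef void (*PluginEntryPoint)(void);

    /* Load a plug-in only when its feature is first used.
       The path and symbol name here are hypothetical. */
    static void RunExportFeature(void)
    {
        static void *pluginHandle = NULL;
        PluginEntryPoint entry;

        if (pluginHandle == NULL) {
            pluginHandle = dlopen("/Library/Application Support/MyApp/Export.plugin",
                                  RTLD_LAZY);
            if (pluginHandle == NULL) {
                fprintf(stderr, "could not load plug-in: %s\n", dlerror());
                return;
            }
        }

        entry = (PluginEntryPoint)dlsym(pluginHandle, "RunExport");
        if (entry != NULL)
            entry();
    }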

Use the Mach-O Binary Format

If you have a Carbon application that is based on the Code Fragment Manager Preferred Executable Format (PEF), you should consider switching to the Mach-O executable format. Foremost among the reasons is that Mach-O is the native executable format of Mac OS X and is designed and optimized for use with its virtual memory system.

Although Mach-O is not supported in Mac OS 9, using Mach-O does not require you to abandon Mac OS 9 as a delivery platform. You can build an application package that runs a PEF binary in Mac OS 9 and a Mach-O binary in Mac OS X. This allows you to optimize your executable for each operating system that you wish to support. For more information, see Bundle Programming Guide.

For an overview of the Mach-O format and how you can take advantage of that format for performance tuning, see “Overview of the Mach-O Executable Format” in Code Size Performance Guidelines.






