On Mac OS X v10.5, Core Data provides an infrastructure to support versioning of managed object models and migration of data from one schema to another (see Core Data Model Versioning and Data Migration Programming Guide). This article describes how you can implement versioning yourself on Mac OS X v10.4.
As applications evolve over time, it is often the case that the schema changes. This article gives an overview of how you can migrate data from a store using one schema to another store using a different schema.
Versioning Issues
General Technique
Migrating Data
Development Strategies
Versioning Issues
Core Data stores are conceptually bound to the managed object model used to create them. If you change any part of a model in a way that alters the actual schema, the model becomes incompatible with (and so unable to open) the stores it previously created. For example, if you add a new entity or a new attribute to an existing entity (which does change the schema), you will not be able to open old stores; if you add a validation constraint or set a new default value for an attribute (which doesn’t change the actual schema), you will be able to open old stores. If you change your schema, you therefore need to migrate the data in existing stores to the new version.
General Technique
Migrating data from a store with one schema to a different store using a different schema is an extremely hard problem to solve in a general-purpose fashion that is both flexible and performs well. On Mac OS X v10.4, Core Data does not provide a generic mechanism to assist in this, so if you change your application's model you must migrate your data yourself. The typical steps are as follows:
1. When you save a store, put a version number in the metadata. If a store does not have a version number key, you typically treat it as version 1. (For an example of how to set the metadata in an application that uses NSPersistentDocument, see NSPersistentDocument Core Data Tutorial.)
2. Before opening a store, first retrieve the metadata and check the version number. You do not need a model (or to know the type of the store) to retrieve the metadata; you can use NSPersistentStoreCoordinator's metadataForPersistentStoreWithURL:error: method without even creating a persistence stack.
3. If the store version number is the current version, simply open the store and continue as normal.
4. If the store version number is a previous version, then:
   a. Retrieve the appropriate previous model.
   b. Initialize an "importer" Core Data stack with the previous model and the store you want to open.
   c. Initialize the new Core Data stack with the current model and, if it's available, the store you want to save to.
   d. Fetch all the objects from the old store and copy them to the new store, mapping from the old to the new schema as necessary.
A complete example is provided in the CoreRecipes sample code.
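The version stamp and version check from the steps above might look like the following minimal sketch. The key name MyStoreVersionKey, the constant MyCurrentStoreVersion, and both function names are hypothetical and chosen only for illustration; Core Data does not define them. The sketch assumes manual retain/release, as on Mac OS X v10.4.

```objc
#import <CoreData/CoreData.h>

// Hypothetical names: neither the key nor the version constant is defined by Core Data.
static NSString * const MyStoreVersionKey = @"MyStoreVersion";
static const int MyCurrentStoreVersion = 2;

// Stamps an open store with the current version. Call this before saving.
static void MyStampStoreVersion(NSPersistentStoreCoordinator *coordinator, id store)
{
    // Start from the existing metadata so that other keys are preserved.
    NSMutableDictionary *metadata =
        [[[coordinator metadataForPersistentStore:store] mutableCopy] autorelease];
    if (metadata == nil) {
        metadata = [NSMutableDictionary dictionary];
    }
    [metadata setObject:[NSNumber numberWithInt:MyCurrentStoreVersion]
                 forKey:MyStoreVersionKey];
    [coordinator setMetadata:metadata forPersistentStore:store];
}

// Reads the version of the store at url without creating a persistence stack.
// Returns 0 if the metadata cannot be read (see *error); a store that has no
// version key is treated as version 1.
static int MyVersionOfStoreAtURL(NSURL *url, NSError **error)
{
    NSDictionary *metadata =
        [NSPersistentStoreCoordinator metadataForPersistentStoreWithURL:url
                                                                  error:error];
    if (metadata == nil) {
        return 0;
    }
    NSNumber *version = [metadata objectForKey:MyStoreVersionKey];
    return (version != nil) ? [version intValue] : 1;
}
```

A caller would compare the result of MyVersionOfStoreAtURL with MyCurrentStoreVersion: if they are equal it opens the store as usual, and if the stored version is lower it runs the migration path.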
Migrating Data
The majority of the work is involved in mapping from the old to the new schema (step 4(d) above). Where appropriate, you need to map from instances of an old entity to a new entity and from old attributes to new (setting default values where necessary), and you need to ensure that relationships are properly maintained.
In the simplest case—if your data set is small—you can fetch everything into memory using the original model and convert all your objects in a single pass. This approach reduces the likelihood of errors in conversion.
You iterate through your (old) model taking the entities one by one, fetching all the objects for that entity using a managed object context associated with the "importer" persistence stack.
For each managed object, you create a corresponding new object using the new model and the new persistence stack. You iterate through the object's attributes and relationships (as defined by its entity description) making copies. "Copying" a relationship requires creating a new managed object for the destination and filling in its properties.
You handle the new IDs in the new store with a dictionary, mapping the object IDs from the old store into the new object IDs.
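The single-pass conversion described above could be sketched as follows. This is an illustrative variation, not the CoreRecipes implementation: it creates all the new objects first and wires up relationships in a second loop, so every destination already exists when a relationship is copied. The dictionary maps each old object ID to the new managed object rather than to its ID, because new objects have only temporary IDs until the context is saved. The function name is hypothetical, and the sketch assumes that entities and properties keep the same names in both models; a real migration would translate names and supply default values at the marked points.

```objc
#import <CoreData/CoreData.h>

// Copies every object reachable through oldContext into newContext and saves.
// Assumption: entity and property names are identical in both models.
static BOOL MyCopyAllObjects(NSManagedObjectContext *oldContext,
                             NSManagedObjectContext *newContext,
                             NSError **error)
{
    NSManagedObjectModel *oldModel =
        [[oldContext persistentStoreCoordinator] managedObjectModel];
    // Maps NSManagedObjectID (old store) -> NSManagedObject (new store).
    NSMutableDictionary *newObjectsByOldID = [NSMutableDictionary dictionary];
    NSMutableArray *oldObjects = [NSMutableArray array];

    // First loop: create the new objects and copy attribute values.
    NSEnumerator *entityEnumerator = [[oldModel entities] objectEnumerator];
    NSEntityDescription *entity;
    while ((entity = [entityEnumerator nextObject]) != nil) {
        NSFetchRequest *request = [[[NSFetchRequest alloc] init] autorelease];
        [request setEntity:entity];
        NSArray *fetched = [oldContext executeFetchRequest:request error:error];
        if (fetched == nil) {
            return NO;
        }
        NSEnumerator *objectEnumerator = [fetched objectEnumerator];
        NSManagedObject *oldObject;
        while ((oldObject = [objectEnumerator nextObject]) != nil) {
            // A fetch on a parent entity also returns instances of its
            // subentities, so skip anything that has already been copied.
            if ([newObjectsByOldID objectForKey:[oldObject objectID]] != nil) {
                continue;
            }
            // Map old entity name -> new entity name here if they differ.
            NSManagedObject *newObject = [NSEntityDescription
                insertNewObjectForEntityForName:[[oldObject entity] name]
                         inManagedObjectContext:newContext];
            NSEnumerator *nameEnumerator =
                [[[oldObject entity] attributesByName] keyEnumerator];
            NSString *name;
            while ((name = [nameEnumerator nextObject]) != nil) {
                // Map old attribute -> new attribute and set defaults here.
                [newObject setValue:[oldObject valueForKey:name] forKey:name];
            }
            [newObjectsByOldID setObject:newObject forKey:[oldObject objectID]];
            [oldObjects addObject:oldObject];
        }
    }

    // Second loop: re-create relationships by looking up each destination.
    NSEnumerator *oldEnumerator = [oldObjects objectEnumerator];
    NSManagedObject *source;
    while ((source = [oldEnumerator nextObject]) != nil) {
        NSManagedObject *newSource =
            [newObjectsByOldID objectForKey:[source objectID]];
        NSDictionary *relationships = [[source entity] relationshipsByName];
        NSEnumerator *relationshipEnumerator = [relationships keyEnumerator];
        NSString *key;
        while ((key = [relationshipEnumerator nextObject]) != nil) {
            if ([[relationships objectForKey:key] isToMany]) {
                NSMutableSet *newDestinations = [newSource mutableSetValueForKey:key];
                NSEnumerator *destinations = [[source valueForKey:key] objectEnumerator];
                NSManagedObject *oldDestination;
                while ((oldDestination = [destinations nextObject]) != nil) {
                    NSManagedObject *newDestination =
                        [newObjectsByOldID objectForKey:[oldDestination objectID]];
                    if (newDestination != nil) {
                        [newDestinations addObject:newDestination];
                    }
                }
            } else {
                NSManagedObject *oldDestination = [source valueForKey:key];
                if (oldDestination != nil) {
                    [newSource setValue:[newObjectsByOldID
                                            objectForKey:[oldDestination objectID]]
                                 forKey:key];
                }
            }
        }
    }
    return [newContext save:error];
}
```

Setting both sides of an inverse relationship is redundant but harmless, which keeps the second loop simple.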
If you need to avoid having two versions of the same class in the same runtime or if you do not want to rename your model classes with each conflicting update to the model, you can load a previous version of your model and, before using it, edit all entities such that they no longer use custom classes (so that all entities are instantiated using NSManagedObject). Then you can use this model to load the data from the old store and populate the new store using your new model. You can also temporarily disable validation if necessary.
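Removing the custom classes from a previous model might look like the sketch below. The function name and the modelURL parameter (the location of the compiled previous model) are hypothetical. The edit has to happen before the model is used to create a persistent store coordinator, because a model cannot be changed once it is in use.

```objc
#import <CoreData/CoreData.h>

// Loads the previous model and makes every entity instantiate plain
// NSManagedObject instances instead of custom subclasses.
// modelURL is a hypothetical parameter: where the old compiled model lives.
static NSManagedObjectModel *MyGenericModelAtURL(NSURL *modelURL)
{
    NSManagedObjectModel *model =
        [[[NSManagedObjectModel alloc] initWithContentsOfURL:modelURL] autorelease];
    NSEnumerator *entityEnumerator = [[model entities] objectEnumerator];
    NSEntityDescription *entity;
    while ((entity = [entityEnumerator nextObject]) != nil) {
        [entity setManagedObjectClassName:@"NSManagedObject"];
    }
    return model;
}
```

Objects loaded through such a model have no custom accessor methods, so the migration code should read their properties with key-value coding (valueForKey:).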
If you have a very large data set that is impractical to bring into memory all at once, you can adopt other strategies. You can iterate through your (old) model and fetch the entities one by one. You can start by converting all entities that are relatively standalone (for example, those that are more or less select lists like categories or priorities or other items that typically show up in pop-up menus in the user interface), creating an old-global ID to new-global ID mapping as each is converted. You then traverse the collection of "root" entities one by one, making new instances in the new model, converting as needed. You can limit memory footprint first by restricting the number of instances in memory at any point in time through the use of selective fetch requests, then by trimming the object graph as necessary (see “Memory Management Using Core Data”).
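One way to convert a large store in batches is sketched below, under several assumptions. The function name, the MyConvertFunction callback type, and the batchPredicate parameter are hypothetical: the caller supplies a predicate that selects one manageable slice of an entity, and a conversion function that creates and returns the new object for one old object. After each batch is saved, the new objects have permanent IDs, so the sketch records the old-ID to new-ID mapping at that point and then turns both sets of objects back into faults to trim the object graph.

```objc
#import <CoreData/CoreData.h>

// Hypothetical callback: creates, fills in, and returns (non-nil) the new
// object for oldObject. It can resolve already-converted destinations with
// [newContext objectWithID:[newIDsByOldID objectForKey:oldID]].
typedef NSManagedObject *(*MyConvertFunction)(NSManagedObject *oldObject,
                                              NSManagedObjectContext *newContext,
                                              NSDictionary *newIDsByOldID);

static BOOL MyConvertBatch(NSString *entityName,
                           NSPredicate *batchPredicate,
                           NSManagedObjectContext *oldContext,
                           NSManagedObjectContext *newContext,
                           NSMutableDictionary *newIDsByOldID,
                           MyConvertFunction convert,
                           NSError **error)
{
    // Selective fetch: only the slice of the entity matched by the predicate.
    NSFetchRequest *request = [[[NSFetchRequest alloc] init] autorelease];
    [request setEntity:[NSEntityDescription entityForName:entityName
                                   inManagedObjectContext:oldContext]];
    [request setPredicate:batchPredicate];
    NSArray *batch = [oldContext executeFetchRequest:request error:error];
    if (batch == nil) {
        return NO;
    }

    NSMutableArray *newObjects = [NSMutableArray arrayWithCapacity:[batch count]];
    NSEnumerator *oldEnumerator = [batch objectEnumerator];
    NSManagedObject *oldObject;
    while ((oldObject = [oldEnumerator nextObject]) != nil) {
        [newObjects addObject:convert(oldObject, newContext, newIDsByOldID)];
    }
    if (![newContext save:error]) {
        return NO;
    }

    // The save assigned permanent IDs, so record the mapping now, then turn
    // the objects back into faults to release their property values.
    oldEnumerator = [batch objectEnumerator];
    NSEnumerator *newEnumerator = [newObjects objectEnumerator];
    NSManagedObject *newObject;
    while ((oldObject = [oldEnumerator nextObject]) != nil
           && (newObject = [newEnumerator nextObject]) != nil) {
        [newIDsByOldID setObject:[newObject objectID]
                          forKey:[oldObject objectID]];
        [oldContext refreshObject:oldObject mergeChanges:NO];
        [newContext refreshObject:newObject mergeChanges:NO];
    }
    return YES;
}
```

Calling this first for the standalone entities and then for the root entities follows the order suggested in the text, since the mapping for the standalone objects is complete by the time the roots need it.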
Development Strategies
Implementing a proper versioning strategy is a non-trivial task. Keep in mind, however, that the model typically should not change frequently outside of the development environment. If you are mutating models at runtime (and you are not developing an application specifically for creating models), then you should consider whether your model sufficiently describes your application's data.
During the development process it is still likely that you will want to create test data sets, and recreating these for every iteration of the schema can be time-consuming. Nevertheless, time spent early in the project on supporting data migration is unlikely to be wasted in the long term if you need to support versioning in future releases.
If you are using test-driven development, you should write code in your tests that generates clean test data on the fly, rather than hand-craft data files containing test data or keep around the generated test data files. If you do this, you have no data migration issues to worry about—you just need to refactor the code that generates test data when you refactor your model. Refactoring tests and test data as the code under test changes is normal in test-driven development, and from this perspective you should view your data model as "code."
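A test fixture that generates its data on the fly might resemble the following sketch. The function name, the "Recipe" entity, and its "name" attribute are hypothetical placeholders for whatever your model defines. The store is kept in memory, so no data file survives a test run and there is nothing to migrate.

```objc
#import <CoreData/CoreData.h>

// Builds a fresh in-memory stack for the given model and fills it with test
// data. "Recipe" and "name" are hypothetical; substitute your own entities.
// Returns nil (and sets *error) if the store cannot be created.
static NSManagedObjectContext *MyTestContextWithModel(NSManagedObjectModel *model,
                                                      NSError **error)
{
    NSPersistentStoreCoordinator *coordinator =
        [[[NSPersistentStoreCoordinator alloc]
            initWithManagedObjectModel:model] autorelease];
    if ([coordinator addPersistentStoreWithType:NSInMemoryStoreType
                                  configuration:nil
                                            URL:nil
                                        options:nil
                                          error:error] == nil) {
        return nil;
    }
    // The context retains the coordinator, so the stack stays alive as long
    // as the caller holds on to the returned context.
    NSManagedObjectContext *context =
        [[[NSManagedObjectContext alloc] init] autorelease];
    [context setPersistentStoreCoordinator:coordinator];

    // Generate the test data in code; update this when the model changes.
    NSManagedObject *recipe =
        [NSEntityDescription insertNewObjectForEntityForName:@"Recipe"
                                      inManagedObjectContext:context];
    [recipe setValue:@"Test Recipe" forKey:@"name"];
    return context;
}
```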
© 2004, 2009 Apple Inc. All Rights Reserved. (Last updated: 2009-03-04)