Legacy Document

Important: The information in this document is obsolete and should not be used for new development.

NSCharacterSet

Package
com.apple.cocoa.foundation

Overview

An NSCharacterSet object represents a set of Unicode 3.2-compliant characters. String and NSScanner objects use character sets to group characters for search operations, so that a search can match any character in a given set. The two classes, NSCharacterSet and NSMutableCharacterSet, declare the programmatic interface for static and dynamic character sets, respectively.

The objects you create using these classes are referred to as character set objects (and when no confusion will result, merely as character sets).

The NSCharacterSet class declares the programmatic interface for an object that manages a set of Unicode characters (see the NSStringReference class specification for information on Unicode). NSCharacterSet’s principal primitive method, characterIsMember, provides the basis for all other instance methods in its interface. A subclass of NSCharacterSet needs to implement only this method to behave correctly. For optimal performance, a subclass should also override bitmapRepresentation, whose default implementation invokes characterIsMember for every possible Unicode value.
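
The subclassing contract described above can be sketched in plain Java. This is only an illustration under stated assumptions: SketchCharacterSet and VowelSet are hypothetical stand-ins and not part of com.apple.cocoa.foundation, a byte array stands in for NSData, and the bit order within each byte is assumed. The default bitmapRepresentation probes all 65,536 values through the one primitive method, which is why a subclass that already holds a bitmap may prefer to override it.

```java
// Hypothetical stand-in for illustration; not the com.apple.cocoa.foundation class.
abstract class SketchCharacterSet {
    // The single primitive method a subclass must supply.
    abstract boolean characterIsMember(char aCharacter);

    // Default implementation: asks the primitive about every 16-bit value.
    // A subclass that stores a bitmap could override this and return it directly.
    byte[] bitmapRepresentation() {
        byte[] bitmap = new byte[8192]; // 2^16 bits
        for (int n = 0; n <= 0xFFFF; n++) {
            if (characterIsMember((char) n)) {
                // Assumed layout: bit (n mod 8) of byte (n div 8).
                bitmap[n >> 3] |= (byte) (1 << (n & 7));
            }
        }
        return bitmap;
    }
}

// Example subclass that implements only the primitive: ASCII vowels.
final class VowelSet extends SketchCharacterSet {
    @Override
    boolean characterIsMember(char aCharacter) {
        return "aeiouAEIOU".indexOf(aCharacter) >= 0;
    }
}
```

With these definitions, new VowelSet().bitmapRepresentation() works without VowelSet defining it, at the cost of one characterIsMember call per possible value.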

The mutable subclass of NSCharacterSet is NSMutableCharacterSet.

Tasks

Constructors

Creating a Standard Character Set

Opening a Character Set File

Testing Set Membership

Getting a Binary Representation

Deriving New Character Sets

Constructors

NSCharacterSet

Creates an empty NSCharacterSet.

public NSCharacterSet()

Creates a character set containing characters determined by the bitmap representation aData.

public NSCharacterSet(NSData aData)

Discussion

This capability is useful for creating a character set object with data from a file or other external data source.

Creates a character set containing characters whose Unicode values are given by aRange.

public NSCharacterSet(NSRange aRange)

Discussion

aRange.location is the value of the first character, and aRange.location + aRange.length - 1 is the value of the last. Creates an empty character set if aRange.length is 0.
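
The range arithmetic can be checked with a small plain-Java sketch. RangeSketch and fromRange are hypothetical names used only for illustration, and java.util.BitSet stands in for the character set; nothing here calls the Cocoa-Java classes.

```java
import java.util.BitSet;

// Hypothetical helper for illustration; not part of com.apple.cocoa.foundation.
final class RangeSketch {
    // Marks every value from location through location + length - 1 as a member.
    static BitSet fromRange(int location, int length) {
        BitSet members = new BitSet(1 << 16);
        if (length > 0) {
            // BitSet.set takes an exclusive end index, so the last member
            // is location + length - 1.
            members.set(location, location + length);
        }
        return members;
    }
}
```

For example, RangeSketch.fromRange('a', 26) marks the values for 'a' through 'z', and a length of 0 leaves the set empty.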

Creates a character set containing the characters in aString.

public NSCharacterSet(String aString)

Discussion

Creates an empty character set if aString is empty.

Static Methods

alphanumericCharacterSet

Returns a character set containing the characters in the categories Letters, Marks, and Numbers.

public static NSCharacterSet alphanumericCharacterSet()

Discussion

Informally, this set is the set of all characters used as basic units of alphabets, syllabaries, ideographs, and digits.

capitalizedLetterCharacterSet

Returns a character set containing the characters in the category of Titlecase Letters.

public static NSCharacterSet capitalizedLetterCharacterSet()

characterSetWithContentsOfFile

Returns a character set read from the bitmap representation stored in the file at path, which must end with the extension .bitmap.

public static NSCharacterSet characterSetWithContentsOfFile(String path)

Discussion

This method doesn’t use filenames to check for the uniqueness of the character sets it creates. To prevent duplication of character sets in memory, cache them and make them available through an API that checks whether the requested set has already been loaded.
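
One way to follow this advice is a small cache keyed by path. The sketch below is an assumption-laden illustration in plain Java: BitmapFileCache and bitmapForPath are hypothetical names, the cached value is the raw file contents as a byte array rather than an NSCharacterSet, and the key is the path string exactly as passed in.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical helper for illustration; not part of com.apple.cocoa.foundation.
final class BitmapFileCache {
    // Loaded bitmaps, keyed by the path string exactly as the caller passed it.
    private static final Map<String, byte[]> LOADED = new ConcurrentHashMap<>();

    // Returns the cached bitmap for path, reading the file only on the first request.
    static byte[] bitmapForPath(String path) {
        return LOADED.computeIfAbsent(path, p -> {
            try {
                return Files.readAllBytes(Paths.get(p));
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        });
    }
}
```

Because the key is the literal path string, two different spellings of the same file would still be loaded twice; normalizing the path before the lookup is one possible refinement.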

controlCharacterSet

Returns a character set containing the characters in the categories of Control or Format Characters.

public static NSCharacterSet controlCharacterSet()

Discussion

The Control category consists specifically of the Unicode values U+0000 to U+001F and U+007F to U+009F.

decimalDigitCharacterSet

Returns a character set containing the characters in the category of Decimal Numbers.

public static NSCharacterSet decimalDigitCharacterSet()

Discussion

Informally, this set is the set of all characters used to represent the decimal values 0 through 9. These characters include, for example, the decimal digits of the Indic scripts and Arabic.
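
The same Unicode category can be explored with the standard java.lang.Character class, as a rough analogy only. DecimalDigitSketch is a hypothetical name, and the Java runtime follows its own Unicode version, so its results are not guaranteed to match an NSCharacterSet built against Unicode 3.2.

```java
// Hypothetical helper for illustration; uses java.lang.Character,
// not the com.apple.cocoa.foundation classes.
final class DecimalDigitSketch {
    // True when the character's general category is Decimal Number (Nd).
    static boolean isDecimalDigit(char c) {
        return Character.getType(c) == Character.DECIMAL_DIGIT_NUMBER;
    }
}
```

Under this test, the ASCII digit 7, the Arabic-Indic digit three (U+0663), and the Devanagari digit three (U+0969) all qualify, while the Roman numeral eight (U+2167) does not, because it belongs to the Letter Number category.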

decomposableCharacterSet

Returns a character set containing all individual Unicode characters that can also be represented as composed character sequences (such as for letters with accents), by the definition of “standard decomposition” in version 3.2 of the Unicode character encoding standard.

public static NSCharacterSet decomposableCharacterSet()

Discussion

These characters include compatibility characters as well as precomposed characters.

Note: This character set doesn’t currently include the Hangul characters defined in version 2.0 of the Unicode standard.

illegalCharacterSet

Returns a character set containing values in the category of Non-Characters or that have not yet been defined in version 3.2 of the Unicode standard.

public static NSCharacterSet illegalCharacterSet()

letterCharacterSet

Returns a character set containing the characters in the categories Letters and Marks.

public static NSCharacterSet letterCharacterSet()

Discussion

Informally, this set is the set of all characters used as letters of alphabets and ideographs.

lowercaseLetterCharacterSet

Returns a character set containing the characters in the category of Lowercase Letters.

public static NSCharacterSet lowercaseLetterCharacterSet()

Discussion

Informally, this set is the set of all characters used as lowercase letters in alphabets that make case distinctions.

nonBaseCharacterSet

Returns a character set containing the characters in the category of Marks.

public static NSCharacterSet nonBaseCharacterSet()

Discussion

This set is also defined as all legal Unicode characters with a nonspacing priority greater than 0. Informally, this set is the set of all characters used as modifiers of base characters.

punctuationCharacterSet

Returns a character set containing the characters in the category of Punctuation.

public static NSCharacterSet punctuationCharacterSet()

Discussion

Informally, this set is the set of all nonwhitespace characters used to separate linguistic units in scripts, such as periods, dashes, parentheses, and so on.

symbolCharacterSet

Returns a character set containing the characters in the category of Symbols.

public static NSCharacterSet symbolCharacterSet()

Discussion

These characters include, for example, the dollar sign ($) and the plus sign (+).

uppercaseLetterCharacterSet

Returns a character set containing the characters in the categories of Uppercase Letters and Titlecase Letters.

public static NSCharacterSet uppercaseLetterCharacterSet()

Discussion

Informally, this set is the set of all characters used as uppercase letters in alphabets that make case distinctions.

whitespaceAndNewlineCharacterSet

Returns a character set containing only the whitespace characters space (U+0020) and tab (U+0009), together with the newline and next-line characters (U+000A through U+000D, and U+0085).

public static NSCharacterSet whitespaceAndNewlineCharacterSet()

whitespaceCharacterSet

Returns a character set containing only the in-line whitespace characters space (U+0020) and tab (U+0009).

public static NSCharacterSet whitespaceCharacterSet()

Discussion

This set doesn’t contain the newline or carriage return characters.
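
The difference between the two whitespace sets can be written out as two membership tests. This is a plain-Java sketch built only from the code points listed in this reference; WhitespaceSketch and its method names are hypothetical.

```java
// Hypothetical helper for illustration; not part of com.apple.cocoa.foundation.
final class WhitespaceSketch {
    // Mirrors whitespaceCharacterSet: space (U+0020) and tab (U+0009) only.
    static boolean isInlineWhitespace(char c) {
        return c == ' ' || c == '\t';
    }

    // Mirrors whitespaceAndNewlineCharacterSet: the in-line set plus
    // U+000A through U+000D and U+0085.
    static boolean isWhitespaceOrNewline(char c) {
        return isInlineWhitespace(c) || (c >= 0x000A && c <= 0x000D) || c == 0x0085;
    }
}
```

A carriage return, for instance, passes isWhitespaceOrNewline but fails isInlineWhitespace.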

Instance Methods

bitmapRepresentation

Returns an NSData object encoding the receiving character set in binary format.

public NSData bitmapRepresentation()

Discussion

This format is suitable for saving to a file or otherwise transmitting or archiving.

A raw bitmap representation of a character set is a byte array of 2^16 bits (that is, 8192 bytes). The value of the bit at position n represents the presence in the character set of the character with decimal Unicode value n.
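
Reading and writing such a bitmap takes one shift and one mask per character. The sketch below is plain Java with hypothetical names (RawBitmap, contains, add) and operates on a byte array rather than NSData. The text above fixes only the bit position n; storing it as bit (n mod 8) of byte (n div 8), least significant bit first, is an assumption.

```java
// Hypothetical helper for illustration; not part of com.apple.cocoa.foundation.
final class RawBitmap {
    // 2^16 bits packed eight to a byte.
    static final int SIZE_IN_BYTES = (1 << 16) / 8;

    // True when the bit for Unicode value n is set (assumed layout, see above).
    static boolean contains(byte[] bitmap, int n) {
        return (bitmap[n >> 3] & (1 << (n & 7))) != 0;
    }

    // Sets the bit for Unicode value n.
    static void add(byte[] bitmap, int n) {
        bitmap[n >> 3] |= (byte) (1 << (n & 7));
    }
}
```

A bitmap allocated as new byte[RawBitmap.SIZE_IN_BYTES] has room for every value from 0 through 65,535.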

characterIsMember

Returns true if aCharacter is in the receiving character set, false if it isn’t.

public boolean characterIsMember(char aCharacter)

characterSetByIntersectingCharacterSet

Returns a character set containing only characters that exist in both the receiver and otherSet.

public NSCharacterSet characterSetByIntersectingCharacterSet(NSCharacterSet otherSet)

characterSetByInvertingCharacterSet

Returns a character set containing only characters that do not exist in the receiver. Inverting an immutable character set is much more efficient than inverting a mutable character set.

public NSCharacterSet characterSetByInvertingCharacterSet()

characterSetBySubtractingCharacterSet

Returns a character set containing all the characters in the receiver except for those in otherSet.

public NSCharacterSet characterSetBySubtractingCharacterSet(NSCharacterSet otherSet)

characterSetByUnioningCharacterSet

Returns a character set containing all characters that exist in either the receiver or otherSet.

public NSCharacterSet characterSetByUnioningCharacterSet(NSCharacterSet otherSet)

isSupersetOfSet

Returns true if the receiving character set is a superset of theOtherSet, false if it isn’t.

public boolean isSupersetOfSet(NSCharacterSet theOtherSet)
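
The derivation methods and the superset test correspond to ordinary set algebra. As a rough model only, the sketch below expresses them with java.util.BitSet over the 16-bit value space; SetAlgebraSketch and its method names are hypothetical, and each method returns a new set and leaves its arguments unchanged, matching the "returns a character set" wording above.

```java
import java.util.BitSet;

// Hypothetical helper for illustration; not part of com.apple.cocoa.foundation.
final class SetAlgebraSketch {
    static final int UNIVERSE = 1 << 16; // one bit per 16-bit character value

    static BitSet intersecting(BitSet receiver, BitSet other) {
        BitSet result = (BitSet) receiver.clone();
        result.and(other);      // keep members present in both
        return result;
    }

    static BitSet unioning(BitSet receiver, BitSet other) {
        BitSet result = (BitSet) receiver.clone();
        result.or(other);       // keep members present in either
        return result;
    }

    static BitSet subtracting(BitSet receiver, BitSet other) {
        BitSet result = (BitSet) receiver.clone();
        result.andNot(other);   // drop members that are also in other
        return result;
    }

    static BitSet inverting(BitSet receiver) {
        BitSet result = (BitSet) receiver.clone();
        result.flip(0, UNIVERSE); // complement within the 16-bit value space
        return result;
    }

    static boolean isSupersetOf(BitSet receiver, BitSet other) {
        BitSet missing = (BitSet) other.clone();
        missing.andNot(receiver); // members of other that receiver lacks
        return missing.isEmpty();
    }
}
```

For example, with a set of the lowercase letters 'a' through 'z' and a set of the five lowercase vowels, subtracting yields the 21 consonants, and the letters set is a superset of the vowels set but not the reverse.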

© 1997, 2006 Apple Computer, Inc. All Rights Reserved. (Last updated: 2006-07-24)

