Important: The information in this document is obsolete and should not be used for new development.
Package: com.apple.cocoa.foundation
An NSCharacterSet object represents a set of Unicode 3.2 compliant characters. String and NSScanner objects use NSCharacterSets to group characters together for searching operations, so that they can find any of a particular set of characters during a search. The two classes, NSCharacterSet and NSMutableCharacterSet, declare the programmatic interface for static and dynamic character sets, respectively.
The objects you create using these classes are referred to as character set objects (and when no confusion will result, merely as character sets).
The NSCharacterSet class declares the programmatic interface for an object that manages a set of Unicode characters (see the NSStringReference class specification for information on Unicode). NSCharacterSet’s principal primitive method, characterIsMember, provides the basis for all other instance methods in its interface. A subclass of NSCharacterSet needs only to implement this method for proper behavior. For optimal performance, a subclass should also override bitmapRepresentation, which otherwise works by invoking characterIsMember for every possible Unicode value.
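The primitive-method pattern described above can be sketched in plain Java. This is an illustration only and does not use the Cocoa-Java classes: CharacterSetSketch and AsciiVowels are hypothetical names, and a java.util.BitSet stands in for the NSData bitmap.

```java
import java.util.BitSet;

// Hypothetical stand-in for the pattern; not the Cocoa-Java class.
abstract class CharacterSetSketch {
    // The single primitive every subclass must implement.
    abstract boolean characterIsMember(char c);

    // Default (slow) path: probe every 16-bit value once.
    // A subclass that already stores its members can override this.
    BitSet bitmapRepresentation() {
        BitSet bits = new BitSet(1 << 16);
        for (int n = 0; n <= 0xFFFF; n++) {
            if (characterIsMember((char) n)) {
                bits.set(n);
            }
        }
        return bits;
    }
}

// A subclass that implements only the primitive.
class AsciiVowels extends CharacterSetSketch {
    @Override
    boolean characterIsMember(char c) {
        return "aeiouAEIOU".indexOf(c) >= 0;
    }
}
```

Because the default path makes 65,536 membership calls, a subclass with its own storage would normally return that storage directly instead.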
The mutable subclass of NSCharacterSet is NSMutableCharacterSet.
alphanumericCharacterSet
capitalizedLetterCharacterSet
controlCharacterSet
decimalDigitCharacterSet
decomposableCharacterSet
illegalCharacterSet
letterCharacterSet
lowercaseLetterCharacterSet
nonBaseCharacterSet
punctuationCharacterSet
symbolCharacterSet
uppercaseLetterCharacterSet
whitespaceAndNewlineCharacterSet
whitespaceCharacterSet
characterSetByIntersectingCharacterSet
characterSetByInvertingCharacterSet
characterSetBySubtractingCharacterSet
characterSetByUnioningCharacterSet
Creates an empty NSCharacterSet.
public NSCharacterSet()
Creates a character set containing characters determined by the bitmap representation aData.
public NSCharacterSet(NSData aData)
This capability is useful for creating a character set object with data from a file or other external data source.
Creates a character set containing characters whose Unicode values are given by aRange.
public NSCharacterSet(NSRange aRange)
aRange.location is the value of the first character, and aRange.location + aRange.length – 1 is the value of the last. Returns an empty character set if aRange.length is 0.
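The range arithmetic can be checked with a small plain-Java sketch. RangeSketch and inRange are hypothetical names used only for illustration; the location and length parameters mirror the two fields of aRange.

```java
class RangeSketch {
    // True if c falls in [location, location + length - 1].
    // A zero-length range contains nothing.
    static boolean inRange(int location, int length, char c) {
        return length > 0 && c >= location && c <= location + length - 1;
    }
}
```

For example, a location of 0x30 with a length of 10 covers the ASCII digits 0 through 9 and nothing else.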
Creates a character set containing the characters in aString.
public NSCharacterSet(String aString)
Returns an empty character set if aString is empty.
Returns a character set containing the characters in the categories Letters, Marks, and Numbers.
public static NSCharacterSet alphanumericCharacterSet()
Informally, this set is the set of all characters used as basic units of alphabets, syllabaries, ideographs, and digits.
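A rough way to reason about category-based sets is to map them onto the general categories reported by java.lang.Character. The sketch below is an approximation in plain Java, not the Cocoa-Java implementation: CategorySketch and its methods are hypothetical names, and the Java runtime follows a newer Unicode version than 3.2, so membership may differ for some characters.

```java
class CategorySketch {
    // Letters (L*) and Marks (M*).
    static boolean isLetterOrMark(char c) {
        switch (Character.getType(c)) {
            case Character.UPPERCASE_LETTER:
            case Character.LOWERCASE_LETTER:
            case Character.TITLECASE_LETTER:
            case Character.MODIFIER_LETTER:
            case Character.OTHER_LETTER:
            case Character.NON_SPACING_MARK:
            case Character.ENCLOSING_MARK:
            case Character.COMBINING_SPACING_MARK:
                return true;
            default:
                return false;
        }
    }

    // Numbers (N*).
    static boolean isNumber(char c) {
        switch (Character.getType(c)) {
            case Character.DECIMAL_DIGIT_NUMBER:
            case Character.LETTER_NUMBER:
            case Character.OTHER_NUMBER:
                return true;
            default:
                return false;
        }
    }

    // Letters, Marks, and Numbers together.
    static boolean isAlphanumeric(char c) {
        return isLetterOrMark(c) || isNumber(c);
    }
}
```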
Returns a character set containing the characters in the category of Titlecase Letters.
public static NSCharacterSet capitalizedLetterCharacterSet()
Returns a character set read from the bitmap representation stored in the file at path, which must end with the extension .bitmap.
public static NSCharacterSet characterSetWithContentsOfFile(String path)
This method doesn’t use filenames to check for the uniqueness of the character sets it creates. To prevent duplication of character sets in memory, cache them and make them available through an API that checks whether the requested set has already been loaded.
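One possible shape for such a cache is sketched below in plain Java. BitmapCacheSketch is a hypothetical name, the loader function is supplied by the caller, and a byte array stands in for the loaded character set so the example does not depend on the Cocoa-Java classes.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

class BitmapCacheSketch {
    // Loaded sets, keyed by the path they were read from.
    private final Map<String, byte[]> loaded = new HashMap<>();
    // Caller-supplied loader, invoked at most once per path.
    private final Function<String, byte[]> loader;

    BitmapCacheSketch(Function<String, byte[]> loader) {
        this.loader = loader;
    }

    // Returns the cached set for path, loading it only on first request.
    synchronized byte[] get(String path) {
        return loaded.computeIfAbsent(path, loader);
    }
}
```

Callers that request the same path twice receive the same object, so only one copy of each set stays in memory.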
Returns a character set containing the characters in the categories of Control or Format Characters.
public static NSCharacterSet controlCharacterSet()
These characters are specifically the Unicode values U+0000 to U+001F and U+007F to U+009F.
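The two ranges listed above translate directly into a predicate. The sketch below is plain Java with a hypothetical class name; it covers only the listed ranges. Java's Character.isISOControl tests the same two ranges, which gives a convenient cross-check.

```java
class ControlSketch {
    // True for U+0000 to U+001F and U+007F to U+009F.
    static boolean isControl(char c) {
        return c <= 0x001F || (c >= 0x007F && c <= 0x009F);
    }
}
```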
Returns a character set containing the characters in the category of Decimal Numbers.
public static NSCharacterSet decimalDigitCharacterSet()
Informally, this set is the set of all characters used to represent the decimal values 0 through 9. These characters include, for example, the decimal digits of the Indic scripts and Arabic.
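To see that decimal digits extend beyond ASCII, the following plain-Java sketch uses java.lang.Character as an approximation of the Decimal Numbers category. DigitSketch is a hypothetical name, and the Java runtime's Unicode tables are newer than version 3.2.

```java
class DigitSketch {
    // True if c is in the general category Decimal Number (Nd).
    static boolean isDecimalDigit(char c) {
        return Character.getType(c) == Character.DECIMAL_DIGIT_NUMBER;
    }

    // Numeric value 0 through 9 of a decimal digit, or -1 if c is not one.
    static int decimalValue(char c) {
        return Character.digit(c, 10);
    }
}
```

For instance, the Arabic-Indic digit three (U+0663) and the Devanagari digit three (U+0969) are both decimal digits with the value 3, while the Roman numeral eight (U+2167) is a number but not a decimal digit.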
Returns a character set containing all individual Unicode characters that can also be represented as composed character sequences (such as for letters with accents), by the definition of “standard decomposition” in version 3.2 of the Unicode character encoding standard.
public static NSCharacterSet decomposableCharacterSet()
These characters include compatibility characters as well as precomposed characters.
Note: This character set doesn’t currently include the Hangul characters defined in version 2.0 of the Unicode standard.
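As an approximation, a character can be treated as decomposable when its compatibility decomposition differs from the character itself. The sketch below uses java.text.Normalizer in plain Java; DecomposableSketch is a hypothetical name, and the result is not guaranteed to match this class. In particular, Normalizer follows a newer Unicode version and does decompose Hangul syllables, unlike the set described in the note above.

```java
import java.text.Normalizer;

class DecomposableSketch {
    // True if the NFKD form of c differs from c itself.
    static boolean isDecomposable(char c) {
        if (Character.isSurrogate(c)) {
            return false; // a lone surrogate is not a complete character
        }
        String s = String.valueOf(c);
        return !Normalizer.normalize(s, Normalizer.Form.NFKD).equals(s);
    }
}
```

Under this test the precomposed letter U+00E9 (e with acute) and the compatibility ligature U+FB01 (fi) are decomposable, while a plain ASCII e is not.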
Returns a character set containing values in the category of Non-Characters or that have not yet been defined in version 3.2 of the Unicode standard.
public static NSCharacterSet illegalCharacterSet()
Returns a character set containing the characters in the categories Letters and Marks.
public static NSCharacterSet letterCharacterSet()
Informally, this set is the set of all characters used as letters of alphabets and ideographs.
Returns a character set containing the characters in the category of Lowercase Letters.
public static NSCharacterSet lowercaseLetterCharacterSet()
Informally, this set is the set of all characters used as lowercase letters in alphabets that make case distinctions.
Returns a character set containing the characters in the category of Marks.
public static NSCharacterSet nonBaseCharacterSet()
This set is also defined as all legal Unicode characters with a nonspacing priority greater than 0. Informally, this set is the set of all characters used as modifiers of base characters.
Returns a character set containing the characters in the category of Punctuation.
public static NSCharacterSet punctuationCharacterSet()
Informally, this set is the set of all nonwhitespace characters used to separate linguistic units in scripts, such as periods, dashes, parentheses, and so on.
Returns a character set containing the characters in the category of Symbols.
public static NSCharacterSet symbolCharacterSet()
These characters include, for example, the dollar sign ($) and the plus (+) sign.
Returns a character set containing the characters in the categories of Uppercase Letters and Titlecase Letters.
public static NSCharacterSet uppercaseLetterCharacterSet()
Informally, this set is the set of all characters used as uppercase letters in alphabets that make case distinctions.
Returns a character set containing only the whitespace characters space (U+0020) and tab (U+0009) and the newline and nextline characters (U+000A–U+000D, U+0085).
public static NSCharacterSet whitespaceAndNewlineCharacterSet()
Returns a character set containing only the in-line whitespace characters space (U+0020) and tab (U+0009).
public static NSCharacterSet whitespaceCharacterSet()
This set doesn’t contain the newline or carriage return characters.
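The difference between the two whitespace sets can be written out as a pair of predicates. This is a plain-Java sketch with hypothetical names that encodes only the code points listed in the descriptions above.

```java
class WhitespaceSketch {
    // Space (U+0020) and tab (U+0009) only.
    static boolean isInlineWhitespace(char c) {
        return c == 0x0020 || c == 0x0009;
    }

    // In-line whitespace plus U+000A through U+000D and U+0085.
    static boolean isWhitespaceOrNewline(char c) {
        return isInlineWhitespace(c)
            || (c >= 0x000A && c <= 0x000D)
            || c == 0x0085;
    }
}
```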
Returns an NSData object encoding the receiving character set in binary format.
public NSData bitmapRepresentation()
This format is suitable for saving to a file or otherwise transmitting or archiving.
A raw bitmap representation of a character set is a byte array of 2^16 bits (that is, 8192 bytes). The value of the bit at position n represents the presence in the character set of the character with decimal Unicode value n.
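Given the raw bytes, membership of a character comes down to one index and one mask. The sketch below is plain Java with hypothetical names. The bit order within each byte is not stated here; the code assumes the least significant bit comes first, so bit n lives in byte n >> 3 under the mask 1 << (n & 7). Verify that assumption against real data before relying on it.

```java
class BitmapSketch {
    static final int BITMAP_BYTES = 8192; // 2^16 bits

    // True if the bit for c is set (assumes LSB-first order within a byte).
    static boolean isMember(byte[] bitmap, char c) {
        int n = c;
        return (bitmap[n >> 3] & (1 << (n & 7))) != 0;
    }

    // Sets the bit for c (same bit-order assumption).
    static void add(byte[] bitmap, char c) {
        int n = c;
        bitmap[n >> 3] |= (byte) (1 << (n & 7));
    }
}
```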
Returns true if aCharacter is in the receiving character set, false if it isn’t.
public boolean characterIsMember(char aCharacter)
Returns a character set containing only characters that exist in both the receiver and otherSet.
public NSCharacterSet characterSetByIntersectingCharacterSet(NSCharacterSet otherSet)
Returns a character set containing only characters that do not exist in the receiver. Inverting an immutable character set is much more efficient than inverting a mutable character set.
public NSCharacterSet characterSetByInvertingCharacterSet()
See also invertCharacterSet (NSMutableCharacterSet).
Returns a character set containing all the characters in the receiver except for those in otherSet.
public NSCharacterSet characterSetBySubtractingCharacterSet(NSCharacterSet otherSet)
Returns a character set containing all characters that exist in either the receiver or otherSet.
public NSCharacterSet characterSetByUnioningCharacterSet(NSCharacterSet otherSet)
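The four derived-set methods correspond to ordinary set algebra over the 16-bit character space. The plain-Java sketch below models each set as a java.util.BitSet; SetAlgebraSketch and its methods are hypothetical names, not the Cocoa-Java API. Like the methods above, each operation returns a new set and leaves its operands unchanged.

```java
import java.util.BitSet;

class SetAlgebraSketch {
    static final int UNIVERSE = 1 << 16; // all 16-bit character values

    // Builds a set from the characters in a string.
    static BitSet of(String chars) {
        BitSet s = new BitSet(UNIVERSE);
        for (int i = 0; i < chars.length(); i++) {
            s.set(chars.charAt(i));
        }
        return s;
    }

    static BitSet intersect(BitSet a, BitSet b) {
        BitSet r = (BitSet) a.clone();
        r.and(b);
        return r;
    }

    static BitSet invert(BitSet a) {
        BitSet r = (BitSet) a.clone();
        r.flip(0, UNIVERSE);
        return r;
    }

    static BitSet subtract(BitSet a, BitSet b) {
        BitSet r = (BitSet) a.clone();
        r.andNot(b);
        return r;
    }

    static BitSet union(BitSet a, BitSet b) {
        BitSet r = (BitSet) a.clone();
        r.or(b);
        return r;
    }
}
```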
Returns true if the receiving character set is a superset of theOtherSet, false if it isn’t.
public boolean isSupersetOfSet(NSCharacterSet theOtherSet)
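The superset test amounts to checking that nothing in the other set is missing from the receiver. The sketch below expresses that in plain Java with java.util.BitSet; SupersetSketch is a hypothetical name, not part of the Cocoa-Java API.

```java
import java.util.BitSet;

class SupersetSketch {
    // True if every member of other is also a member of receiver.
    static boolean isSupersetOf(BitSet receiver, BitSet other) {
        BitSet missing = (BitSet) other.clone();
        missing.andNot(receiver); // members of other that receiver lacks
        return missing.isEmpty();
    }
}
```

Under this definition every set is a superset of itself and of the empty set.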
© 1997, 2006 Apple Computer, Inc. All Rights Reserved. (Last updated: 2006-07-24)