
Language Analysis Manager Reference

Framework: ApplicationServices/ApplicationServices.h
Declared in: LanguageAnalysis.h

Overview

The Language Analysis Manager application programming interface (API) is a shared library designed to analyze morphemes in text. It is a general-purpose API that does not depend on a particular language, morpheme-analysis algorithm, or application. The Language Analysis Manager is not a framework for creating internationally aware applications. To make your applications work correctly with various languages, you can use APIs such as the Script Manager and Text Utilities.

The Language Analysis Manager (LAM) provides your application with morphological analysis capability, and is designed to work with a language analysis engine. Using the Language Analysis Manager, your application can manage an analysis engine and create environments and contexts in which morpheme analysis can occur. This version of the Language Analysis Manager works only with a Japanese analysis engine.

Functions by Task

Getting The Library Version

Handling Environments

Opening and Closing Contexts

Managing Dictionaries

Analyzing Text

Data Types

HomographAccent

Defines a data type for a homographic accent.

typedef UInt8 HomographAccent;

Availability
Declared In
LanguageAnalysis.h

HomographDicInfoRec

Contains dictionary information for a homograph.

struct HomographDicInfoRec {
   DCMDictionaryID dictionaryID;
   DCMUniqueID uniqueID;
};
typedef struct HomographDicInfoRec HomographDicInfoRec;

Availability
Declared In
LanguageAnalysis.h

HomographWeight

Defines a data type for a homographic weighting value.

typedef UInt16 HomographWeight;

Availability
Declared In
LanguageAnalysis.h

JapanesePartOfSpeech

Defines a data type for a Japanese part of speech.

typedef MorphemePartOfSpeech JapanesePartOfSpeech;

Availability
Declared In
LanguageAnalysis.h

LAContextRef

A reference to an opaque language analysis context.

typedef struct OpaqueLAContextRef * LAContextRef;

Availability
Declared In
LanguageAnalysis.h

LAEnvironmentRef

A reference to an opaque language analysis environment structure.

typedef struct OpaqueLAEnvironmentRef * LAEnvironmentRef;

Availability
Declared In
LanguageAnalysis.h

LAHomograph

Defines a data type for a homograph node.

typedef AERecord LAHomograph;

Discussion

The Apple event record (AERecord) is the data type upon which many Language Analysis Manager data types are based. A homograph node is the minimum unit of analysis and represents an individual word. Typically a homograph node corresponds to one word obtained from the dictionary.

A homograph node includes the character string that represents this word, but its content varies according to the type of analysis specified by the analysis environment. Depending on the type of environment, additional information about the word may be included.

Availability
Declared In
LanguageAnalysis.h

LAMorpheme

Defines a data type for a morpheme node.

typedef AERecord LAMorpheme;

Discussion

The Apple event record (AERecord) is the data type upon which many Language Analysis Manager data types are based. A morpheme node represents a word with a specific part of speech within a particular character string of the text. Its attributes are the corresponding character-string range within the text, the part of speech, and the homograph nodes.

Availability
Declared In
LanguageAnalysis.h

LAMorphemeBundle

Defines a data type for a morpheme bundle.

typedef AERecord LAMorphemeBundle;

Discussion

The Apple event record (AERecord) is the data type upon which many Language Analysis Manager data types are based. A morpheme bundle is a collection of different solutions to the morpheme analysis of one character string. Two solutions are "different" when they have different morpheme delimiters, or the same morpheme delimiters but different parts of speech. A morpheme bundle holds each of these solutions in the form of a morpheme path. Morpheme bundles normally hold multiple paths, ordered from most likely to least likely.

Within morpheme bundles, morpheme paths do not directly include morpheme nodes. Morpheme bundles have a list of morpheme nodes as one of their attributes distinct from the morpheme path, and morpheme paths have an index to that list. In this way, it is possible to share a morpheme node from one or more paths by indirectly indicating the morpheme node. In most cases, multiple paths within one bundle resemble one another to some extent, and multiple paths may be deemed to have the same morpheme node. One morpheme node may include many homograph nodes, making it bigger, so a mechanism such as this which allows sharing is important in maintaining a small data size.
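
The sharing scheme described above can be modeled with plain C structures. This is only an illustrative sketch: the real nodes are AERecords, and every type and function name below (ModelMorpheme, ModelPath, ModelBundle, path_morpheme) is hypothetical. It shows paths that store indexes into the bundle's node list, so that two paths can refer to the same morpheme node without copying it.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified, hypothetical stand-in for a morpheme node. */
typedef struct {
    unsigned offset;        /* start of the morpheme in the source text */
    unsigned length;        /* length of the morpheme */
    unsigned partOfSpeech;  /* part-of-speech code */
} ModelMorpheme;

/* A path holds indexes into the bundle's node list, not the nodes. */
typedef struct {
    const size_t *nodeIndexes;
    size_t nodeCount;
} ModelPath;

/* A bundle owns one shared node list plus the candidate paths. */
typedef struct {
    const ModelMorpheme *nodes;
    size_t nodeCount;
    const ModelPath *paths;   /* candidate solutions, most likely first */
    size_t pathCount;
} ModelBundle;

/* Resolve the k-th morpheme of a path through the shared node list. */
static const ModelMorpheme *path_morpheme(const ModelBundle *bundle,
                                          size_t path, size_t k)
{
    assert(path < bundle->pathCount);
    assert(k < bundle->paths[path].nodeCount);
    size_t index = bundle->paths[path].nodeIndexes[k];
    assert(index < bundle->nodeCount);
    return &bundle->nodes[index];
}
```

In this model, two paths that agree on their first morpheme simply store the same index, so a large node (one with many homographs) is stored once per bundle rather than once per path.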

Availability
Declared In
LanguageAnalysis.h

LAMorphemePath

Defines a data type for a morpheme path.

typedef AERecord LAMorphemePath;

Discussion

The Apple event record (AERecord) is the data type upon which many Language Analysis Manager data types are based. A morpheme path defines a single solution for the analysis of a morpheme. The path has an individual morpheme delimiter and part of speech.

Morpheme paths come in two variations that differ in how they hold their lower-level morpheme nodes, and in some cases the two are used for different purposes. One is the morpheme path within a morpheme bundle, which does not directly include morpheme nodes.

The other is the morpheme path used on its own, for which it is more convenient to be self-contained. If an application modifies a morpheme node, that node must not be shared. Therefore, a standalone morpheme path includes its morpheme nodes directly.

Availability
Declared In
LanguageAnalysis.h

LAMorphemeRec

Contains the results of the analysis of one morpheme.

struct LAMorphemeRec {
   ByteCount sourceTextLength;
   LogicalAddress sourceTextPtr;
   ByteCount morphemeTextLength;
   LogicalAddress morphemeTextPtr;
   UInt32 partOfSpeech;
};
typedef struct LAMorphemeRec LAMorphemeRec;

Fields
sourceTextLength

The length of the source text for this morpheme.

sourceTextPtr

A pointer to the source text.

morphemeTextLength

The length of the result text for this morpheme.

morphemeTextPtr

A pointer to the result text.

partOfSpeech

The part of speech of this morpheme.

Discussion

This structure is an entry in the LAMorphemesArray data structure.

Availability
Declared In
LanguageAnalysis.h

LAMorphemesArray

Contains the results of high-level morphological analysis.

struct LAMorphemesArray {
   ItemCount morphemesCount;
   ByteCount processedTextLength;
   ByteCount morphemesTextLength;
   LAMorphemeRec morphemes[1];
};
typedef struct LAMorphemesArray LAMorphemesArray;
typedef LAMorphemesArray * LAMorphemesArrayPtr;

Fields
morphemesCount

The number of morphemes included.

processedTextLength

The length of the source text that was processed.

morphemesTextLength

The overall length of the result string.

morphemes

An array of morpheme records.

Discussion

When you perform high-level analysis, you can analyze stream-format text and obtain the results as an array of morpheme information.

Availability
Declared In
LanguageAnalysis.h
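
Because morphemes is declared with one element but holds morphemesCount records, code that walks the array usually indexes past the declared bound. The following is a minimal sketch of that pattern, not code from the header. The base typedefs are stand-ins assumed so the example compiles without the system headers (the real widths may differ), and the helper names morphemes_array_size and count_rough_class are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* ASSUMED stand-ins for the Carbon base types. */
typedef unsigned long ItemCount;
typedef unsigned long ByteCount;
typedef void *LogicalAddress;
typedef uint32_t UInt32;

/* Structures copied from the listings above. */
typedef struct LAMorphemeRec {
    ByteCount sourceTextLength;
    LogicalAddress sourceTextPtr;
    ByteCount morphemeTextLength;
    LogicalAddress morphemeTextPtr;
    UInt32 partOfSpeech;
} LAMorphemeRec;

typedef struct LAMorphemesArray {
    ItemCount morphemesCount;
    ByteCount processedTextLength;
    ByteCount morphemesTextLength;
    LAMorphemeRec morphemes[1];   /* really morphemesCount entries */
} LAMorphemesArray;

/* Hypothetical helper: bytes needed for an array of `count` records.
   The struct already contains one record, hence (count - 1). */
static size_t morphemes_array_size(ItemCount count)
{
    assert(count >= 1);
    return sizeof(LAMorphemesArray) + (count - 1) * sizeof(LAMorphemeRec);
}

/* Hypothetical helper: count the records whose part of speech falls in
   the given rough class (the 0x0000F000 bits of the code). */
static ItemCount count_rough_class(const LAMorphemesArray *array,
                                   UInt32 roughClass)
{
    const LAMorphemeRec *rec = array->morphemes;
    ItemCount matches = 0;
    for (ItemCount i = 0; i < array->morphemesCount; i++) {
        if ((rec[i].partOfSpeech & 0x0000F000u) == roughClass)
            matches++;
    }
    return matches;
}
```

A caller would typically allocate the buffer with the size helper, hand it to the analysis call, and then walk the records as count_rough_class does.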

LAPropertyKey

Defines a data type for a language analysis property key.

typedef AEKeyword LAPropertyKey;

Availability
Declared In
LanguageAnalysis.h

LAPropertyType

Defines a data type for a language analysis property type.

typedef DescType LAPropertyType;

Availability
Declared In
LanguageAnalysis.h

MorphemePartOfSpeech

Defines a data type for a morpheme part of speech.

typedef UInt32 MorphemePartOfSpeech;

Availability
Declared In
LanguageAnalysis.h

MorphemeTextRange

Contains a range of text associated with a morpheme.

struct MorphemeTextRange {
   UInt32 sourceOffset;
   UInt32 length;
};
typedef struct MorphemeTextRange MorphemeTextRange;

Availability
Declared In
LanguageAnalysis.h
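
A text range is typically used to pull the morpheme's text back out of the source buffer. The sketch below shows one way to do that with bounds checking. It ASSUMES that sourceOffset and length are byte counts into the source text, which this reference does not state; the UInt32 typedef is a stand-in and copy_range is a hypothetical helper.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef uint32_t UInt32;   /* ASSUMED stand-in for the Carbon type */

/* Structure copied from the listing above. */
typedef struct MorphemeTextRange {
    UInt32 sourceOffset;
    UInt32 length;
} MorphemeTextRange;

/* Hypothetical helper: copy the bytes covered by `range` from `source`
   into `out` and NUL-terminate them. Returns 0 on success, or -1 if the
   range lies outside the source or does not fit in `out`.
   ASSUMES the range is expressed in bytes. */
static int copy_range(const char *source, size_t sourceLen,
                      MorphemeTextRange range, char *out, size_t outSize)
{
    assert(source != NULL && out != NULL);
    if (range.sourceOffset > sourceLen ||
        range.length > sourceLen - range.sourceOffset)
        return -1;                      /* range runs past the source */
    if (range.length >= outSize)
        return -1;                      /* no room for text plus NUL */
    memcpy(out, source + range.sourceOffset, range.length);
    out[range.length] = '\0';
    return 0;
}
```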

Constants

File Creator Constants

Specify file creator for dictionary of Apple Japanese access methods.

enum {
   kAppleJapaneseDictionarySignature = 'jlan'
};

Analysis Engine Keywords

Specify analysis engine keywords for morpheme/homograph information.

enum {
   keyAEHomographDicInfo = 'lahd',
   keyAEHomographWeight = 'lahw',
   keyAEHomographAccent = 'laha'
};
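
The keyword values in this and the following listings are four-character codes. Multi-character constants such as 'lahd' are implementation-defined in standard C, so portable code sometimes builds the value explicitly. The sketch below assumes the usual Mac OS convention that the first character occupies the most significant byte; the macro and helper names are hypothetical and are not part of LanguageAnalysis.h.

```c
#include <assert.h>
#include <stdint.h>

/* Build a four-character code with the first character in the most
   significant byte. */
#define FOUR_CHAR_CODE_OF(a, b, c, d)                 \
    (((uint32_t)(unsigned char)(a) << 24) |           \
     ((uint32_t)(unsigned char)(b) << 16) |           \
     ((uint32_t)(unsigned char)(c) << 8)  |           \
      (uint32_t)(unsigned char)(d))

/* Hypothetical helper: write the code as a NUL-terminated string,
   which is handy when logging an unexpected keyword. */
static void four_char_code_to_string(uint32_t code, char out[5])
{
    out[0] = (char)((code >> 24) & 0xFF);
    out[1] = (char)((code >> 16) & 0xFF);
    out[2] = (char)((code >> 8) & 0xFF);
    out[3] = (char)(code & 0xFF);
    out[4] = '\0';
}
```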

Analysis Results Constants

Specify the nodes associated with analysis results.

enum {
   keyAELAMorphemeBundle = 'lmfb',
   keyAELAMorphemePath = 'lmfp',
   keyAELAMorpheme = 'lmfn',
   keyAELAHomograph = 'lmfh'
};

Morpheme Key Values

Specify key values used for morpheme/homograph information.

enum {
   keyAEMorphemePartOfSpeechCode = 'lamc',
   keyAEMorphemeTextRange = 'lamt'
};

All Morphemes Constant

Specifies to use all morphemes.

enum {
   kLAAllMorphemes = 0
};

Leading and Trailing Constants

Specify constraints to apply to a string.

enum {
   kLADefaultEdge = 0,
   kLAFreeEdge = 1,
   kLAIncompleteEdge = 2
};

Converting Mask

Defines a mask for high-level API conversion flags.

enum {
   kLAEndOfSourceTextMask = 0x00000001
};

Morphemes Array Version

Specifies the version of the array used to hold morpheme analysis results.

enum {
   kLAMorphemesArrayVersion = 0
};

Conjugation Constants

Specify Japanese conjugations.

enum {
   kLASpeechKatsuyouGokan = 0x00000001,
   kLASpeechKatsuyouMizen = 0x00000002,
   kLASpeechKatsuyouRenyou = 0x00000003,
   kLASpeechKatsuyouSyuushi = 0x00000004,
   kLASpeechKatsuyouRentai = 0x00000005,
   kLASpeechKatsuyouKatei = 0x00000006,
   kLASpeechKatsuyouMeirei = 0x00000007
};

Parts of Speech Constants

Specify Japanese parts of speech.

enum {
   kLASpeechMeishi = 0x00000000,
   kLASpeechFutsuuMeishi = 0x00000000,
   kLASpeechJinmei = 0x00000100,
   kLASpeechJinmeiSei = 0x00000110,
   kLASpeechJinmeiMei = 0x00000120,
   kLASpeechChimei = 0x00000200,
   kLASpeechSetsubiChimei = 0x00000210,
   kLASpeechSoshikimei = 0x00000300,
   kLASpeechKoyuuMeishi = 0x00000400,
   kLASpeechSahenMeishi = 0x00000500,
   kLASpeechKeidouMeishi = 0x00000600,
   kLASpeechRentaishi = 0x00001000,
   kLASpeechFukushi = 0x00002000,
   kLASpeechSetsuzokushi = 0x00003000,
   kLASpeechKandoushi = 0x00004000,
   kLASpeechDoushi = 0x00005000,
   kLASpeechGodanDoushi = 0x00005000,
   kLASpeechKagyouGodan = 0x00005000,
   kLASpeechSagyouGodan = 0x00005010,
   kLASpeechTagyouGodan = 0x00005020,
   kLASpeechNagyouGodan = 0x00005030,
   kLASpeechMagyouGodan = 0x00005040,
   kLASpeechRagyouGodan = 0x00005050,
   kLASpeechWagyouGodan = 0x00005060,
   kLASpeechGagyouGodan = 0x00005070,
   kLASpeechBagyouGodan = 0x00005080,
   kLASpeechIchidanDoushi = 0x00005100,
   kLASpeechKahenDoushi = 0x00005200,
   kLASpeechSahenDoushi = 0x00005300,
   kLASpeechZahenDoushi = 0x00005400,
   kLASpeechKeiyoushi = 0x00006000,
   kLASpeechKeiyoudoushi = 0x00007000,
   kLASpeechSettougo = 0x00008000,
   kLASpeechSuujiSettougo = 0x00008100,
   kLASpeechSetsubigo = 0x00009000,
   kLASpeechJinmeiSetsubigo = 0x00009100,
   kLASpeechChimeiSetsubigo = 0x00009200,
   kLASpeechSoshikimeiSetsubigo = 0x00009300,
   kLASpeechSuujiSetsubigo = 0x00009400,
   kLASpeechMuhinshi = 0x0000A000,
   kLASpeechTankanji = 0x0000A000,
   kLASpeechKigou = 0x0000A100,
   kLASpeechKuten = 0x0000A110,
   kLASpeechTouten = 0x0000A120,
   kLASpeechSuushi = 0x0000A200,
   kLASpeechDokuritsugo = 0x0000A300,
   kLASpeechSeiku = 0x0000A400,
   kLASpeechJodoushi = 0x0000B000,
   kLASpeechJoshi = 0x0000C000
};

Parts of Speech Masks

Specify masks for parts of speech.

enum {
   kLASpeechRoughClassMask = 0x0000F000,
   kLASpeechMediumClassMask = 0x0000FF00,
   kLASpeechStrictClassMask = 0x0000FFF0,
   kLASpeechKatsuyouMask = 0x0000000F
};
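
The mask values suggest that a part-of-speech code is hierarchical: the 0xF000 bits select the broad class, each further nibble narrows it, and the lowest nibble holds the conjugation. The following sketch illustrates that reading. It assumes that a conjugation constant is combined with a part-of-speech constant in the same 32-bit code, which the masks imply but this reference does not state. The constants are local copies from the listings above, and the helper functions are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Local copies of a few constants from the listings above, so the
   sketch compiles without LanguageAnalysis.h. */
enum {
    kLASpeechDoushi          = 0x00005000,
    kLASpeechSagyouGodan     = 0x00005010,
    kLASpeechIchidanDoushi   = 0x00005100,
    kLASpeechKeiyoushi       = 0x00006000,
    kLASpeechKatsuyouRenyou  = 0x00000003,
    kLASpeechRoughClassMask  = 0x0000F000,
    kLASpeechMediumClassMask = 0x0000FF00,
    kLASpeechStrictClassMask = 0x0000FFF0,
    kLASpeechKatsuyouMask    = 0x0000000F
};

/* Hypothetical helpers: each keeps only the bits selected by one mask. */
static uint32_t rough_class(uint32_t pos)
{
    return pos & (uint32_t)kLASpeechRoughClassMask;
}

static uint32_t medium_class(uint32_t pos)
{
    return pos & (uint32_t)kLASpeechMediumClassMask;
}

static uint32_t strict_class(uint32_t pos)
{
    return pos & (uint32_t)kLASpeechStrictClassMask;
}

static uint32_t katsuyou(uint32_t pos)
{
    return pos & (uint32_t)kLASpeechKatsuyouMask;
}

/* True when the code belongs to the verb (doushi) family at the
   coarsest level of classification. */
static int is_verb(uint32_t pos)
{
    return rough_class(pos) == (uint32_t)kLASpeechDoushi;
}
```

Under these assumptions, a code of 0x00005013 would classify as a verb at the rough level, as kLASpeechSagyouGodan at the strict level, and as kLASpeechKatsuyouRenyou in the conjugation nibble.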

Engine Limitations

Specify language analysis engine limitations.

enum {
   kMaxInputLengthOfAppleJapaneseEngine = 200
};

Analysis Engine Type Definitions

Specify language analysis engine type definitions for morpheme/homograph information.

enum {
   typeAEHomographDicInfo = 'lahd',
   typeAEHomographWeight = typeShortInteger,
   typeAEHomographAccent = 'laha'
};

Morpheme Types

Specify data types for morphemes.

enum {
   typeAEMorphemePartOfSpeechCode = 'lamc',
   typeAEMorphemeTextRange = 'lamt'
};

Morpheme Type Analysis Constants

Specify types used in morphological analysis.

enum {
   typeLAMorphemeBundle = typeAERecord,
   typeLAMorphemePath = typeAERecord,
   typeLAMorpheme = typeAEList,
   typeLAHomograph = typeAEList
};

Default Environment Names

Specify names for default environments for Japanese analysis.

#define kLAJapaneseKanaKanjiEnvironment "\pKanaKanjiConversion"
#define kLAJapaneseMorphemeAnalysisEnvironment "\pJapaneseMorphemeAnalysis"
#define kLAJapaneseTTSEnvironment "\pJapaneseTextToSpeech"

Result Codes

The most common result codes returned by the Language Analysis Manager are listed in the table below.

Result Code    Value    Description
laTooSmallBufferErr -6984

The output buffer is too small to store any result.

Available in Mac OS X v10.0 and later.

laEnvironmentBusyErr -6985

The specified environment is in use.

Available in Mac OS X v10.0 and later.

laEnvironmentNotFoundErr -6986

The specified environment can’t be found.

Available in Mac OS X v10.0 and later.

laEnvironmentExistErr -6987

An environment by the same name already exists.

Available in Mac OS X v10.0 and later.

laInvalidPathErr -6988

The path is not correct.

Available in Mac OS X v10.0 and later.

laNoMoreMorphemeErr -6989

There is nothing to read.

Available in Mac OS X v10.0 and later.

laFailAnalysisErr -6990

The analysis failed.

Available in Mac OS X v10.0 and later.

laTextOverFlowErr -6991

The text is too long.

Available in Mac OS X v10.0 and later.

laDictionaryNotOpenedErr -6992

The dictionary is not opened.

Available in Mac OS X v10.0 and later.

laDictionaryUnknownErr -6993

This dictionary can’t be used with this environment.

Available in Mac OS X v10.0 and later.

laDictionaryTooManyErr -6994

There are too many dictionaries.

Available in Mac OS X v10.0 and later.

laPropertyValueErr -6995

Invalid property value.

Available in Mac OS X v10.0 and later.

laPropertyUnknownErr -6996

The property is unknown to this environment.

Available in Mac OS X v10.0 and later.

laPropertyIsReadOnlyErr -6997

The property is read only.

Available in Mac OS X v10.0 and later.

laPropertyNotFoundErr -6998

The property can’t be found.

Available in Mac OS X v10.0 and later.

laPropertyErr -6999

There is an error in the property.

Available in Mac OS X v10.0 and later.

laEngineNotFoundErr -7000

The engine can’t be found.

Available in Mac OS X v10.0 and later.
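
When logging failures it can help to translate a numeric result code back to its symbolic name. The lookup below is a small sketch built only from the table above. The enum is a local copy of the listed values so the example compiles on its own, and la_result_name is a hypothetical helper, not part of the Language Analysis Manager.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Local copies of the result codes listed in the table above. */
enum {
    laTooSmallBufferErr      = -6984,
    laEnvironmentBusyErr     = -6985,
    laEnvironmentNotFoundErr = -6986,
    laEnvironmentExistErr    = -6987,
    laInvalidPathErr         = -6988,
    laNoMoreMorphemeErr      = -6989,
    laFailAnalysisErr        = -6990,
    laTextOverFlowErr        = -6991,
    laDictionaryNotOpenedErr = -6992,
    laDictionaryUnknownErr   = -6993,
    laDictionaryTooManyErr   = -6994,
    laPropertyValueErr       = -6995,
    laPropertyUnknownErr     = -6996,
    laPropertyIsReadOnlyErr  = -6997,
    laPropertyNotFoundErr    = -6998,
    laPropertyErr            = -6999,
    laEngineNotFoundErr      = -7000
};

#define LA_NAME_ENTRY(code) { code, #code }

static const struct { int code; const char *name; } kLAResultNames[] = {
    LA_NAME_ENTRY(laTooSmallBufferErr),
    LA_NAME_ENTRY(laEnvironmentBusyErr),
    LA_NAME_ENTRY(laEnvironmentNotFoundErr),
    LA_NAME_ENTRY(laEnvironmentExistErr),
    LA_NAME_ENTRY(laInvalidPathErr),
    LA_NAME_ENTRY(laNoMoreMorphemeErr),
    LA_NAME_ENTRY(laFailAnalysisErr),
    LA_NAME_ENTRY(laTextOverFlowErr),
    LA_NAME_ENTRY(laDictionaryNotOpenedErr),
    LA_NAME_ENTRY(laDictionaryUnknownErr),
    LA_NAME_ENTRY(laDictionaryTooManyErr),
    LA_NAME_ENTRY(laPropertyValueErr),
    LA_NAME_ENTRY(laPropertyUnknownErr),
    LA_NAME_ENTRY(laPropertyIsReadOnlyErr),
    LA_NAME_ENTRY(laPropertyNotFoundErr),
    LA_NAME_ENTRY(laPropertyErr),
    LA_NAME_ENTRY(laEngineNotFoundErr)
};

/* Hypothetical helper: return the symbolic name for a result code, or
   "unknown" if the code is not in the table above. */
static const char *la_result_name(int code)
{
    size_t count = sizeof kLAResultNames / sizeof kLAResultNames[0];
    for (size_t i = 0; i < count; i++) {
        if (kLAResultNames[i].code == code)
            return kLAResultNames[i].name;
    }
    return "unknown";
}
```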





© 2003 Apple Computer, Inc. All Rights Reserved. (Last updated: 2003-04-01)

