Language Analysis Manager Reference

Framework	ApplicationServices/ApplicationServices.h
Declared in	LanguageAnalysis.h

Overview

The Language Analysis Manager application programming interface (API) is a shared library designed to analyze morphemes in text. It is a general-purpose API that is independent of any particular language, morpheme analysis algorithm, or application. The Language Analysis Manager is not a framework for creating internationally aware applications. To make your applications work correctly with various languages, use APIs such as the Script Manager and Text Utilities.

The Language Analysis Manager (LAM) provides your application with morphological analysis capability, and is designed to work with a language analysis engine. Using the Language Analysis Manager, your application can manage an analysis engine and create environments and contexts in which morpheme analysis can occur. This version of the Language Analysis Manager works only with a Japanese analysis engine.

Functions by Task

Getting The Library Version

Handling Environments

Opening and Closing Contexts

Managing Dictionaries

Analyzing Text

Data Types

HomographAccent

Defines a data type for a homographic accent.

typedef UInt8 HomographAccent;

Availability

Available in Mac OS X v10.0 and later.
Not available to 64-bit applications.

Declared In

LanguageAnalysis.h

HomographDicInfoRec

Contains dictionary information for a homograph.

struct HomographDicInfoRec {
   DCMDictionaryID dictionaryID;
   DCMUniqueID uniqueID;
};
typedef struct HomographDicInfoRec HomographDicInfoRec;

Availability

Available in Mac OS X v10.0 and later.
Not available to 64-bit applications.

Declared In

LanguageAnalysis.h

HomographWeight

Defines a data type for a homographic weighting value.

typedef UInt16 HomographWeight;

Availability

Available in Mac OS X v10.0 and later.
Not available to 64-bit applications.

Declared In

LanguageAnalysis.h

JapanesePartOfSpeech

Defines a data type for a Japanese part of speech.

typedef MorphemePartOfSpeech JapanesePartOfSpeech;

Availability

Available in Mac OS X v10.0 and later.
Not available to 64-bit applications.

Declared In

LanguageAnalysis.h

LAContextRef

A reference to an opaque language analysis context.

typedef struct OpaqueLAContextRef * LAContextRef;

Availability

Available in Mac OS X v10.0 and later.
Not available to 64-bit applications.

Declared In

LanguageAnalysis.h

LAEnvironmentRef

A reference to an opaque language analysis environment structure.

typedef struct OpaqueLAEnvironmentRef * LAEnvironmentRef;

Availability

Available in Mac OS X v10.0 and later.
Not available to 64-bit applications.

Declared In

LanguageAnalysis.h

LAHomograph

Defines a data type for a homograph node.

typedef AERecord LAHomograph;

Discussion

The Apple event record (AERecord) is the data type upon which many Language Analysis Manager data types are based. A homograph node is the minimum unit of analysis and is representative of an individual language. Typically a homograph node corresponds to one word obtained from the dictionary.

Homograph nodes include the character string which represents this language, but the content varies according to the type of analysis stipulated in the analysis environment. Depending on the type of environment, additional information may be included for a specific language.

Availability

Available in Mac OS X v10.0 and later.
Not available to 64-bit applications.

Declared In

LanguageAnalysis.h

LAMorpheme

Defines a data type for a morpheme node.

typedef AERecord LAMorpheme;

Discussion

The Apple event record (AERecord) is the data type upon which many Language Analysis Manager data types are based. A morpheme node represents the language of a specific part of speech for a particular text string. Its attributes are the corresponding character range within the text, the part of speech, and the homograph nodes.

Availability

Available in Mac OS X v10.0 and later.
Not available to 64-bit applications.

Declared In

LanguageAnalysis.h

LAMorphemeBundle

Defines a data type for a morpheme bundle.

typedef AERecord LAMorphemeBundle;

Discussion

The Apple event record (AERecord) is the data type upon which many Language Analysis Manager data types are based. A morpheme bundle is a collection of different solutions to the morpheme analysis of one character string. Two solutions are "different" if they have different morpheme delimiters, or if they have the same morpheme delimiters but different parts of speech. A morpheme bundle holds each of these solutions in the form of a morpheme path. Morpheme bundles normally hold multiple paths, ordered from most likely to least likely.

Within morpheme bundles, morpheme paths do not directly include morpheme nodes. Morpheme bundles have a list of morpheme nodes as one of their attributes distinct from the morpheme path, and morpheme paths have an index to that list. In this way, it is possible to share a morpheme node from one or more paths by indirectly indicating the morpheme node. In most cases, multiple paths within one bundle resemble one another to some extent, and multiple paths may be deemed to have the same morpheme node. One morpheme node may include many homograph nodes, making it bigger, so a mechanism such as this which allows sharing is important in maintaining a small data size.
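The index-based sharing described above can be modeled in plain C. This is only an illustrative sketch: real bundles, paths, and nodes are AERecords, and the Node, Path, and Bundle structures and the bundle_node_for helper below are hypothetical names that are not part of LanguageAnalysis.h.

```c
#include <stddef.h>

/* Hypothetical plain-C model of a bundle. The real types are AERecords. */
typedef struct {
    unsigned offset;        /* start of the morpheme in the source text */
    unsigned length;        /* length of the morpheme */
    unsigned partOfSpeech;  /* part-of-speech code */
} Node;

typedef struct {
    const size_t *nodeIndexes;  /* indexes into the bundle's node list */
    size_t count;               /* number of morphemes in this path */
} Path;

typedef struct {
    const Node *nodes;   /* node list owned by the bundle, shared by all paths */
    size_t nodeCount;
    const Path *paths;   /* candidate solutions */
    size_t pathCount;
} Bundle;

/* Resolve the node at position pos of the given path through the shared list.
   Returns NULL if any index is out of range. */
static const Node *bundle_node_for(const Bundle *b, size_t path, size_t pos) {
    if (path >= b->pathCount) return NULL;
    const Path *p = &b->paths[path];
    if (pos >= p->count) return NULL;
    size_t idx = p->nodeIndexes[pos];
    return idx < b->nodeCount ? &b->nodes[idx] : NULL;
}
```

Because a path stores only indexes, two paths that agree on a morpheme resolve to the same node in the bundle's list, so that node (and any homograph nodes it contains) is stored once.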

Availability

Available in Mac OS X v10.0 and later.
Not available to 64-bit applications.

Declared In

LanguageAnalysis.h

LAMorphemePath

Defines a data type for a morpheme path.

typedef AERecord LAMorphemePath;

Discussion

The Apple event record (AERecord) is the data type upon which many Language Analysis Manager data types are based. A morpheme path defines a single solution for the analysis of a morpheme. Each path has its own morpheme delimiters and parts of speech.

There are two variations of morpheme path, which differ in how they hold their lower-level morpheme nodes and which in some cases are used for different purposes. One is the morpheme path within a morpheme bundle, where the path does not directly include morpheme nodes.

The other is the morpheme path that can be used alone. In this case it is more convenient for the path to be self-contained. If an application modifies a morpheme node, that node must not be shared. Therefore, a standalone morpheme path directly includes its morpheme nodes.

Availability

Available in Mac OS X v10.0 and later.
Not available to 64-bit applications.

Declared In

LanguageAnalysis.h

LAMorphemeRec

Contains the results of the analysis of one morpheme.

struct LAMorphemeRec {
   ByteCount sourceTextLength;
   LogicalAddress sourceTextPtr;
   ByteCount morphemeTextLength;
   LogicalAddress morphemeTextPtr;
   UInt32 partOfSpeech;
};
typedef struct LAMorphemeRec LAMorphemeRec;

Fields

sourceTextLength: The length of the source text for this morpheme.
sourceTextPtr: A pointer to the source text.
morphemeTextLength: The length of the result text for this morpheme.
morphemeTextPtr: A pointer to the result text.
partOfSpeech: The part of speech of this morpheme.

Discussion

This structure is an entry in the LAMorphemesArray data structure.

Availability

Available in Mac OS X v10.0 and later.
Not available to 64-bit applications.

Declared In

LanguageAnalysis.h

LAMorphemesArray

Contains the results of high-level morphological analysis.

struct LAMorphemesArray {
   ItemCount morphemesCount;
   ByteCount processedTextLength;
   ByteCount morphemesTextLength;
   LAMorphemeRec morphemes[1];
};
typedef struct LAMorphemesArray LAMorphemesArray;
typedef LAMorphemesArray * LAMorphemesArrayPtr;

Fields

morphemesCount: The number of morphemes included.
processedTextLength: The processed source character length.
morphemesTextLength: The overall length of the result string.
morphemes: An array of morpheme records.

Discussion

When you perform high-level analysis, you can analyze stream-format text and obtain the results as an array of morpheme information.
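The morphemes field is declared with one element, the common C idiom for a variable-length trailing array, so a buffer for n records needs more room than sizeof(LAMorphemesArray) alone. The following is a minimal sketch of sizing and walking such a buffer. The stand-in typedefs replace the Carbon types so that the example compiles on its own, and the two helper functions are hypothetical, not part of the API.

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-in typedefs (assumptions); the real ones come from the Carbon headers. */
typedef unsigned long ItemCount;
typedef unsigned long ByteCount;
typedef void *LogicalAddress;

typedef struct LAMorphemeRec {
    ByteCount sourceTextLength;
    LogicalAddress sourceTextPtr;
    ByteCount morphemeTextLength;
    LogicalAddress morphemeTextPtr;
    uint32_t partOfSpeech;
} LAMorphemeRec;

typedef struct LAMorphemesArray {
    ItemCount morphemesCount;
    ByteCount processedTextLength;
    ByteCount morphemesTextLength;
    LAMorphemeRec morphemes[1];   /* really morphemesCount entries */
} LAMorphemesArray;

/* Hypothetical helper: bytes needed for an array holding n records. */
static size_t morphemes_array_size(ItemCount n) {
    return offsetof(LAMorphemesArray, morphemes) + n * sizeof(LAMorphemeRec);
}

/* Hypothetical helper: add up the result-text length of every record. */
static ByteCount total_morpheme_text_length(const LAMorphemesArray *a) {
    const LAMorphemeRec *rec = a->morphemes;
    ByteCount total = 0;
    for (ItemCount i = 0; i < a->morphemesCount; i++) {
        total += rec[i].morphemeTextLength;
    }
    return total;
}
```

A caller would allocate morphemes_array_size(n) bytes, hand the buffer to the analysis call, and then loop over morphemesCount records as shown.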

Availability

Available in Mac OS X v10.0 and later.
Not available to 64-bit applications.

Declared In

LanguageAnalysis.h

LAPropertyKey

Defines a data type for a language analysis property key.

typedef AEKeyword LAPropertyKey;

Availability

Available in Mac OS X v10.0 and later.
Not available to 64-bit applications.

Declared In

LanguageAnalysis.h

LAPropertyType

Defines a data type for a language analysis property type.

typedef DescType LAPropertyType;

Availability

Available in Mac OS X v10.0 and later.
Not available to 64-bit applications.

Declared In

LanguageAnalysis.h

MorphemePartOfSpeech

Defines a data type for a morpheme part of speech.

typedef UInt32 MorphemePartOfSpeech;

Availability

Available in Mac OS X v10.0 and later.
Not available to 64-bit applications.

Declared In

LanguageAnalysis.h

MorphemeTextRange

Contains a range of text associated with a morpheme.

struct MorphemeTextRange {
   UInt32 sourceOffset;
   UInt32 length;
};
typedef struct MorphemeTextRange MorphemeTextRange;
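As a rough illustration, a range can be used to copy the matching slice out of the source text. This sketch assumes that sourceOffset and length are byte counts into the source buffer, which this reference does not state. The range_copy helper is hypothetical and not part of the API.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Same layout as the declaration above, with uint32_t standing in for UInt32. */
typedef struct MorphemeTextRange {
    uint32_t sourceOffset;
    uint32_t length;
} MorphemeTextRange;

/* Hypothetical helper: copy the bytes covered by r into out and NUL-terminate.
   Assumes byte units. Returns false if the range falls outside the source
   text or out is too small. */
static bool range_copy(const char *src, size_t srcLen, MorphemeTextRange r,
                       char *out, size_t outSize) {
    if (r.sourceOffset > srcLen || r.length > srcLen - r.sourceOffset) {
        return false;
    }
    if (outSize < (size_t)r.length + 1) {
        return false;
    }
    memcpy(out, src + r.sourceOffset, r.length);
    out[r.length] = '\0';
    return true;
}
```

Checking the range against the source length before copying keeps a malformed range from reading past the end of the text.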

Availability

Available in Mac OS X v10.0 and later.
Not available to 64-bit applications.

Declared In

LanguageAnalysis.h

Constants

File Creator Constants

Specify file creator for dictionary of Apple Japanese access methods.

enum {
   kAppleJapaneseDictionarySignature = 'jlan'
};

Analysis Engine Keywords

Specify analysis engine keywords for morpheme/homograph information.

enum {
   keyAEHomographDicInfo = 'lahd',
   keyAEHomographWeight = 'lahw',
   keyAEHomographAccent = 'laha'
};

Analysis Results Constants

Specify the nodes associated with analysis results.

enum {
   keyAELAMorphemeBundle = 'lmfb',
   keyAELAMorphemePath = 'lmfp',
   keyAELAMorpheme = 'lmfn',
   keyAELAHomograph = 'lmfh'
};

Morpheme Key Values

Specify key values used for morpheme/homograph information.

enum {
   keyAEMorphemePartOfSpeechCode = 'lamc',
   keyAEMorphemeTextRange = 'lamt'
};

All Morphemes Constant

Specifies to use all morphemes.

enum {
   kLAAllMorphemes = 0
};

Leading and Trailing Constants

Specify constraints to apply to a string.

enum {
   kLADefaultEdge = 0,
   kLAFreeEdge = 1,
   kLAIncompleteEdge = 2
};

Converting Mask

Defines a mask for high-level API conversion flags.

enum {
   kLAEndOfSourceTextMask = 0x00000001
};

Morphemes Array Version

Specifies the version of the array used to hold morpheme analysis results.

enum {
   kLAMorphemesArrayVersion = 0
};

Conjugation Constants

Specify Japanese conjugations.

enum {
   kLASpeechKatsuyouGokan = 0x00000001,
   kLASpeechKatsuyouMizen = 0x00000002,
   kLASpeechKatsuyouRenyou = 0x00000003,
   kLASpeechKatsuyouSyuushi = 0x00000004,
   kLASpeechKatsuyouRentai = 0x00000005,
   kLASpeechKatsuyouKatei = 0x00000006,
   kLASpeechKatsuyouMeirei = 0x00000007
};

Parts of Speech Constants

Specify Japanese parts of speech.

enum {
   kLASpeechMeishi = 0x00000000,
   kLASpeechFutsuuMeishi = 0x00000000,
   kLASpeechJinmei = 0x00000100,
   kLASpeechJinmeiSei = 0x00000110,
   kLASpeechJinmeiMei = 0x00000120,
   kLASpeechChimei = 0x00000200,
   kLASpeechSetsubiChimei = 0x00000210,
   kLASpeechSoshikimei = 0x00000300,
   kLASpeechKoyuuMeishi = 0x00000400,
   kLASpeechSahenMeishi = 0x00000500,
   kLASpeechKeidouMeishi = 0x00000600,
   kLASpeechRentaishi = 0x00001000,
   kLASpeechFukushi = 0x00002000,
   kLASpeechSetsuzokushi = 0x00003000,
   kLASpeechKandoushi = 0x00004000,
   kLASpeechDoushi = 0x00005000,
   kLASpeechGodanDoushi = 0x00005000,
   kLASpeechKagyouGodan = 0x00005000,
   kLASpeechSagyouGodan = 0x00005010,
   kLASpeechTagyouGodan = 0x00005020,
   kLASpeechNagyouGodan = 0x00005030,
   kLASpeechMagyouGodan = 0x00005040,
   kLASpeechRagyouGodan = 0x00005050,
   kLASpeechWagyouGodan = 0x00005060,
   kLASpeechGagyouGodan = 0x00005070,
   kLASpeechBagyouGodan = 0x00005080,
   kLASpeechIchidanDoushi = 0x00005100,
   kLASpeechKahenDoushi = 0x00005200,
   kLASpeechSahenDoushi = 0x00005300,
   kLASpeechZahenDoushi = 0x00005400,
   kLASpeechKeiyoushi = 0x00006000,
   kLASpeechKeiyoudoushi = 0x00007000,
   kLASpeechSettougo = 0x00008000,
   kLASpeechSuujiSettougo = 0x00008100,
   kLASpeechSetsubigo = 0x00009000,
   kLASpeechJinmeiSetsubigo = 0x00009100,
   kLASpeechChimeiSetsubigo = 0x00009200,
   kLASpeechSoshikimeiSetsubigo = 0x00009300,
   kLASpeechSuujiSetsubigo = 0x00009400,
   kLASpeechMuhinshi = 0x0000A000,
   kLASpeechTankanji = 0x0000A000,
   kLASpeechKigou = 0x0000A100,
   kLASpeechKuten = 0x0000A110,
   kLASpeechTouten = 0x0000A120,
   kLASpeechSuushi = 0x0000A200,
   kLASpeechDokuritsugo = 0x0000A300,
   kLASpeechSeiku = 0x0000A400,
   kLASpeechJodoushi = 0x0000B000,
   kLASpeechJoshi = 0x0000C000
};

Parts of Speech Masks

Specify masks for parts of speech.

enum {
   kLASpeechRoughClassMask = 0x0000F000,
   kLASpeechMediumClassMask = 0x0000FF00,
   kLASpeechStrictClassMask = 0x0000FFF0,
   kLASpeechKatsuyouMask = 0x0000000F
};
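The mask values suggest that a part-of-speech code is layered: the rough class sits in bits 12 to 15, finer subclasses in the next two nibbles, and the conjugation in the low four bits. The sketch below shows how the masks might be applied under that assumption. The constants are copied from the listings in this reference so that the example compiles without LanguageAnalysis.h, and the helper functions are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Values copied from the listings in this reference. */
enum {
    kLASpeechDoushi          = 0x00005000,
    kLASpeechSagyouGodan     = 0x00005010,
    kLASpeechIchidanDoushi   = 0x00005100,
    kLASpeechKatsuyouRenyou  = 0x00000003,
    kLASpeechRoughClassMask  = 0x0000F000,
    kLASpeechMediumClassMask = 0x0000FF00,
    kLASpeechStrictClassMask = 0x0000FFF0,
    kLASpeechKatsuyouMask    = 0x0000000F
};

typedef uint32_t MorphemePartOfSpeech;

/* Hypothetical helper: true for any code in the verb (doushi) family. */
static bool pos_is_verb(MorphemePartOfSpeech pos) {
    return (pos & kLASpeechRoughClassMask) == kLASpeechDoushi;
}

/* Hypothetical helper: true only for the ichidan verb subclass. */
static bool pos_is_ichidan_verb(MorphemePartOfSpeech pos) {
    return (pos & kLASpeechMediumClassMask) == kLASpeechIchidanDoushi;
}

/* Hypothetical helper: extract the conjugation held in the low four bits. */
static uint32_t pos_conjugation(MorphemePartOfSpeech pos) {
    return pos & kLASpeechKatsuyouMask;
}
```

Comparing the masked value against a class constant, rather than the whole code, lets a single test match every subclass and conjugation within that class.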

Engine Limitations

Specify language analysis engine limitations.

enum {
   kMaxInputLengthOfAppleJapaneseEngine = 200
};

Analysis Engine Type Definitions

Specify language analysis engine type definitions for morpheme/homograph information.

enum {
   typeAEHomographDicInfo = 'lahd',
   typeAEHomographWeight = typeShortInteger,
   typeAEHomographAccent = 'laha'
};

Morpheme Types

Specify data types for morphemes.

enum {
   typeAEMorphemePartOfSpeechCode = 'lamc',
   typeAEMorphemeTextRange = 'lamt'
};

Morpheme Type Analysis Constants

Specify types used in morphological analysis.

enum {
   typeLAMorphemeBundle = typeAERecord,
   typeLAMorphemePath = typeAERecord,
   typeLAMorpheme = typeAEList,
   typeLAHomograph = typeAEList
};

Default Environment Names

Specify names for default environments for Japanese analysis.

#define kLAJapaneseKanaKanjiEnvironment "\pKanaKanjiConversion"
#define kLAJapaneseMorphemeAnalysisEnvironment "\pJapaneseMorphemeAnalysis"
#define kLAJapaneseTTSEnvironment "\pJapaneseTextToSpeech"

Result Codes

The most common result codes returned by the Language Analysis Manager are listed in the table below.

Result Code	Value	Description
`laTooSmallBufferErr`	-6984	The output buffer is too small to store any result. Available in Mac OS X v10.0 and later.
`laEnvironmentBusyErr`	-6985	The specified environment is in use. Available in Mac OS X v10.0 and later.
`laEnvironmentNotFoundErr`	-6986	The specified environment can’t be found. Available in Mac OS X v10.0 and later.
`laEnvironmentExistErr`	-6987	An environment by the same name already exists. Available in Mac OS X v10.0 and later.
`laInvalidPathErr`	-6988	The path is not correct. Available in Mac OS X v10.0 and later.
`laNoMoreMorphemeErr`	-6989	There is nothing to read. Available in Mac OS X v10.0 and later.
`laFailAnalysisErr`	-6990	The analysis failed. Available in Mac OS X v10.0 and later.
`laTextOverFlowErr`	-6991	The text is too long. Available in Mac OS X v10.0 and later.
`laDictionaryNotOpenedErr`	-6992	The dictionary is not opened. Available in Mac OS X v10.0 and later.
`laDictionaryUnknownErr`	-6993	This dictionary can’t be used with this environment. Available in Mac OS X v10.0 and later.
`laDictionaryTooManyErr`	-6994	There are too many dictionaries. Available in Mac OS X v10.0 and later.
`laPropertyValueErr`	-6995	Invalid property value. Available in Mac OS X v10.0 and later.
`laPropertyUnknownErr`	-6996	The property is unknown to this environment. Available in Mac OS X v10.0 and later.
`laPropertyIsReadOnlyErr`	-6997	The property is read only. Available in Mac OS X v10.0 and later.
`laPropertyNotFoundErr`	-6998	The property can’t be found. Available in Mac OS X v10.0 and later.
`laPropertyErr`	-6999	There is an error in the property. Available in Mac OS X v10.0 and later.
`laEngineNotFoundErr`	-7000	The engine can’t be found. Available in Mac OS X v10.0 and later.
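
When logging failures, it can help to turn a result code back into its symbolic name. The following sketch uses the values from the table above. The la_result_name function is a hypothetical helper, not part of the API, and OSStatus is declared locally as a stand-in for the system type.

```c
#include <stdint.h>

typedef int32_t OSStatus;  /* stand-in for the Carbon type */

/* Hypothetical helper: map a Language Analysis Manager result code to its
   symbolic name, using the values listed in the table above. */
static const char *la_result_name(OSStatus code) {
    switch (code) {
        case -6984: return "laTooSmallBufferErr";
        case -6985: return "laEnvironmentBusyErr";
        case -6986: return "laEnvironmentNotFoundErr";
        case -6987: return "laEnvironmentExistErr";
        case -6988: return "laInvalidPathErr";
        case -6989: return "laNoMoreMorphemeErr";
        case -6990: return "laFailAnalysisErr";
        case -6991: return "laTextOverFlowErr";
        case -6992: return "laDictionaryNotOpenedErr";
        case -6993: return "laDictionaryUnknownErr";
        case -6994: return "laDictionaryTooManyErr";
        case -6995: return "laPropertyValueErr";
        case -6996: return "laPropertyUnknownErr";
        case -6997: return "laPropertyIsReadOnlyErr";
        case -6998: return "laPropertyNotFoundErr";
        case -6999: return "laPropertyErr";
        case -7000: return "laEngineNotFoundErr";
        default:    return "unknown";
    }
}
```

Codes outside the listed range, including success, fall through to "unknown", so callers should test for success before calling the helper.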
