dom(n) dom(n)
____________________________________________________________________________________________________________
NAME
dom - Create an in-memory DOM tree from XML
SYNOPSIS
package require tdom
dom method ?arg arg ...?
____________________________________________________________________________________________________________
DESCRIPTION
This command creates complete DOM trees in memory. In the usual case a string containing XML
information is parsed and converted into a DOM tree. method indicates a specific subcommand.
The valid methods are:
dom parse ?options? ?data?
Parses the XML information and builds the DOM tree in memory, providing a Tcl object command
for this DOM document object. Example:
dom parse $xml doc
$doc documentElement root
parses the XML in the variable xml, creates the DOM tree in memory, makes a reference to the
document object, visible in Tcl as a document object command, and assigns this new object name
to the variable doc. When doc gets freed, the DOM tree and the associated Tcl command objects
(document and all node objects) are freed automatically.
set document [dom parse $xml]
set root [$document documentElement]
parses the XML in the variable xml, creates the DOM tree in memory, makes a reference to the
document object, visible in Tcl as a document object command, and returns this new object
name, which is then stored in document. To free the underlying DOM tree and the associated
Tcl object commands (document + nodes + fragment nodes) the document object command has to be
explicitly deleted by:
$document delete
or
rename $document ""
The valid options are:
-simple
If -simple is specified, a simple but fast parser is used (it does not fully conform to the
XML recommendation). That should double parsing and DOM generation speed. The encoding of
the data is not transformed inside the parser. The simple parser does not respect any
encoding information in the XML declaration. It skips over the internal DTD subset and
ignores any information in it. Therefore it doesn't include defaulted attribute values
in the tree, even if the corresponding attribute declaration is in the internal subset.
It also doesn't expand internal or external entity references other than the predefined
entities and character references.
-html
If -html is specified, a fast HTML parser is used, which tries to parse even badly
formed HTML into a DOM tree.
-keepEmpties
If -keepEmpties is specified, text nodes which contain only whitespace will be part
of the resulting DOM tree. By default (-keepEmpties not given) those empty text
nodes are removed at parsing time.
-channel <channel-ID>
If -channel <channel-ID> is specified, the input to be parsed is read from the specified
channel. The encoding setting of the channel (via fconfigure -encoding) is
respected, i.e. the data read from the channel is converted to UTF-8 according to the
encoding settings before the data is parsed.
-baseurl <baseURI>
If -baseurl <baseURI> is specified, the baseURI is used as the base URI of the document.
External entities referenced in the document are resolved relative to this base
URI. This base URI is also stored within the DOM tree.
-feedbackAfter <#bytes>
If -feedbackAfter <#bytes> is specified, the Tcl command ::dom::domParseFeedback is
evaluated after parsing every #bytes. If you use this option, you have to create a Tcl
proc named ::dom::domParseFeedback, otherwise you will get an error. Note that
the calls of ::dom::domParseFeedback are not made exactly every #bytes, but always
at the first element start after every #bytes.
-externalentitycommand <script>
If -externalentitycommand <script> is specified, the specified Tcl script is called to
resolve any external entities of the document. The command actually evaluated consists of
this option value followed by three arguments: the base URI, the system identifier of the
entity and the public identifier of the entity. The base URI and the public identifier
may be the empty list. The script has to return a Tcl list consisting of three elements.
The first element of this list signals how the external entity is returned to
the processor. At the moment, the two allowed types are "string" and "channel". The
second element of the list has to be the (absolute) base URI of the external entity to
be parsed. The third element of the list is the data: either the already read content of
the external entity as a string in the case of type "string", or the name of a Tcl channel
in the case of type "channel".
-useForeignDTD <boolean>
If <boolean> is true and the document does not have an external subset, the parser will
call the -externalentitycommand script with empty values for the systemId and publicID
arguments. Note that, if the document also doesn't have an internal subset,
the -startdoctypedeclcommand and -enddoctypedeclcommand scripts, if set, are not
called. The -useForeignDTD respects
-paramentityparsing <always|never|notstandalone>
The -paramentityparsing option controls whether the parser tries to resolve the external
entities (including the external DTD subset) of the document while building the DOM
tree. -paramentityparsing requires an argument, which must be either "always", "never",
or "notstandalone". The value "always" means that the parser tries to resolve (recursively)
all external entities of the XML source. This is the default if -paramentityparsing
is omitted. The value "never" means that only the given XML source is
parsed and no external entity (including the external subset) will be resolved and
parsed. The value "notstandalone" means that all external entities will be resolved
and parsed, with the exception of documents which explicitly state standalone="yes"
in their XML declaration.
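The -externalentitycommand protocol described above can be sketched as follows. This is a minimal illustration, not a complete resolver: the proc name resolveEntity, the entity name ext, the file:///tmp/ base URI and the replacement markup are all hypothetical, and the resolver ignores its arguments apart from the system identifier.

```tcl
package require tdom

# Hypothetical resolver. The parser appends three arguments to the script:
# the base URI, the system identifier and the public identifier.
proc resolveEntity {baseURI systemId publicId} {
    # Return a three-element list: type, absolute base URI, data.
    # Type "string" means the third element is the entity content itself.
    return [list string "file:///tmp/$systemId" "<note>resolved</note>"]
}

set xml {<?xml version="1.0"?>
<!DOCTYPE root [
  <!ENTITY ext SYSTEM "ext.xml">
]>
<root>&ext;</root>}

set doc [dom parse -externalentitycommand resolveEntity $xml]
set root [$doc documentElement]

# Read the text of the <note> element that came from the resolver.
set resolved [$root selectNodes string(note)]
puts $resolved

$doc delete
```

A resolver that reads real files would typically return type "channel" together with an open channel name instead of the data itself.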
dom createDocument docElemName ?objVar?
Creates a new DOM document object with one element node with node name docElemName. The objVar
controls the memory handling as explained above.
dom createDocumentNS uri docElemName ?objVar?
Creates a new DOM document object with one element node with node name docElemName. uri gives
the namespace of the document element to create. The objVar controls the memory handling as
explained above.
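As a brief sketch of both creation methods, the following builds one document without and one with a namespace. The element name order, the prefix ex and the namespace URI http://example.com/ns are made up for illustration. Since no objVar is given, both documents are deleted explicitly.

```tcl
package require tdom

# Plain document: the document element gets the name passed as docElemName.
set doc [dom createDocument order]
set root [$doc documentElement]
set name [$root nodeName]
puts $name
$doc delete

# Namespaced document: uri becomes the namespace of the document element.
set nsdoc [dom createDocumentNS "http://example.com/ns" ex:order]
set nsroot [$nsdoc documentElement]
set ns [$nsroot namespaceURI]
puts $ns
$nsdoc delete
```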
dom setResultEncoding ?encodingName?
If encodingName is not given, the current global result encoding is returned. Otherwise the
global result encoding is set to encodingName. All character data, attribute values, etc.
will then be converted from UTF-8, which is delivered by the Expat XML parser, to the given
8-bit encoding at XML/DOM parse time. Valid values for encodingName are: utf-8, ascii,
cp1250, cp1251, cp1252, cp1253, cp1254, cp1255, cp1256, cp437, cp850, en, iso8859-1,
iso8859-2, iso8859-3, iso8859-4, iso8859-5, iso8859-6, iso8859-7, iso8859-8, iso8859-9,
koi8-r.
dom createNodeCmd ?-returnNodeCmd? (element|comment|text|cdata|pi)Node commandName
This method creates Tcl commands, which in turn create tDOM nodes. Tcl commands created by
this command are only available inside a script given to the domNode method appendFromScript.
If a command created with createNodeCmd is invoked in any other context, it will return an error.
The created command commandName replaces any existing command or procedure with that name. If
the commandName includes any namespace qualifiers, it is created in the specified namespace.
If such a command is invoked inside a script given as argument to the domNode method
appendFromScript, it creates a new node and appends this node at the end of the child list of the
invoking element node. If the option -returnNodeCmd was given, the command returns the created node
as a Tcl command. If this option was omitted, the command returns nothing. Each command always
creates the same type of node. Which type of node is created by the command is determined by
the first argument to createNodeCmd. The syntax of the created command depends on the type
of the node it creates.
If the first argument of the method is elementNode, the created command will create an element
node. The tag name of the created node is commandName without namespace qualifiers. The syntax
of the created command is:
elementNodeCmd ?attributeName attributeValue ...? ?script?
elementNodeCmd ?-attributeName attributeValue ...? ?script?
elementNodeCmd name_value_list script
The command syntax allows three different ways to specify the attributes of the resulting
element. These can be specified with attributeName attributeValue argument pairs, in an "option
style" way with -attributeName attributeValue argument pairs (the '-' character is only
syntactic sugar and will be stripped off) or as a Tcl list with elements interpreted as
attribute name and the corresponding attribute value. The attribute name elements in the list
may have a leading '-' character, which will be stripped off.
Every elementNodeCmd accepts an optional Tcl script as its last argument. This script is evaluated
as a recursive appendFromScript script with the node created by the elementNodeCmd as parent of
all nodes created by the script.
If the first argument of the method is textNode, the command will create a text node. The
syntax of the created command is:
textNodeCmd ?-disableOutputEscaping? data
If the optional flag -disableOutputEscaping is given, the escaping of the ampersand character
(&) and the left angle bracket (<) inside the data is disabled. You should use this flag
carefully.
If the first argument of the method is commentNode or cdataNode, the command will create a
comment node or CDATA section node. The syntax of the created command is:
nodeCmd data
If the first argument of the method is piNode, the command will create a processing
instruction node. The syntax of the created command is:
piNodeCmd target data
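The following sketch shows how node commands are typically combined with appendFromScript. The command names item, t and c and the document content are hypothetical; note that createNodeCmd replaces any existing command with the same name. Only the two argument-pair forms of attribute syntax are used here.

```tcl
package require tdom

# Create node commands. An element command takes its tag name from
# the command name, so "item" creates <item> elements.
dom createNodeCmd elementNode item
dom createNodeCmd textNode t
dom createNodeCmd commentNode c

set doc [dom createDocument list]
set root [$doc documentElement]

# The node commands are only valid inside an appendFromScript script.
$root appendFromScript {
    c "generated list"
    item id 1 { t "first" }      ;# attributeName attributeValue pairs
    item -id 2 { t "second" }    ;# option style; the '-' is stripped off
}

puts [$root asXML]

# Inspect the result before freeing the tree.
set count [llength [$root selectNodes item]]
set second [$root selectNodes {string(item[@id='2'])}]

$doc delete
```

The nested script after each item command is evaluated with the new item element as parent, so the text nodes end up as children of their respective elements.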
dom setStoreLineColumn ?boolean?
If switched on, the DOM nodes will contain line and column position information for the
original XML document after parsing. The default is not to store line and column position
information.
dom setNameCheck ?boolean?
If NameCheck is true, every method which expects an XML Name, a fully qualified name or a
processing instruction target will check whether the given string is valid according to its
production rule. For commands created with the createNodeCmd method to be used in the context of
appendFromScript, the status of the flag at creation time decides: if NameCheck is true at
creation time, the command will check its arguments, otherwise not. The setNameCheck method sets
this flag. It returns the current NameCheck flag state. The default state for NameCheck is true.
dom setTextCheck ?boolean?
If TextCheck is true, every command which expects XML Chars, a comment, a CDATA section value
or a processing instruction value will check whether the given string is valid according to its
production rule. For commands created with the createNodeCmd method to be used in the context
of appendFromScript, the status of the flag at creation time decides: if TextCheck is true at
creation time, the command will check its arguments, otherwise not. The setTextCheck method sets
this flag. It returns the current TextCheck flag state. The default state for TextCheck is
true.
dom isName name
Returns 1, if name is a valid XML Name according to production 5 of the XML 1.0
recommendation. Otherwise it returns 0.
dom isPIName name
Returns 1, if name is a valid XML processing instruction target according to production 17 of
the XML 1.0 recommendation. Otherwise it returns 0.
dom isNCName name
Returns 1, if name is a valid NCName according to production 4 of the Namespaces in XML
recommendation. Otherwise it returns 0.
dom isQName name
Returns 1, if name is a valid QName according to production 6 of the Namespaces in XML
recommendation. Otherwise it returns 0.
dom isCharData string
Returns 1, if every character in string is a valid XML Char according to production 2 of the
XML 1.0 recommendation. Otherwise it returns 0.
dom isComment string
Returns 1, if string is a valid comment according to production 15 of the XML 1.0
recommendation. Otherwise it returns 0.
dom isCDATA string
Returns 1, if string is valid according to production 20 of the XML 1.0 recommendation.
Otherwise it returns 0.
dom isPIValue string
Returns 1, if string is valid according to production 16 of the XML 1.0 recommendation.
Otherwise it returns 0.
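The checking methods above can be used to validate strings before they are put into a tree, for example when NameCheck or TextCheck has been switched off. The sample strings below are arbitrary, and the comments state the rule from the respective recommendation that each call exercises.

```tcl
package require tdom

puts [dom isName "order"]        ;# an ordinary XML Name
puts [dom isName "1order"]       ;# a Name must not start with a digit
puts [dom isNCName "ex:order"]   ;# an NCName must not contain a colon
puts [dom isQName "ex:order"]    ;# prefix and local part separated by a colon
puts [dom isComment "a -- b"]    ;# "--" is not allowed inside a comment
puts [dom isCDATA "a ]]> b"]     ;# "]]>" would end the CDATA section
puts [dom isPIName "xml"]        ;# "xml" is reserved as a PI target
```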
KEYWORDS
XML, DOM, Document, node, parsing
Tcl dom(n)