Oracle® XML Developer's Kit Programmer's Guide
10g Release 2 (10.1.2)
Part No. B14033-01

19 XML Parser for C++

This chapter contains these topics:


Note:

Use the new unified C++ API in xml.hpp for new XDK applications. The old C++ API in oraxml.hpp is deprecated and supported only for backward compatibility, but will not be enhanced. It will be removed in a future release.

Introduction to Parser for C++

Oracle XML Parser for C++ checks whether an XML document is well-formed, and optionally validates it against a DTD or XML schema. The parser constructs an object tree which can be accessed through one of two XML APIs: the tree-based DOM interface or the event-based SAX interface.

Tree-based APIs are useful for a wide range of applications, but they often put a great strain on system resources, especially if the document is large (under very controlled circumstances, it is possible to construct the tree in a lazy fashion to avoid some of this problem). Furthermore, some applications need to build their own, different data trees, and it is very inefficient to build a tree of parse nodes, only to map it onto a new tree.

DOM Namespace

This is the namespace for DOM-related types and interfaces.

DOM interfaces are represented as generic references to different implementations of the DOM specification. They are parameterized by Node, which supports various specializations and instantiations. The most important of these is xmlnode, which corresponds to the current C implementation.

These generic references do not have a NULL-like value. Any implementation must never create a reference with no state (like NULL). If there is a need to signal that something has no state, an exception should be thrown.
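The rule that a reference never exists without state can be sketched in plain C++. The following is a minimal illustration, not the XDK API: the names Node, NodeRef, and find_by_name are hypothetical, and standard library exceptions stand in for the DOM exception classes.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical stand-in for an implementation node; not an XDK type.
struct Node {
    std::string name;
};

// A reference that always refers to a node: it has no default
// constructor and therefore no "empty" state.
class NodeRef {
public:
    // A null pointer is rejected with an exception instead of
    // producing a reference with no state.
    explicit NodeRef(Node* node) : node_(node) {
        if (node_ == nullptr) {
            throw std::invalid_argument("NodeRef requires a node");
        }
    }
    const std::string& name() const { return node_->name; }

private:
    Node* node_;  // never null after construction
};

// Hypothetical lookup: "not found" is reported by throwing,
// because there is no NULL-like NodeRef to return.
NodeRef find_by_name(std::vector<Node>& nodes, const std::string& name) {
    for (Node& candidate : nodes) {
        if (candidate.name == name) {
            return NodeRef(&candidate);
        }
    }
    throw std::out_of_range("no node named " + name);
}
```

With this design a caller never tests a reference for emptiness; it either receives a usable reference or handles the exception.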

Many methods might throw the SYNTAX_ERR exception if the DOM tree is incorrectly formed, or UNDEFINED_ERR in the case of wrong parameters or unexpected NULL pointers. If these are the only errors that a particular method might throw, they are not reflected in the method signature.

Actual DOM trees do not depend on the context, TCtx. However, manipulating DOM trees in the current, xmlctx-based implementation requires access to the current context, TCtx. This is accomplished by passing the context pointer to the constructor of DOMImplRef. In a multithreaded environment, DOMImplRef is always created in the thread context and therefore holds a pointer to the right context.

DOMImplRef provides a way to create DOM trees. DOMImplRef is a reference to the actual DOMImplementation object that is created when a regular, non-copy constructor of DOMImplRef is invoked. This works well in a multithreaded environment where DOM trees need to be shared and each thread has a separate TCtx associated with it. It works equally well in a single-threaded environment.

DOMString is only one of the encodings supported by Oracle implementations. Support for other encodings is an Oracle extension. The oratext* data type is used for all encodings.

Interfaces represent DOM Level 2 Core interfaces according to http://www.w3.org/TR/DOM-Level-2-Core/core.html. These C++ interfaces support the DOM specification as closely as possible. However, Oracle cannot guarantee that the specification is fully supported by our implementation because the W3C specification does not cover C++ binding.

DOM Datatypes

DATATYPE DomNodeType - Defines types of DOM nodes.

DATATYPE DomExceptionCode - Defines exception codes returned by the DOM API.

DOM Interfaces

DOMException Interface - See exception DOMException in the W3C DOM documentation. DOM operations only raise exceptions in "exceptional" circumstances: when an operation is impossible to perform (either for logical reasons, because data is lost, or because the implementation has become unstable). The functionality of XMLException can be used for a wider range of exceptions.

NodeRef Interface - See interface Node in the W3C documentation.

DocumentRef Interface - See interface Document in the W3C documentation.

DocumentFragmentRef Interface - See interface DocumentFragment in the W3C documentation.

ElementRef Interface - See interface Element in the W3C documentation.

AttrRef Interface - See interface Attr in the W3C documentation.

CharacterDataRef Interface - See interface CharacterData in the W3C documentation.

TextRef Interface - See Text nodes in the W3C documentation.

CDATASectionRef Interface - See CDATASection nodes in the W3C documentation.

CommentRef Interface - See Comment nodes in the W3C documentation.

ProcessingInstructionRef Interface - See PI nodes in the W3C documentation.

EntityRef Interface - See Entity nodes in the W3C documentation.

EntityReferenceRef Interface - See EntityReference nodes in the W3C documentation.

NotationRef Interface - See Notation nodes in the W3C documentation.

DocumentTypeRef Interface - See DTD nodes in the W3C documentation.

DOMImplRef Interface - See interface DOMImplementation in the W3C DOM documentation. DOMImplementation is fundamental for manipulating DOM trees. Every DOM tree is attached to a particular DOM implementation object. Several DOM trees can be attached to the same DOM implementation object. Each DOM tree can be deleted and deallocated by deleting the document object. All DOM trees attached to a particular DOM implementation object are deleted when this object is deleted. The DOMImplementation object is not visible to the user directly. It is visible through the class DOMImplRef. This is needed because of requirements in multithreaded environments.
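The ownership rule described for DOMImplementation (each tree can be deleted individually, and all remaining trees go away with the implementation object) can be sketched as follows. The Document and Implementation classes and the live-document counter are hypothetical stand-ins chosen for illustration; they do not reproduce the XDK classes or their signatures.

```cpp
#include <algorithm>
#include <cassert>
#include <memory>
#include <vector>

// Hypothetical document; it only tracks how many documents are alive.
class Document {
public:
    explicit Document(int& live) : live_(live) { ++live_; }
    ~Document() { --live_; }
    Document(const Document&) = delete;
    Document& operator=(const Document&) = delete;

private:
    int& live_;  // counter of documents currently alive
};

// Hypothetical implementation object that owns every tree it creates.
class Implementation {
public:
    explicit Implementation(int& live) : live_(live) {}

    Document* createDocument() {
        docs_.push_back(std::make_unique<Document>(live_));
        return docs_.back().get();
    }

    // Deleting one document deallocates only that tree.
    void deleteDocument(Document* doc) {
        docs_.erase(std::remove_if(docs_.begin(), docs_.end(),
                                   [doc](const std::unique_ptr<Document>& d) {
                                       return d.get() == doc;
                                   }),
                    docs_.end());
    }

private:
    int& live_;
    std::vector<std::unique_ptr<Document>> docs_;  // attached trees
};

// Returns the number of live documents after the implementation
// object has gone out of scope.
int live_after_scope() {
    int live = 0;
    {
        Implementation impl(live);
        Document* first = impl.createDocument();
        impl.createDocument();
        impl.deleteDocument(first);  // one tree removed individually
        assert(live == 1);
    }  // impl destroyed here, taking the remaining tree with it
    return live;
}
```

Holding the trees in owning smart pointers is one way to get this behavior; the actual implementation may manage memory differently.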

NodeListRef Interface - Abstract implementation of node list. See interface NodeList in the W3C documentation.

NamedNodeMapRef Interface - Abstract implementation of a node map. See interface NamedNodeMap in the W3C documentation.

DOM Traversal and Range Datatypes

DATATYPE AcceptNodeCode defines values returned by node filters provided by the user and passed to iterators and tree walkers.

DATATYPE WhatToShowCode specifies codes to filter certain types of nodes.

DATATYPE RangeExceptionCode specifies Exception kinds that can be thrown by the Range interface.

DATATYPE CompareHowCode specifies kinds of comparisons that can be done on two ranges.

DOM Traversal and Range Interfaces

NodeFilter Interface - DOM 2 Node Filter.

NodeIterator Interface - DOM 2 Node Iterator.

TreeWalker Interface - DOM 2 TreeWalker.

DocumentTraversal Interface - DOM 2 interface.

RangeException Interface - Exceptions for DOM 2 Range operations.

Range Interface - DOM 2 Range.

DocumentRange Interface - DOM 2 interface.
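How a what-to-show mask and a user-supplied node filter combine during iteration can be sketched as below. The numeric values follow the W3C DOM Level 2 Traversal specification (node types, SHOW_* bits, FILTER_* codes), but the C++ names, the flat node list, and the iterate function are assumptions made for illustration and are not the XDK declarations.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <string>
#include <vector>

// Values as defined by W3C DOM Level 2; the C++ spellings are hypothetical.
enum NodeType { ELEMENT_NODE = 1, TEXT_NODE = 3, COMMENT_NODE = 8 };
enum AcceptCode { FILTER_ACCEPT = 1, FILTER_REJECT = 2, FILTER_SKIP = 3 };
constexpr std::uint32_t SHOW_ELEMENT = 0x1;
constexpr std::uint32_t SHOW_TEXT = 0x4;
constexpr std::uint32_t SHOW_COMMENT = 0x80;

// Simplified node: a type plus a name or text value.
struct Node {
    NodeType type;
    std::string value;
};

using Filter = std::function<AcceptCode(const Node&)>;

// Example user filter: skip text that contains only spaces.
AcceptCode skip_blank_text(const Node& node) {
    bool blank = node.value.find_first_not_of(' ') == std::string::npos;
    return (node.type == TEXT_NODE && blank) ? FILTER_SKIP : FILTER_ACCEPT;
}

// Visits nodes in document order. The mask is applied first, then the
// filter. For a flat iteration, REJECT and SKIP both omit the node.
std::vector<std::string> iterate(const std::vector<Node>& nodes,
                                 std::uint32_t whatToShow,
                                 const Filter& filter) {
    std::vector<std::string> visited;
    for (const Node& node : nodes) {
        // The SHOW_* bit for a node type is 1 << (nodeType - 1).
        std::uint32_t bit = 1u << (node.type - 1);
        if ((whatToShow & bit) == 0) continue;
        if (filter && filter(node) != FILTER_ACCEPT) continue;
        visited.push_back(node.value);
    }
    return visited;
}
```

The distinction between REJECT and SKIP matters only for tree walkers, where REJECT also excludes the node's descendants; this flat sketch does not model that case.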

Parser Namespace

DOMParser Interface - DOM parser root class.

GParser Interface - Root class for XML parsers.

ParserException Interface - Exception class for parser and validator.

SAXHandler Interface - Root class for current SAX handler implementations.

SAXHandlerRoot Interface - Root class for all SAX handlers.

SAXParser Interface - Root class for all SAX parsers.

SchemaValidator Interface - XML schema-aware validator.

GParser Interface

GParser Interface - Root class for all XML parser interfaces and implementations. It is not an abstract class, that is, it is not an interface. It is a real class that allows users to set and check parser parameters.

DOMParser Interface

DOMParser Interface - DOM parser root abstract class or interface. In addition to parsing and checking that a document is well-formed, DOMParser provides the means to validate the document against a DTD or XML schema.

SAXParser Interface

SAXParser Interface - Root abstract class for all SAX parsers.

SAX Event Handlers

To use SAX, you must provide a SAX event handler class. Either pass the handler to the SAXParser in the call to parse(), or set it on the parser before that call.

SAXHandlerRoot Interface - Root class for all SAX handlers.

SAXHandler Interface - Root class for current SAX handler implementations.
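The handler pattern can be sketched with a toy event parser. Everything here is hypothetical: HandlerRoot, ElementCounter, and ToySaxParser are illustrative names, and the tokenizer handles only plain start tags, end tags, and text, so it is not a conforming XML parser. The sketch only shows the two ways of supplying a handler: in the call to parse(), or set beforehand.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical handler root: callbacks default to doing nothing.
struct HandlerRoot {
    virtual ~HandlerRoot() = default;
    virtual void startElement(const std::string& /*name*/) {}
    virtual void endElement(const std::string& /*name*/) {}
    virtual void characters(const std::string& /*chars*/) {}
};

// User-provided handler: counts elements and collects text.
struct ElementCounter : HandlerRoot {
    int elements = 0;
    std::string text;
    void startElement(const std::string&) override { ++elements; }
    void characters(const std::string& chars) override { text += chars; }
};

class ToySaxParser {
public:
    // Option 1: set the handler before calling parse().
    void setHandler(HandlerRoot& handler) { handler_ = &handler; }

    void parse(const std::string& xml) {
        if (handler_ == nullptr) throw std::logic_error("no handler set");
        emit(xml, *handler_);
    }

    // Option 2: pass the handler in the call to parse().
    void parse(const std::string& xml, HandlerRoot& handler) {
        emit(xml, handler);
    }

private:
    // Scans the input once and reports events to the handler.
    static void emit(const std::string& xml, HandlerRoot& handler) {
        std::size_t i = 0;
        while (i < xml.size()) {
            if (xml[i] == '<') {
                std::size_t close = xml.find('>', i);
                if (close == std::string::npos)
                    throw std::runtime_error("unterminated tag");
                std::string tag = xml.substr(i + 1, close - i - 1);
                if (!tag.empty() && tag[0] == '/') handler.endElement(tag.substr(1));
                else handler.startElement(tag);
                i = close + 1;
            } else {
                std::size_t next = xml.find('<', i);
                if (next == std::string::npos) next = xml.size();
                handler.characters(xml.substr(i, next - i));
                i = next;
            }
        }
    }

    HandlerRoot* handler_ = nullptr;
};
```

Because the root class supplies empty callbacks, a concrete handler overrides only the events it cares about.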

Thread Safety

Do not fork threads in the middle of the init-parse-term sequence of calls. Doing so leads to unpredictable behavior and results.
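One way to respect this constraint is to keep the whole init-parse-term sequence inside each thread, so that no thread is started between the creation and destruction of a parser. The sketch below assumes a hypothetical ToyParser whose constructor and destructor stand in for init and term; it is not the XDK parser, and its parse() merely counts start tags.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <thread>
#include <vector>

// Hypothetical parser: construction is "init", destruction is "term".
class ToyParser {
public:
    // Counts '<' characters that do not begin an end tag.
    std::size_t parse(const std::string& xml) const {
        std::size_t count = 0;
        for (std::size_t i = 0; i + 1 < xml.size(); ++i) {
            if (xml[i] == '<' && xml[i + 1] != '/') ++count;
        }
        return count;
    }
};

// Parses each document in its own thread. Every thread is started
// first and then runs a complete init-parse-term sequence itself.
std::vector<std::size_t> parse_in_threads(const std::vector<std::string>& docs) {
    std::vector<std::size_t> results(docs.size());
    std::vector<std::thread> workers;
    for (std::size_t i = 0; i < docs.size(); ++i) {
        workers.emplace_back([&docs, &results, i] {
            ToyParser parser;                    // init, inside the thread
            results[i] = parser.parse(docs[i]);  // parse
        });                                      // term, inside the same thread
    }
    for (std::thread& worker : workers) worker.join();
    return results;
}
```

Each thread writes to its own slot in the results vector, so the sketch needs no locking.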

XML Parser for C++ Usage

  1. A call to Tools::Factory to create a parser initializes the parsing process.

  2. The XML input can be any of the InputSource kinds (see IO namespace).

  3. DOMParser invocation results in the DOM tree.

  4. SAXParser invocation results in SAX events.

  5. A call to parser destructor terminates the process.
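The sequence above can be sketched with stand-in classes. ToyFactory, ToyDomParser, InputSource, StringSource, and TreeNode are hypothetical names and do not reproduce the Tools::Factory or DOMParser signatures. The toy parser recognizes only plain start and end tags and ignores text, and the SAX path (step 4) is omitted.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// Hypothetical input abstraction with one concrete kind.
struct InputSource {
    virtual ~InputSource() = default;
    virtual std::string read() const = 0;
};
class StringSource : public InputSource {
public:
    explicit StringSource(std::string text) : text_(std::move(text)) {}
    std::string read() const override { return text_; }
private:
    std::string text_;
};

struct TreeNode {
    std::string name;
    std::vector<std::unique_ptr<TreeNode>> children;
};

class ToyDomParser {
public:
    // Builds a tree of elements under a synthetic "#document" node.
    std::unique_ptr<TreeNode> parse(const InputSource& source) const {
        const std::string xml = source.read();
        auto doc = std::make_unique<TreeNode>();
        doc->name = "#document";
        std::vector<TreeNode*> open{doc.get()};  // stack of open elements
        std::size_t i = 0;
        while ((i = xml.find('<', i)) != std::string::npos) {
            std::size_t close = xml.find('>', i);
            if (close == std::string::npos) throw std::runtime_error("unterminated tag");
            std::string tag = xml.substr(i + 1, close - i - 1);
            if (!tag.empty() && tag[0] == '/') {
                if (open.size() < 2 || open.back()->name != tag.substr(1))
                    throw std::runtime_error("mismatched end tag");
                open.pop_back();
            } else {
                auto child = std::make_unique<TreeNode>();
                child->name = tag;
                TreeNode* raw = child.get();
                open.back()->children.push_back(std::move(child));
                open.push_back(raw);
            }
            i = close + 1;
        }
        if (open.size() != 1) throw std::runtime_error("unclosed element");
        return doc;
    }
};

struct ToyFactory {
    std::unique_ptr<ToyDomParser> createDomParser() const {
        return std::make_unique<ToyDomParser>();
    }
};

// Returns how many child elements the root element has.
std::size_t root_child_count(const std::string& xml) {
    ToyFactory factory;
    std::unique_ptr<ToyDomParser> parser = factory.createDomParser();  // step 1
    StringSource source(xml);                                          // step 2
    std::unique_ptr<TreeNode> doc = parser->parse(source);             // step 3
    if (doc->children.empty()) throw std::runtime_error("no root element");
    return doc->children.front()->children.size();
}  // step 5: the parser is destroyed when it goes out of scope
```

Holding the parser in a scoped smart pointer ties the end of the process to scope exit, which mirrors the role of the destructor in step 5.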

XML Parser for C++ Default Behavior

The following is the XML Parser for C++ default behavior:

C++ Sample Files

The xdk/demo/cpp/parser/ directory contains several XML applications that illustrate how to use the XML Parser for C++ with the DOM and SAX interfaces.

Change directories to the sample directory ($ORACLE_HOME/xdk/demo/cpp on Solaris, for example) and read the README file. It explains how to build the sample programs.

Table 19-1 lists the sample files in the directory. Each file *Main.cpp has a corresponding *Gen.cpp and *Gen.hpp.

Table 19-1 XML Parser for C++ Sample Files

Sample File Name        Description
DOMSampleMain.cpp       Sample usage of C++ interfaces of XML Parser and DOM.
FullDOMSampleMain.cpp   Manually build DOM and then exercise.
SAXSampleMain.cpp       Source for SAXSample program.