Oracle® XML Developer's Kit Programmer's Guide
10g Release 2 (10.1.2)
Part No. B14033-01

13 Getting Started with XDK C Components

Specifications of XDK C/C++ Components

Oracle XDK C/C++ components are built on W3C recommendations. The list of supported standards for release 10.1 is:

What Are the XDK C Components

XDK C components are the basic building blocks for reading, manipulating, transforming, and validating XML documents. Oracle XDK C components consist of the following:

  • XML Parser for C: checks if an XML document is well-formed, and optionally validates it against a DTD. The parser constructs an object tree which can be accessed via a DOM interface or operates serially via a SAX interface.

  • XSLT Processor for C: bundled with the parser, it transforms or formats an XML document according to a stylesheet.

  • XVM: a high-performance XSLT transformation engine.

  • XML Schema Processor for C: supports parsing and validating XML files against an XML Schema definition file.


    See Also:

    "Using the XML Parser for C" for further discussion of the XDK C components.

Installing the C Components of XDK

If you have installed Oracle Database or Oracle Application Server, then you already have the XDK C components installed. You can also download the latest versions of XDK C components from OTN by following these steps:

  1. Navigate to http://www.oracle.com/technology/tech/xml/.

  2. Click the Software link in the right-hand bar.

  3. Log on with your OTN username and password (registration is free if you do not already have an account).

  4. Select the Windows or UNIX version to download.

  5. Accept all conditions in the licensing agreement.

  6. Click the appropriate *.tar.gz or *.zip file.

  7. Extract the files in the distribution:

    1. Choose a directory under which you would like the xdk directory and subdirectories to go.

    2. Change to that directory; then extract the XDK download archive file using:

      UNIX: tar xvfz xdk_xxx.tar.gz  
      Windows: use WinZip visual archive extraction tool
      

Setting the UNIX Environment

After installing the UNIX version of XDK, the directory structure is:

-$XDK_HOME 
     | - bin: executable files 
     | - lib: library files
     | - nls/data: Globalization Support data files(*.nlb)
     | - xdk 
          | - demo/c: demonstration code 
          | - doc/c: documentation 
          | - public: header files 
          | - mesg: message files (*.msb) 

Here are all the libraries that come with the UNIX version of XDK C components:

Table 13-1 XDK C Components Libraries

Component                                         Library     Notes
XML Parser, XSLT Processor, XML Schema Processor  libxml10.a  XML Parser for C, which includes DOM, SAX, and XSLT APIs; XML Schema Processor for C


The XDK C components (UNIX) depend on the Oracle CORE and Globalization Support libraries in the following table:

Table 13-2 Dependent Libraries of XDK C Components on UNIX

Component                      Library      Notes
CORE Library                   libcore10.a  Oracle CORE library
Globalization Support Library  libnls10.a   Oracle Globalization Support common library
Globalization Support Library  libunls10.a  Oracle Globalization Support library for Unicode support


Command Line Environment Setup

The parser may be called as an executable by invoking bin/xml, which has the following options:

Table 13-3 Parser Command Line Options

Option Meaning
-c Conformance check only, no validation
-e encoding Specify default input file encoding ("incoding")
-E encoding Specify DOM/SAX encoding ("outcoding")
-f file File - Interpret as filespec, not URI
-h Help - show usage help and full list of flags
-i n Number of times to iterate the XSLT processing
-l language Language for error reporting
-n Traverse DOM and report number of elements
-o XSLoutfile Specify output file of XSLT processor
-p Print document after parsing
-r Do not ignore <xsl:output> instruction in XSLT processing
-s stylesheet Style sheet - specifies the XSL style sheet
-v Version - display parser version and then exit
-V var value To test top level variables in CXSLT
-w Whitespace - preserve all whitespace
-W Warning - stop parsing after a warning
-x SAX - exercise SAX interface and print document
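As a rough illustration of how these options combine, the following Bourne-shell sketch prints two typical invocations and runs them only if the executable is found. The run_xml helper, the file names doc.xml, style.xsl, and out.xml, and the assumption that the document is passed as the last argument are illustrative and are not taken from this guide.

```shell
#!/bin/sh
# Sketch only: run_xml and the file names below are hypothetical.
# XML_BIN assumes the bin/xml location under XDK_HOME.
XML_BIN="${XDK_HOME:-.}/bin/xml"

run_xml() {
    # Print the command line, then run it only if the executable is present.
    echo "$XML_BIN $*"
    if [ -x "$XML_BIN" ]; then
        "$XML_BIN" "$@"
    fi
}

# -c: conformance check only, no validation.
run_xml -c doc.xml

# -s and -o: apply a stylesheet and write the XSLT output to a file.
run_xml -s style.xsl -o out.xml doc.xml
```

Printing the command before running it makes it easy to confirm which flags are in effect while experimenting with the options in Table 13-3.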

Check that the environment variable ORA_NLS10 points to the location of the Globalization Support data files. If Oracle Database is installed, you can set it as follows:

setenv ORA_NLS10 ${ORACLE_HOME}/nls/data 

If no Oracle database is installed, you can use the Globalization Support data files that come with the XDK release by setting:

setenv ORA_NLS10 ${XDK_HOME}/nls/data

Error message files are provided in the mesg subdirectory. Files ending in .msb are machine-readable and needed at runtime; files ending in .msg are human-readable and contain cause and action descriptions for each error. The message files also exist in the $ORACLE_HOME/xdk/mesg directory.

If you do not have an ORACLE_HOME, check if the environment variable ORA_XML_MESG is set to point to the absolute path of the mesg directory. If the Oracle database is installed, you can set ORA_XML_MESG, although this is not required:

setenv ORA_XML_MESG ${ORACLE_HOME}/xdk/mesg 

If no Oracle database is installed, you must set the environment variable ORA_XML_MESG to point to the absolute path of the mesg subdirectory:

setenv ORA_XML_MESG ${XDK_HOME}/xdk/mesg 
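The rules above can be collected into one small helper. The sketch below assumes a Bourne-compatible shell (the examples in this section use csh setenv instead), and the function name xdk_env is hypothetical. It prefers an Oracle installation when ORACLE_HOME is set and otherwise falls back to the standalone XDK tree.

```shell
#!/bin/sh
# Sketch only: xdk_env is a hypothetical helper, not part of XDK.
xdk_env() {
    if [ -n "${ORACLE_HOME:-}" ]; then
        # Oracle Database is installed: use its data and message files.
        # Setting ORA_XML_MESG is optional in this case.
        ORA_NLS10="${ORACLE_HOME}/nls/data"
        ORA_XML_MESG="${ORACLE_HOME}/xdk/mesg"
    elif [ -n "${XDK_HOME:-}" ]; then
        # No Oracle installation: use the files shipped with the XDK release.
        ORA_NLS10="${XDK_HOME}/nls/data"
        ORA_XML_MESG="${XDK_HOME}/xdk/mesg"
    else
        echo "xdk_env: neither ORACLE_HOME nor XDK_HOME is set" >&2
        return 1
    fi
    export ORA_NLS10 ORA_XML_MESG
}

# Show the resulting settings when one of the two homes is defined.
if xdk_env; then
    echo "ORA_NLS10=$ORA_NLS10"
    echo "ORA_XML_MESG=$ORA_XML_MESG"
fi
```

Sourcing a script like this from a shell startup file keeps both variables consistent with whichever installation is present.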

The parser may also be invoked by writing code to use the supplied APIs. The code must be compiled using the headers in the include subdirectory and linked against the libraries in the lib subdirectory. See Makefile in the demo subdirectory for full details of how to build your program.

To get the XDK version you are using on UNIX:

strings libxml10.a | grep -i Version

Setting the Windows Environment

These are the Windows libraries that come with the XDK C components:

Table 13-4 XDK C Components Libraries on Windows

Component                                        Library                     Notes
XML Parser, XSL Processor, XML Schema Processor  oraxml10.lib, oraxml10.dll  XML Parser for C, which includes DOM, SAX, and XSLT APIs; XML Schema Processor for C


The XDK C components (Windows) depend on the Oracle CORE and Globalization Support libraries in the following table:

Table 13-5 Dependent Libraries of XDK C Components on Windows

Component                      Library        Notes
CORE Library                   oracore10.dll  Oracle CORE library
Globalization Support Library  oranls10.dll   Oracle Globalization Support common library
Globalization Support Library  oraunls10.dll  Oracle Globalization Support library for Unicode support

Environment for Command Line Usage

For the parser and schema validator options, see Table 13-3, "Parser Command Line Options".

Check that the environment variable ORA_NLS10 is set to point to the location of the Globalization Support encoding definition files. You can set it this way:

set ORA_NLS10=%ORACLE_HOME%\nls\data

If no Oracle database is installed, you can use the Globalization Support encoding definition files that come with the XDK release (a subset of which are in the Oracle database):

set ORA_NLS10=%XDK_HOME%\nls\data

Error message files are provided in the mesg subdirectory. Files ending in .msb are machine-readable and needed at runtime; files ending in .msg are human-readable and include cause and action descriptions for each error. The message files also exist in the %ORACLE_HOME%\xdk\mesg directory.

If there is an Oracle database installed, you can set ORA_XML_MESG, although this is not required:

set ORA_XML_MESG=%ORACLE_HOME%\xdk\mesg

If no Oracle database is installed, you must set the environment variable ORA_XML_MESG to point to the absolute path of the mesg subdirectory:

set ORA_XML_MESG=%XDK_HOME%\xdk\mesg

To compile the sample code, set the path for the cl compiler.

From the Start menu, select Settings > Control Panel and double-click the System icon. In the System Properties window that opens, select the Environment tab and add the path of cl.exe to the PATH variable, as shown in Figure 13-1, "Setting the Path for the cl Compiler in Windows".

Figure 13-1 Setting the Path for the cl Compiler in Windows

You need to update the Make.bat by adding the path of the libraries and the header files to the compile and link commands as shown in the following example of a Make.bat file:

:COMPILE
rem Compile the source file named by the first argument; a trailing caret continues the command.
set filename=%1
cl -c -Fo%filename%.obj %opt_flg% /DCRTAPI1=_cdecl /DCRTAPI2=_cdecl /nologo /Zl ^
 /Gy /DWIN32 /D_WIN32 /DWIN_NT /DWIN32COMMON /D_DLL /D_MT /D_X86_=1 ^
 /Doratext=OraText -I. -I..\..\..\include ^
 -ID:\Progra~1\Micros~1\VC98\Include %filename%.c
goto :EOF

:LINK
rem Link the object file against the XDK, CORE, and Globalization Support libraries.
set filename=%1
link %link_dbg% /out:..\..\..\..\bin\%filename%.exe /libpath:%ORACLE_HOME%\lib ^
 /libpath:D:\Progra~1\Micros~1\VC98\lib /libpath:..\..\..\..\lib %filename%.obj ^
 oraxml10.lib oracore10.lib oranls10.lib oraunls10.lib user32.lib kernel32.lib ^
 msvcrt.lib ADVAPI32.lib oldnames.lib winmm.lib
:EOF

where D:\Progra~1\Micros~1\VC98\Include is the path for header files and D:\Progra~1\Micros~1\VC98\lib is the path for library files.

Using the XDK C Components with Visual C++

If you are using the Microsoft Visual C++ compiler, complete the following setup.

Check that the environment variable ORA_NLS10 is set to point to the location of the Globalization Support data files.

For Visual C++ to use the environment variable, define it through the Windows system settings.

From the Start menu, select Settings > Control Panel and double-click the System icon. In the System Properties window that opens, select the Environment tab and enter ORA_NLS10 with the value d:\xdk\nls\data, as shown in Figure 13-2:

Figure 13-2 Setting Up the ORA_NLS10 Environment Variable

Check that the environment variable ORA_XML_MESG is set to point to the absolute path of the mesg directory.

For Visual C++ to use the environment variable, define it through the Windows system settings.

From the Start menu, select Settings > Control Panel and double-click the System icon. In the System Properties window that opens, select the Environment tab and enter ORA_XML_MESG, as shown in Figure 13-3 (the illustrations show screens for a previous release).

Figure 13-3 Setting Up the ORA_XML_MESG Environment Variable

Figure 13-4 shows the setup of the PATH for DLLs:

Figure 13-4 Setup of the PATH for DLLs

After you open a workspace in Visual C++ and include the *.c files for your project, you must set the path for the project. Go to the Tools menu and select Options. A window will pop up. Select the Directory tab and set your include path as shown in Figure 13-5:

Figure 13-5 Setting Your Include Path in Visual C++

Then set your library path as shown in Figure 13-6:

Figure 13-6 Setting Your Static Library Path in Visual C++

After setting the paths for the static libraries in %XDK_HOME%\lib, you also need to set the library name in the compiling environment of Visual C++.

Go to the Project menu in the menu bar and select Settings. In the window that opens, select the Link tab and, in the Object/Library Modules field, enter the names of the XDK C components libraries, as shown in Figure 13-7:

Figure 13-7 Setting Up the Static Libraries in Visual C++ Project

Optionally, compile and run the demo programs. Then you can start using C XDK components.

Globalization Support for the C XDK Components

The parser supports over 300 IANA character sets. These character sets include the following:

UTF-8, UTF-16, UTF16-BE, UTF16-LE, US-ASCII, ISO-10646-UCS-2, ISO-8859-{1-9, 13-15}, EUC-JP, SHIFT_JIS, BIG5, GB2312, GB_2312-80, HZ-GB-2312, KOI8-R, KSC5601, EUC-KR, ISO-2022-CN, ISO-2022-JP, ISO-2022-KR, WINDOWS-{1250-1258}, EBCDIC-CP-{US,CA,NL,WT,DK,NO,FI,SE,IT,ES,GB,FR,HE,BE,CH,ROECE,YU,IS,AR1}, IBM{037,273,277,278,280,284,285,297,420,424,437,500,775,850,852,855,857,00858, 860,861,863,865,866,869,870,871,1026,01140,01141,01142,01143,01144,01145,01146, 01147,01148}

Any alias of the above character sets that is listed in the IANA character set registry may also be used. In addition, any character set specified in Appendix A, Character Sets, of the Oracle Database Globalization Support Guide can be used, with the exception of IW7IS960.

However, it is recommended that you use IANA character set names for interoperability with other XML parsers. Also note that XML parsers are required to support only UTF-8 and UTF-16, so those character sets should be preferred.

In order to be able to use these encodings, you should have the ORACLE_HOME environment variable set and pointing to the location of your Oracle installation. This enables the use of the globalization support data files which contain data for all supported encodings. On UNIX systems, they are usually in $ORACLE_HOME/nls/data. On Windows, they are usually in %ORACLE_HOME%\nls\data. C and C++ XDK releases that are downloaded from OTN contain an nls/data subdirectory. You must set the environment variable ORA_NLS10 to the absolute path of the nls/data subdirectory if you do not have an Oracle installation.

The default input encoding ("incoding") is UTF-8. If an input document's encoding is not self-evident (by HTTP character set, Byte Order Mark, XMLDecl, and so on), then the default input encoding is assumed. It is recommended that you set the default encoding explicitly if using only single byte character sets (such as US-ASCII or any of the ISO-8859 character sets) since single-byte performance is by far the fastest. The flag XML_FLAG_FORCE_INCODING says that the default input encoding should always be applied to input documents, ignoring any BOM or XMLDecl. However, a protocol declaration (such as HTTP character set) is always honored.
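When deciding whether to supply a default input encoding, it can help to check whether a file starts with a Byte Order Mark. The sketch below is a stand-alone shell helper. It is not part of XDK and does not describe the parser's internal detection logic. The name bom_of is hypothetical, and head -c is assumed to be available (it is common but not guaranteed on older UNIX systems).

```shell
#!/bin/sh
# Sketch only: bom_of is a hypothetical helper, not part of XDK.
# Prints the encoding implied by a leading Byte Order Mark, or "none".
# UTF-32 marks are not distinguished from UTF-16 ones.
bom_of() {
    # First three bytes as lowercase hex without separators, e.g. "efbbbf".
    hex=$(head -c 3 "$1" | od -An -tx1 | tr -d ' \t\n')
    case "$hex" in
        efbbbf) echo "UTF-8" ;;
        feff*)  echo "UTF-16BE" ;;
        fffe*)  echo "UTF-16LE" ;;
        *)      echo "none" ;;
    esac
}

# Usage: write a small document with a UTF-8 BOM and inspect it.
tmp="${TMPDIR:-/tmp}/bom_demo.$$"
printf '\357\273\277<doc/>' > "$tmp"
bom_of "$tmp"    # prints UTF-8
rm -f "$tmp"
```

A result of "none" means the encoding must come from a protocol declaration, the XMLDecl, or the default input encoding described above.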

The data encoding for DOM and SAX ("outcoding") should be chosen carefully. Single-byte encodings are the fastest, but can represent only a very limited set of characters. Next fastest is Unicode (UTF-16), and slowest are the multibyte encodings such as UTF-8. If input data cannot be converted to the outcoding without loss, an error occurs. So for maximum utility, a Unicode-based outcoding should be used, since Unicode can represent any character. If outcoding is not specified, it defaults to the incoding of the first document parsed.