Oracle® XML Developer's Kit Programmer's Guide
10g Release 2 (10.1.2)
Part No. B14033-01

5 XML Schema Processor for Java

This chapter contains these topics:

What Is XML Schema?

XML Schema was created by the W3C to use XML itself to describe the content and structure of XML documents. It includes most of the capabilities of the Document Type Definition (DTD), with the exception of entities, and adds further capabilities.

What Are DTDs?

A DTD is a mechanism provided by XML 1.0 for declaring constraints on XML markup. DTDs allow the specification of the following:

DTDs are also known as XML Markup Declarations.

XML Schema language serves a similar purpose to DTDs, but it is more flexible in specifying XML document constraints and potentially more useful for certain applications. Namespace support and datatypes support for elements and attributes are both found in XML Schema.

XML Schema is also referred to as XML Schema Definition (XSD).

DTD Limitations

DTDs are considered to be deficient in handling certain applications. DTD limitations include:

  • DTD is not integrated with Namespace technology so users cannot import and reuse code

  • DTD does not support datatypes other than character data, a limitation for describing metadata standards and database schemas

Applications need to specify document structure constraints more flexibly than the DTD can.

Comparison of XML Schema Features to DTD Features

Because of the inherent limitations of DTDs, the W3C is promoting XML Schema. XML Schema enables you to specify type information and constraints.

Table 5-1 lists XML Schema features compared to DTD features. Note that most XML Schema features include DTD features.

Table 5-1 XML Schema Features Compared to DTD Features

XML Schema Feature DTD Feature
Built-In Datatypes

XML Schema specifies a set of built-in datatypes. Some of them are primitive datatypes, which form the basis of the type system:

string, boolean, float, decimal, double, duration, dateTime, time, date, gYearMonth, gYear, gMonthDay, gMonth, gDay, base64Binary, hexBinary, anyURI, NOTATION, QName.

Others are derived datatypes that are defined in terms of primitive types.

DTDs do not support datatypes other than character strings.

User-Defined Datatypes

Users can derive their own datatypes from the built-in datatypes. There are three ways of datatype derivation: restriction, list and union.

  • restriction defines a more restricted datatype by applying constraining facets to the base type

  • list simply allows a list of values of its item type

  • union defines a new type whose value can be of any of its member types

For example, to specify that the value of the publish-year type must be within a specific range:

<simpleType name="publish-year">
    <restriction base="gYear">
         <minInclusive value="1970"/>
         <maxInclusive value="2000"/>
    </restriction>
</simpleType>

The constraining facets are:

length, minLength, maxLength, pattern, enumeration, whiteSpace, maxInclusive, maxExclusive, minInclusive, minExclusive, totalDigits, fractionDigits.

Some facets only apply to certain base types.
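The restriction above can be exercised with a small program. The following is a minimal sketch that uses the standard JAXP javax.xml.validation API rather than the Oracle XDK classes described later in this chapter. The wrapping schema document, the xs prefix, and the element name year are assumptions made for illustration.

```java
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.SAXException;

class PublishYearCheck {
    // The publish-year restriction wrapped in a complete schema document.
    // The element "year" is a hypothetical name used only in this sketch.
    static final String XSD =
        "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
      + "<xs:simpleType name='publish-year'>"
      + "<xs:restriction base='xs:gYear'>"
      + "<xs:minInclusive value='1970'/>"
      + "<xs:maxInclusive value='2000'/>"
      + "</xs:restriction>"
      + "</xs:simpleType>"
      + "<xs:element name='year' type='publish-year'/>"
      + "</xs:schema>";

    // Returns true if the instance document conforms to the schema.
    static boolean isValid(String xml) throws Exception {
        SchemaFactory factory =
            SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
        Schema schema = factory.newSchema(new StreamSource(new StringReader(XSD)));
        try {
            // With no ErrorHandler set, a validation error is thrown as a SAXException.
            schema.newValidator().validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false;
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(isValid("<year>1985</year>")); // inside the range
        System.out.println(isValid("<year>2005</year>")); // above maxInclusive
    }
}
```

A value inside the 1970 to 2000 range is accepted, while a later year or a string that is not a gYear at all is rejected.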

The publish-year element in the DTD example cannot be constrained further.

Occurrence Indicators (Content Model or Structure)

In XML Schema, the structure (called complexType) of the instance document or an element is defined in terms of model groups and attribute groups. A model group may further contain model groups or element particles, while an attribute group contains attributes. Wildcards can be used in both model groups and attribute groups to indicate any element or attribute. There are three kinds of model group: sequence, all, and choice, representing the sequence, conjunction, and disjunction relationships among particles, respectively. The range of the number of occurrences of each particle can also be specified.

Like the datatype, complexType can be derived from other types. The derivation method can be either restriction or extension. The derived type inherits the content of the base type plus corresponding modifications. In addition to inheritance, a type definition can make references to other components. This feature allows a component to be defined once and used in many other structures.

The type declaration and definition mechanism in XML Schema is much more flexible and powerful than the DTD.

DTDs control the number of child elements in an element with the following symbols:

  • ? = zero or one

  • * = zero or more

  • + = one or more

  • (none) = exactly one
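The DTD symbols above map onto the minOccurs and maxOccurs attributes of XML Schema particles, which can also express ranges that a DTD cannot. The following is a minimal sketch using the standard JAXP validation API, not the Oracle XDK classes. The book schema and its element names are hypothetical.

```java
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.SAXException;

class OccurrenceSketch {
    // Rough DTD counterpart: <!ELEMENT book (title, subtitle?, author+, note*)>
    // The DTD form cannot cap author at three; maxOccurs='3' can.
    static final String XSD =
        "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
      + "<xs:element name='book'><xs:complexType><xs:sequence>"
      // (none): exactly one, the default for minOccurs and maxOccurs
      + "<xs:element name='title' type='xs:string'/>"
      // ? : zero or one
      + "<xs:element name='subtitle' type='xs:string' minOccurs='0'/>"
      // + : one or more, here limited to the range 1..3
      + "<xs:element name='author' type='xs:string' maxOccurs='3'/>"
      // * : zero or more
      + "<xs:element name='note' type='xs:string' minOccurs='0' maxOccurs='unbounded'/>"
      + "</xs:sequence></xs:complexType></xs:element>"
      + "</xs:schema>";

    // Returns true if the instance document conforms to the schema.
    static boolean isValid(String xml) throws Exception {
        Schema schema = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI)
            .newSchema(new StreamSource(new StringReader(XSD)));
        try {
            schema.newValidator().validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false;
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(isValid("<book><title>T</title><author>A</author></book>"));
        System.out.println(isValid("<book><title>T</title></book>")); // author missing
    }
}
```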

Identity Constraints

XML Schema extends the XML ID/IDREF mechanism with the declarations of unique, key, and keyref. They are part of the type definition and allow not only attributes but also element contents to serve as keys. Each constraint has a scope within which it holds, and comparison is based on values rather than lexical strings.
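As an illustration of an identity constraint that uses element content as the key, the following minimal sketch declares a key on a hypothetical catalogue schema and validates two documents with the standard JAXP validation API (not the Oracle XDK classes). All element names and the constraint name skuKey are assumptions.

```java
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.SAXException;

class KeyConstraintSketch {
    // The key is scoped to the catalogue element: within one catalogue,
    // every item must have a sku child and no two sku values may be equal.
    static final String XSD =
        "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
      + "<xs:element name='catalogue'>"
      + "<xs:complexType><xs:sequence>"
      + "<xs:element name='item' maxOccurs='unbounded'>"
      + "<xs:complexType><xs:sequence>"
      + "<xs:element name='sku' type='xs:integer'/>"
      + "</xs:sequence></xs:complexType>"
      + "</xs:element>"
      + "</xs:sequence></xs:complexType>"
      + "<xs:key name='skuKey'>"
      + "<xs:selector xpath='item'/>"   // which nodes the constraint covers
      + "<xs:field xpath='sku'/>"       // element content used as the key value
      + "</xs:key>"
      + "</xs:element>"
      + "</xs:schema>";

    // Returns true if the instance document conforms to the schema.
    static boolean isValid(String xml) throws Exception {
        Schema schema = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI)
            .newSchema(new StreamSource(new StringReader(XSD)));
        try {
            schema.newValidator().validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false;
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(isValid(
            "<catalogue><item><sku>7</sku></item><item><sku>8</sku></item></catalogue>"));
        System.out.println(isValid(
            "<catalogue><item><sku>7</sku></item><item><sku>7</sku></item></catalogue>"));
    }
}
```

Because sku is typed as xs:integer, a conforming processor compares key values rather than their spelling, so 7 and 007 should collide as well. That case is not exercised in this sketch.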

-
Import/Export Mechanisms (Schema Import, Inclusion and Modification)

All components of a schema need not be defined in a single schema file. XML Schema provides a mechanism for assembling multiple schemas. Import is used to integrate schemas of different namespaces, while inclusion is used to add components of the same namespace. Components can also be modified using redefinition when they are included.

With DTDs, you cannot use constructs defined in external schemas.
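Schema inclusion can be sketched as follows. This example writes two hypothetical schema files, types.xsd and main.xsd, to a temporary directory so that the include can be resolved relative to the including file. It uses the standard JAXP validation API rather than the Oracle XDK classes, and it assumes the JAXP implementation is allowed to read local schema files, which is the usual default.

```java
import java.io.StringReader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.SAXException;

class SchemaIncludeSketch {
    // Defines a reusable simple type; no target namespace.
    static final String TYPES =
        "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
      + "<xs:simpleType name='rating'>"
      + "<xs:restriction base='xs:int'>"
      + "<xs:minInclusive value='1'/><xs:maxInclusive value='5'/>"
      + "</xs:restriction>"
      + "</xs:simpleType>"
      + "</xs:schema>";

    // Includes types.xsd (same namespace) and uses the type defined there.
    static final String MAIN =
        "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
      + "<xs:include schemaLocation='types.xsd'/>"
      + "<xs:element name='review' type='rating'/>"
      + "</xs:schema>";

    // Writes both files side by side and compiles the including schema.
    static Schema build() throws Exception {
        Path dir = Files.createTempDirectory("xsd-include");
        Files.write(dir.resolve("types.xsd"), TYPES.getBytes(StandardCharsets.UTF_8));
        Path main = dir.resolve("main.xsd");
        Files.write(main, MAIN.getBytes(StandardCharsets.UTF_8));
        return SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI)
            .newSchema(main.toFile());
    }

    // Returns true if the instance document conforms to the assembled schema.
    static boolean isValid(String xml) throws Exception {
        try {
            build().newValidator().validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false;
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(isValid("<review>4</review>"));
        System.out.println(isValid("<review>9</review>")); // outside 1..5
    }
}
```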


XML Schema Processor for Java Features

XML Schema Processor for Java, which is a part of the Oracle XDK Java components, has the following features:

Supported Character Sets

XML Schema Processor for Java supports documents in the following encodings:

  • BIG

  • EBCDIC-CP-*

  • EUC-JP

  • EUC-KR

  • GB2312

  • ISO-2022-JP

  • ISO-2022-KR

  • ISO-8859-1 to ISO-8859-9

  • ISO-10646-UCS-2

  • ISO-10646-UCS-4

  • KOI8-R

  • Shift_JIS

  • US-ASCII

  • UTF-8

  • UTF-16

Requirements to Run XML Schema Processor for Java

To run XML Schema Processor for Java, you need the following:

  • Any operating system with Java 1.2 support

  • Java: JDK 1.2.x or higher.

Documentation for sample programs for Oracle XML Schema Processor for Java is located in the file xdk/demo/java/schema/README.

XML Schema Processor for Java Usage

As shown in Figure 5-1, Oracle's XML Schema Processor for Java performs two major tasks:

Figure 5-1 XML Schema Processor for Java Usage


XML Schema can be used to define a class of XML documents. The term instance document describes an XML document that conforms to a particular schema.

Although these instances and schemas need not exist specifically as "documents", they are commonly referred to as files. They may exist as any of the following:

When building the schema, the builder first compiles an internal schema object, and then calls the DOM Parser to parse the schema object into a corresponding DOM tree.

The validator works as a filter between the SAX Parser and your application for the instance document. The validator takes the SAX events of the instance document as input and validates them against the schema. If the validator detects invalid XML components, it sends an error message.
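The filter arrangement described above can be sketched with the standard JAXP ValidatorHandler, which likewise sits between a SAX parser and the application's handler. This is an analogy under stated assumptions, not the Oracle XDK validator itself. The one-element schema and the log format are hypothetical.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.XMLConstants;
import javax.xml.parsers.SAXParserFactory;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.ValidatorHandler;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

class ValidatingFilterSketch {
    static final String XSD =
        "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
      + "<xs:element name='count' type='xs:int'/>"
      + "</xs:schema>";

    // Parses xml and returns a log of application events and validation errors.
    static List<String> run(String xml) throws Exception {
        Schema schema = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI)
            .newSchema(new StreamSource(new StringReader(XSD)));
        final List<String> log = new ArrayList<>();

        // The validator receives SAX events and checks them against the schema.
        ValidatorHandler validator = schema.newValidatorHandler();
        validator.setErrorHandler(new DefaultHandler() {
            @Override
            public void error(SAXParseException e) {
                log.add("error:" + e.getMessage()); // record and keep going
            }
        });
        // The application handler sits downstream of the validator.
        validator.setContentHandler(new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes atts) {
                log.add("start:" + localName);
            }
        });

        // The SAX parser feeds its events into the validator.
        SAXParserFactory spf = SAXParserFactory.newInstance();
        spf.setNamespaceAware(true); // required by ValidatorHandler
        XMLReader reader = spf.newSAXParser().getXMLReader();
        reader.setContentHandler(validator);
        reader.parse(new InputSource(new StringReader(xml)));
        return log;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(run("<count>42</count>"));
        System.out.println(run("<count>abc</count>"));
    }
}
```

Because the error handler records errors instead of throwing, the application handler still receives the document's events even when the content is invalid.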

The output of the validator is:

Using the XML Schema API

The API of the XML Schema Processor for Java is simple. You can use either of the following:

  • Call setSchemaValidationMode() on the DOMParser, as shown in XSDSample.java.

  • Explicitly build the schema using XSDBuilder and set the schema for XMLParser as shown in XSDSetSchema.java.

If you do not explicitly set a compiled schema for validation using XSDBuilder, make sure that your instance document has the correct xsi:schemaLocation attribute pointing to the schema file. Otherwise, the validation will not be performed.

There is no clean-up call similar to xmlclean. If you need to release all memory and reset the state before validating a new XML document, terminate the context and start over.
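For comparison only, the pattern of building a schema explicitly and then validating several documents against it looks like this in the standard JAXP validation API. This sketch does not use XSDBuilder or the Oracle parser classes, and the reset() call shown belongs to JAXP's Validator, not to the Oracle API. The schema and document contents are hypothetical.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;
import org.xml.sax.SAXException;

class ReusedSchemaSketch {
    static final String XSD =
        "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
      + "<xs:element name='count' type='xs:int'/>"
      + "</xs:schema>";

    // Validates each document against one compiled schema.
    static List<Boolean> validateAll(List<String> docs) throws Exception {
        // Compile the schema once, independent of any instance document.
        Schema schema = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI)
            .newSchema(new StreamSource(new StringReader(XSD)));
        Validator validator = schema.newValidator();
        List<Boolean> results = new ArrayList<>();
        for (String doc : docs) {
            validator.reset(); // return the validator to its initial state
            try {
                validator.validate(new StreamSource(new StringReader(doc)));
                results.add(true);
            } catch (SAXException e) {
                results.add(false);
            }
        }
        return results;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(validateAll(Arrays.asList(
            "<count>1</count>", "<count>x</count>", "<count>2</count>")));
    }
}
```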

XML Schema Processor for Java Sample Programs

The sample XML Schema Processor for Java files provided in the directory /xdk/demo/java/schema are described in Table 5-2:

Table 5-2 XML Schema Sample Files

File | Description
cat.xsd | The sample XML Schema definition file that supplies input to the XSDSetSchema.java program. XML Schema Processor for Java uses the XML Schema specification from cat.xsd to validate the contents of catalogue.xml.
catalogue.xml | The sample XML file that is validated by XML Schema processor against the XML Schema definition file, cat.xsd, using the program, XSDSetSchema.java.
catalogue_e.xml | When XML Schema Processor for Java processes this sample XML file using XSDSample.java, it generates XML Schema errors.
DTD2Schema.java | This sample program converts a DTD (first argument) into an XML Schema and uses it to validate an XML file (second argument).
report.xml | The sample XML file that is validated by XML Schema Processor for Java against the XML Schema definition file, report.xsd, using the program, XSDSetSchema.java.
report.xsd | The sample XML Schema definition file that is input to the XSDSetSchema.java program. XML Schema Processor for Java uses the XML Schema specification from report.xsd to validate the contents of report.xml.
report_e.xml | When XML Schema Processor for Java processes this sample XML file using XSDSample.java, it generates XML Schema errors.
XSDSample.java | Sample XML Schema Processor for Java program.
XSDSetSchema.java | When this example is run with cat.xsd and catalogue.xml, XML Schema Processor for Java uses the XML Schema specification from cat.xsd to validate the contents of catalogue.xml.
XSDLax.java | This example uses SCHEMA_LAX_VALIDATION.
embeded_xsql.xsd | The input file for XSDLax.java.
embeded_xsql.xml | The output file from XSDLax.java.

To run the sample programs:

  1. Run make to generate the .class files.

  2. Add xmlparserv2.jar and the current directory to the CLASSPATH.

The following steps can be done in any order: