Copyright © 2009-2059 NASA ARC®, All Rights Reserved.
The XML specification was created primarily as a markup language for documents and data exchange messages by specifying formatting rules for structured text files. The XML standard does not say what the information content is or how the information content is to be structured. A set of naming and design rules (NDR) is necessary because many different development options are available for creating XML Schema. For example, as a direct result of the inherent flexibility within the XML Schema standard developed by the World Wide Web Consortium (W3C), a developer can create an XML Schema in one part of an organization that is incompatible with a Schema from another part of the organization.
The problem is further compounded when one considers an extended enterprise that spans organizational boundaries. While any one Schema may be perfectly valid with respect to the W3C Schema standard, no two organizations are assured of successful exchange of data using XML. By publishing a specification that allows precise semantics to be retained in the schema and/or documents, XML SchemaPlus aims to:
XML SchemaPlus (XSP) is motivated by two tangible benefits. First, XSP standardizes the development of XML Schema across organizations. Second, and more important, XSP provides an XML framework that is compatible with the W3C Web Ontology Language (OWL), providing a foundation for semantic interoperability and reasoning in systems and applications.
Typical XML documents are tree structures having nested elements. The hierarchical style is one in which elements can contain other elements. The semantics of the containment are almost always left unstated. This means that there has to be an implicit understanding of what the nesting in the XML document (and associated schema) means in every client use. Sometimes this meaning is only stated as annotations in the XML Schema. Irrespective, this style is the dominant style for most people using XML and comes about because of a desire to view an XML document as a container-oriented structure (a tree). This, however, does not often serve to represent faithfully an information model (or structure). An example of a hierarchical XML Structure for a simple X3D file describing a green square is shown below.
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE X3D PUBLIC "ISO//Web3D//DTD X3D 3.0//EN"
  "http://www.web3d.org/specifications/x3d-3.0.dtd">
<X3D profile='Interchange' version='3.0'
     xmlns:xsd='http://www.w3.org/2001/XMLSchema-instance'
     xsd:noNamespaceSchemaLocation='http://www.web3d.org/specifications/x3d-3.0.xsd'>
  <head>
    ...
  </head>
  <Scene>
    <Transform translation='-1.0 0.0 0.0'>
      <Shape>
        <Appearance>
          <Material diffuseColor='0.0 1.0 0.0'/>
        </Appearance>
        <Box/>
      </Shape>
    </Transform>
  </Scene>
</X3D>
In the fragment of an XML document shown below, the relationship between a nested element and its surrounding element is left implicit; it is not clear if the element name is the thing of interest, or the relationship with the containing element. Further ambiguity arises because 'Library' contains 'Elements' as does a 'Campaign'. Is the relationship of 'Elements' to 'Library' the same as that for 'Elements' to 'Campaign'? This lack of explicit nesting semantics complicates the use of XML for interoperability.
<DataFile>
  <Library>
    <Elements>
      <Element ObjectID="4:1" Name="Habitat Module" Nickname="Habitat - ECLSS" Mass="1234" />
      <Element ObjectID="4:2" Name="Small Pressurized Rover" Nickname="SPR" Mass="4321" />
      <Element ObjectID="4:3" Name="Solar Power Unit 4" Nickname="Solar Power Unit" Mass="123" />
    </Elements>
  </Library>
  <Campaign>
    <Locations>
      <CampaignLocation ObjectID="1:3" Name="Kennedy Space Center" Nickname="Earth" />
      <CampaignLocation ObjectID="1:4" Name="Moon - South Pole" Nickname="Lunar Outpost" />
    </Locations>
    <Elements>
      <CampaignElement ObjectID="1:5" Name="SPR" Nickname="SPR-1" LocationID="1:3" CurrentMass="1234">
        <Element ObjectID="4:2" Name="Small Pressurized Rover" Nickname="SPR" Mass="4321" />
      </CampaignElement>
      <CampaignElement ObjectID="1:6" Name="SPR" Nickname="SPR-1" LocationID="1:3" CurrentMass="6789">
        <Element ObjectID="4:2" Name="Small Pressurized Rover" Nickname="SPR" Mass="9876" />
      </CampaignElement>
    </Elements>
  </Campaign>
</DataFile>
The solution in XML SchemaPlus enables XML documents to express, either directly or through their Schema, the intended semantics of nested elements, thereby making collections and container structures explicit. Additionally, such documents separate the role of an element from its type.
An example of explicit semantics in XSP is shown below:
TBD - Example
An example is shown below:
TBD - Example
A common issue with XML documents is imprecise semantics of XML references. Commonly, cross-referencing is done using names or IDs. Such names are context-dependent, with no rigorous method for resolving them. IDs can be recognized as such by XML, but there are limitations.
An example is shown below:
TBD - Example
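As one illustration of the limitations of XML IDs, consider the Campaign fragment shown earlier. Its identifiers (for example "1:3") cannot be typed as xs:ID, because an xs:ID value must be an NCName, which may not start with a digit or contain a colon. The sketch below uses plain W3C XML Schema identity constraints (xs:key and xs:keyref) instead, so that each LocationID must match the ObjectID of some CampaignLocation. This is not XSP syntax, which this section does not define; the constraint names locationKey and elementLocationRef are hypothetical, and the wildcards simply let the other attributes and nested content of the fragment pass unchecked.

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="Campaign">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="Locations">
          <xs:complexType>
            <xs:sequence>
              <xs:element name="CampaignLocation" maxOccurs="unbounded">
                <xs:complexType>
                  <xs:attribute name="ObjectID" type="xs:string" use="required"/>
                  <!-- Name, Nickname, etc. are accepted without checking. -->
                  <xs:anyAttribute processContents="skip"/>
                </xs:complexType>
              </xs:element>
            </xs:sequence>
          </xs:complexType>
        </xs:element>
        <xs:element name="Elements">
          <xs:complexType>
            <xs:sequence>
              <xs:element name="CampaignElement" maxOccurs="unbounded">
                <xs:complexType>
                  <xs:sequence>
                    <!-- Nested content (the Element child) is not checked here. -->
                    <xs:any processContents="skip" minOccurs="0" maxOccurs="unbounded"/>
                  </xs:sequence>
                  <xs:attribute name="LocationID" type="xs:string" use="required"/>
                  <xs:anyAttribute processContents="skip"/>
                </xs:complexType>
              </xs:element>
            </xs:sequence>
          </xs:complexType>
        </xs:element>
      </xs:sequence>
    </xs:complexType>

    <!-- Every CampaignLocation must carry a unique ObjectID within this Campaign. -->
    <xs:key name="locationKey">
      <xs:selector xpath="Locations/CampaignLocation"/>
      <xs:field xpath="@ObjectID"/>
    </xs:key>

    <!-- Every CampaignElement/@LocationID must equal one of those ObjectIDs. -->
    <xs:keyref name="elementLocationRef" refer="locationKey">
      <xs:selector xpath="Elements/CampaignElement"/>
      <xs:field xpath="@LocationID"/>
    </xs:keyref>
  </xs:element>
</xs:schema>
```

Even with this constraint in place, the schema only states that the values must match. It does not say what the reference means (for example, "is located at"), which is the gap in reference semantics described above.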
Gleaning Resource Descriptions from Dialects of Languages (GRDDL) is a W3C specification for extracting RDF data out of XML and XHTML documents using explicitly associated transformation algorithms, typically written in XSLT. GRDDL uses the RDF property http://www.w3.org/2003/g/data-view#namespaceTransformation to link an XML Namespace to an interpreter that may be applied to any document which has its root element in that namespace, such that the output of the interpreter will be an RDF/XML form of some (or all) of the information content of the document. For instance, given the XML Namespace http://www.example.net/fooML, the following statement asserts that if an XML document has a root element in the http://www.example.net/fooML namespace, and it is run through the XSLT style sheet http://www.example.net/fooML2rdf.xsl then the result will be valid RDF/XML which is information that can be considered to have been expressed by the document.
<rdf:Description rdf:about="http://www.example.net/fooML">
  <namespaceTransformation xmlns="http://www.w3.org/2003/g/data-view#"
      rdf:resource="http://www.example.net/fooML2rdf.xsl"/>
</rdf:Description>
The semantics of the XML are not handled at all by GRDDL. Interpretation of the document is left to the XSLT script.
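To make this concrete, the following is a minimal sketch of what a transformation such as fooML2rdf.xsl might contain. The shape of a fooML document is not given here, so the input elements foo:doc and foo:title are hypothetical, as is the choice to map the title to the Dublin Core dc:title property.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:dc="http://purl.org/dc/elements/1.1/"
    xmlns:foo="http://www.example.net/fooML"
    exclude-result-prefixes="foo">

  <xsl:output method="xml" indent="yes"/>

  <!-- Hypothetical input shape: <foo:doc><foo:title>...</foo:title></foo:doc> -->
  <xsl:template match="/foo:doc">
    <rdf:RDF>
      <!-- An empty rdf:about is a relative reference to the base URI,
           so the statements are about the source document itself. -->
      <rdf:Description rdf:about="">
        <dc:title><xsl:value-of select="foo:title"/></dc:title>
      </rdf:Description>
    </rdf:RDF>
  </xsl:template>
</xsl:stylesheet>
```

All of the meaning assigned to foo:title lives in this stylesheet. Nothing in the fooML document or its schema states it, which is the limitation noted above.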
The design and specification of XML SchemaPlus (XSP) schemas should be as rigorous an activity as designing and developing code or designing database Schemas. As such, when creating an XML Schema the designer should be working within a development process and working to a set of design guidelines and coding standards. XSP Schemas should be reviewed for accuracy and compliance with guidelines and standards. Each of the principles in the list below should be viewed as a general requirement for all XSP Schemas that are intended to be used in a deployed system or application.
An XSP Schema is a contract published so that both the creator and the recipient of an XML document can verify that the instance document obeys the contract. When designing an XSP Schema, designers should constrain the values of all elements and attributes that the application uses and relies on to the set of values that the application can handle. A valid document should imply valid data, within the limits of what can be specified by the XSP Schema language. Note that the goal is to ensure that the schema defines the full state the application requires, not to restrict the document to holding only that state (see extensibility below).
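A minimal sketch of this principle, written in plain W3C XML Schema because XSP-specific syntax is not defined in this section. The attribute names come from the Element fragment shown earlier; the type names MassType and NicknameType and the chosen bounds are hypothetical.

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <!-- Hypothetical rule: a mass is a non-negative decimal. -->
  <xs:simpleType name="MassType">
    <xs:restriction base="xs:decimal">
      <xs:minInclusive value="0"/>
    </xs:restriction>
  </xs:simpleType>

  <!-- Hypothetical rule: a nickname is 1 to 64 characters long. -->
  <xs:simpleType name="NicknameType">
    <xs:restriction base="xs:string">
      <xs:minLength value="1"/>
      <xs:maxLength value="64"/>
    </xs:restriction>
  </xs:simpleType>

  <xs:element name="Element">
    <xs:complexType>
      <xs:attribute name="ObjectID" type="xs:string" use="required"/>
      <xs:attribute name="Name" type="xs:string" use="required"/>
      <xs:attribute name="Nickname" type="NicknameType"/>
      <xs:attribute name="Mass" type="MassType" use="required"/>
    </xs:complexType>
  </xs:element>
</xs:schema>
```

With these constraints, an instance such as Mass="-5" or Mass="heavy" is rejected at validation time rather than inside the application.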
Schemas should be designed to be extensible. That is, new elements and attributes can be inserted throughout the schema. Extension points can be made explicit if the schema is being designed to be enhanced.
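One way to make an extension point explicit in plain W3C XML Schema is a wildcard, sketched below. The target namespace http://www.example.org/xsp-demo is a placeholder, and this is offered as an illustration rather than as the mechanism XSP prescribes.

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           targetNamespace="http://www.example.org/xsp-demo"
           elementFormDefault="qualified">
  <xs:element name="Element">
    <xs:complexType>
      <xs:sequence>
        <!-- Explicit extension point: any number of elements from
             namespaces other than the target namespace. -->
        <xs:any namespace="##other" processContents="lax"
                minOccurs="0" maxOccurs="unbounded"/>
      </xs:sequence>
      <xs:attribute name="ObjectID" type="xs:string" use="required"/>
      <xs:attribute name="Name" type="xs:string"/>
      <!-- Attributes from other namespaces are also permitted. -->
      <xs:anyAttribute namespace="##other" processContents="lax"/>
    </xs:complexType>
  </xs:element>
</xs:schema>
```

Using processContents="lax" means extension content is validated when a schema for it is available and accepted otherwise, so older consumers keep working as new content appears.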
The design of the XSP schema must facilitate ease-of-maintenance and ease-of-migration by XSP Users.
XSP Schemas should not specify default values for attributes and element content, because default values may cause XSP Schema validation to produce erroneous results. XSP Schemas may also be used at run time to check that XML instance documents conform to their XSP Schema. Because validating an XML document against its XSP Schema is an expensive operation, it is common to omit schema validation in a deployed system. In some situations this may be unwise.
XSP Schemas should import and include other XSP Schema files rather than duplicating types and elements locally.
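In plain W3C XML Schema this principle corresponds to xs:include and xs:import, sketched below. The namespaces, the file names campaign-locations.xsd and common-types.xsd, and the type names cmn:ObjectIdType and cmn:MassType are all hypothetical and are assumed to be defined in the referenced files.

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           xmlns:cmn="http://www.example.org/common"
           targetNamespace="http://www.example.org/campaign"
           elementFormDefault="qualified">

  <!-- include: pulls in definitions that share this schema's target namespace. -->
  <xs:include schemaLocation="campaign-locations.xsd"/>

  <!-- import: makes components from a different namespace available. -->
  <xs:import namespace="http://www.example.org/common"
             schemaLocation="common-types.xsd"/>

  <!-- Reuses the imported types instead of redefining them locally. -->
  <xs:element name="CampaignElement">
    <xs:complexType>
      <xs:attribute name="ObjectID" type="cmn:ObjectIdType" use="required"/>
      <xs:attribute name="CurrentMass" type="cmn:MassType"/>
    </xs:complexType>
  </xs:element>
</xs:schema>
```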
Schema design balances prescriptiveness in any one usage scenario with that across the breadth of usage scenarios envisioned. Having precise, tight content models and data types is highly recommended practice for ensuring integration and interoperability.
XSP Schemas should be specified in such a way that types and elements can be leveraged by other XSP Schemas. The design of reusable XSP components should contain as many common features as possible. The range of possible uses of XML within any organization is extremely broad; the purpose of XSP is to enable organizations to share common semantics. To ensure reuse, every type defined in an XSP Schema that is the content type of an attribute or an element should be defined globally (i.e., at the top level in the Schema). Types that are defined globally can be reused in other XSP Schemas. Groups should also be used to name collections of elements and attributes that can be reused. XSP Schema provides GROUP for elements and AttributeGroup for Attributes.
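The sketch below shows globally defined types and named groups in plain W3C XML Schema, where the corresponding constructs are spelled xs:group and xs:attributeGroup. The names IdentityAttributes, ElementList, ElementType, and ElementListType are hypothetical; the element and attribute names follow the Library fragment shown earlier.

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <!-- Reusable set of attributes shared by several element types. -->
  <xs:attributeGroup name="IdentityAttributes">
    <xs:attribute name="ObjectID" type="xs:string" use="required"/>
    <xs:attribute name="Name" type="xs:string"/>
    <xs:attribute name="Nickname" type="xs:string"/>
  </xs:attributeGroup>

  <!-- Global type: can be referenced by name from other schemas. -->
  <xs:complexType name="ElementType">
    <xs:attributeGroup ref="IdentityAttributes"/>
    <xs:attribute name="Mass" type="xs:decimal"/>
  </xs:complexType>

  <xs:element name="Element" type="ElementType"/>

  <!-- Reusable content model: zero or more Element children. -->
  <xs:group name="ElementList">
    <xs:sequence>
      <xs:element ref="Element" minOccurs="0" maxOccurs="unbounded"/>
    </xs:sequence>
  </xs:group>

  <xs:complexType name="ElementListType">
    <xs:group ref="ElementList"/>
  </xs:complexType>

  <xs:complexType name="LibraryType">
    <xs:sequence>
      <xs:element name="Elements" type="ElementListType"/>
    </xs:sequence>
  </xs:complexType>

  <xs:element name="Library" type="LibraryType"/>
</xs:schema>
```

Because every type here is named and declared at the top level, another schema can reference ElementType or ElementListType directly instead of copying their definitions.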
An XSP Schema should define, for one or more target XML documents, each and every element and attribute that is understood by a solution when processing target documents. For example, if an application uses the value of an attribute or element then a definition for that item should be included in the XSP Schema.
XML documents created from XSP Schemas should be readily mappable or translatable to OWL/RDF models for semantic processing.
XSP Schemas should be clear, consistent and unambiguous. They should contain human readable documentation and, where appropriate, links to requirements or design documents.
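In plain W3C XML Schema, human-readable documentation and links to external documents can be attached with xs:annotation and xs:documentation, as sketched below. The requirement URI and the wording of the descriptions are hypothetical.

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="CampaignLocation">
    <xs:annotation>
      <!-- The source attribute links to a (hypothetical) requirement. -->
      <xs:documentation xml:lang="en"
          source="http://www.example.org/requirements#REQ-LOC-001">
        A place at which campaign elements can be positioned.
      </xs:documentation>
    </xs:annotation>
    <xs:complexType>
      <xs:attribute name="ObjectID" type="xs:string" use="required">
        <xs:annotation>
          <xs:documentation xml:lang="en">
            Identifier that other elements use to refer to this location.
          </xs:documentation>
        </xs:annotation>
      </xs:attribute>
      <xs:attribute name="Name" type="xs:string"/>
    </xs:complexType>
  </xs:element>
</xs:schema>
```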