Comparative Examples of XML Schema and Query Languages
Angela Bonifati
Dipartimento di Elettronica e Informazione, Politecnico di Milano
Piazza Leonardo da Vinci 32, I-20133 Milano, Italy
Email: bonifati@elet.polimi.it
Dongwon Lee
Department of Computer Science, University of California at Los Angeles
Los Angeles, CA 90095, USA
Email: dongwon@cs.ucla.edu
1 Introduction
In this article, we present comparative examples of six XML schema languages
(XML-DTD [4], XML-Schema [3, 26], RELAX [20, 14], SOX [8],
Schematron [15, 19], DSD [16, 17])
and six XML query languages
(Lorel [2, 1, 12], XML-QL [9, 10], XML-GL [5], XSLT [7, 25, 13],
XQL [24, 22, 23, 21, 11, 25], and Quilt [6]).
Working through such comparative examples for a real-life
scenario greatly helped us to understand the languages and to appreciate the
differences among them.
The underlying case study is that of a car dealer office handling documents
from different auto dealers and brokers, originally
proposed by David Maier in the position paper
``Database Desiderata for an XML Query
Language'' [18]. We have slightly modified these documents to fit our
context. In particular, where needed, we have added further XML items
to them in order to exercise all the constructs of the languages.
2 Six Schema Languages
The running scenario for schema languages is based on a car
dealer, CarDealer, which handles documents about several Manufacturer and Vehicle elements.
2.1 Schema 1: Content Model & Attribute
The first example defines the CarDealer schema, whose content
is a sequence of Address, Manufacturer, and Vehicle elements,
subject to the following constraints:
- Each CarDealer has a mandatory attribute name, whose valid values are ``A'', ``B'', or ``C''.
- Address is optional and has a string type.
- CarDealer can have 0 or many Manufacturer.
- CarDealer can have 1 or many Vehicle.
- Manufacturer has sub-elements MnName, Year,
Model, and an optional Rank.
- Vehicle has sub-elements Vendor, Make,
Model, and an optional Color.
- MnName, Model, Rank, Vendor, Make, and Color
are of string type while Year is of integer type.
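As a rough illustration of what these constraints demand of an instance document, the following Python sketch checks them procedurally with the standard library. It is not one of the six schema languages; the function name check_car_dealer, the table CONTENT_MODELS, and the sample document are assumptions made for this example, and the content models mirror those of the DTD in Section 2.1.1.

```python
import re
import xml.etree.ElementTree as ET

# Content models written as regular expressions over child tag names
# (each tag name is followed by one space), mirroring the DTD of 2.1.1.
CONTENT_MODELS = {
    "CarDealer": r"(Address )?(Manufacturer )*(Vehicle )+",
    "Manufacturer": r"MnName Year Model (Rank )?",
    "Vehicle": r"Vendor Make Model (Color )?",
}

def check_car_dealer(root):
    """Return a list of violation messages (empty if the document conforms)."""
    if root.tag != "CarDealer":
        return ["root element must be CarDealer"]
    errors = []
    if root.get("name") not in ("A", "B", "C"):
        errors.append("CarDealer/@name must be A, B, or C")
    for elem in root.iter():
        model = CONTENT_MODELS.get(elem.tag)
        if model is not None:
            children = "".join(child.tag + " " for child in elem)
            if not re.fullmatch(model, children):
                errors.append(f"bad content in {elem.tag}: {children.strip()}")
        elif len(elem) > 0:
            errors.append(f"{elem.tag} must contain text only")
        if elem.tag == "Year" and not re.fullmatch(r"\s*[+-]?\d+\s*", elem.text or ""):
            errors.append("Year must be an integer")
    return errors

# Hypothetical instance document, invented for this example.
doc = ET.fromstring(
    '<CarDealer name="A">'
    "<Manufacturer><MnName>Mercury</MnName><Year>1999</Year>"
    "<Model>Sable LT</Model></Manufacturer>"
    "<Vehicle><Vendor>Scott Thomason</Vendor><Make>Mercury</Make>"
    "<Model>Sable LT</Model><Color>Blue</Color></Vehicle>"
    "</CarDealer>"
)
print(check_car_dealer(doc))  # prints []
```

Each schema language below states the same requirements declaratively; the sketch only makes explicit which properties a validator has to test.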
2.1.1 DTD
<!ELEMENT CarDealer (Address?, Manufacturer*, Vehicle+)>
<!ATTLIST CarDealer name (A|B|C) #REQUIRED>
<!ELEMENT Address (#PCDATA)>
<!ELEMENT Manufacturer (MnName, Year, Model, Rank?)>
<!ELEMENT Vehicle (Vendor, Make, Model, Color?)>
<!ELEMENT MnName (#PCDATA)>
<!ELEMENT Year (#PCDATA)>
<!ELEMENT Model (#PCDATA)>
<!ELEMENT Vendor (#PCDATA)>
<!ELEMENT Make (#PCDATA)>
<!ELEMENT Rank (#PCDATA)>
<!ELEMENT Color (#PCDATA)>
2.1.2 XML-Schema
<element name='CarDealer'>
<complexType content='elementOnly'>
<element ref='t:Address' minOccurs='0' maxOccurs='1'/>
<element ref='t:Manufacturer' minOccurs='0' maxOccurs='unbounded'/>
<element ref='t:Vehicle' maxOccurs='unbounded'/>
<attribute name='name' use='required'>
<simpleType base='string' >
<enumeration value='A'/>
<enumeration value='B'/>
<enumeration value='C'/>
</simpleType>
</attribute>
</complexType>
</element>
<element name='Address'>
<complexType content='mixed'> </complexType>
</element>
<element name='Manufacturer'>
<complexType content='elementOnly'>
<element ref='t:MnName'/>
<element ref='t:Year'/>
<element ref='t:Model'/>
<element ref='t:Rank' minOccurs='0' maxOccurs='1'/>
</complexType>
</element>
<element name='Vehicle'>
<complexType content='elementOnly'>
<element ref='t:Vendor'/>
<element ref='t:Make'/>
<element ref='t:Model'/>
<element ref='t:Color' minOccurs='0' maxOccurs='1'/>
</complexType>
</element>
<element name='MnName'>
<complexType content='mixed'> </complexType>
</element>
<element name='Year' type='integer'/>
<element name='Model'>
<complexType content='mixed'> </complexType>
</element>
<element name='Vendor'>
<complexType content='mixed'> </complexType>
</element>
<element name='Make'>
<complexType content='mixed'> </complexType>
</element>
<element name='Rank'>
<complexType content='mixed'> </complexType>
</element>
<element name='Color'>
<complexType content='mixed'> </complexType>
</element>
2.1.3 RELAX
<tag name="CarDealer">
<attribute name="name" required="yes" type="NMTOKEN">
<enumeration value="A"/>
<enumeration value="B"/>
<enumeration value="C"/>
</attribute>
</tag>
<elementRule role="CarDealer">
<sequence>
<ref label="Address" occurs="?"/>
<ref label="Manufacturer" occurs="*"/>
<ref label="Vehicle" occurs="+"/>
</sequence>
</elementRule>
<tag name="Address"/>
<elementRule role="Address" type="string"/>
<tag name="Manufacturer"/>
<elementRule role="Manufacturer">
<sequence>
<ref label="MnName"/>
<ref label="Year"/>
<ref label="Model"/>
<ref label="Rank" occurs="?"/>
</sequence>
</elementRule>
<tag name="Vehicle"/>
<elementRule role="Vehicle">
<sequence>
<ref label="Vendor"/>
<ref label="Make"/>
<ref label="Model"/>
<ref label="Color" occurs="?"/>
</sequence>
</elementRule>
<tag name="MnName"/>
<elementRule role="MnName" type="string"/>
<tag name="Year"/>
<elementRule role="Year" type="int"/>
<tag name="Model"/>
<elementRule role="Model" type="string"/>
<tag name="Rank"/>
<elementRule role="Rank" type="string"/>
<tag name="Vendor"/>
<elementRule role="Vendor" type="string"/>
<tag name="Make"/>
<elementRule role="Make" type="string"/>
<tag name="Color"/>
<elementRule role="Color" type="string"/>
2.1.4 SOX
<elementtype name="CarDealer">
<model>
<sequence>
<element name="Address" type="string" occurs="0,1"/>
<element name="ManufacturerTYPE" type="Manufacturer" occurs="0,*"/>
<element name="VehicleTYPE" type="Vehicle" occurs="1,*"/>
</sequence>
<attdef name="name" datatype="ThreeVendors"> <required/> </attdef>
</model>
</elementtype>
<datatype name="ThreeVendors">
<enumeration datatype="NMTOKEN">
<option> A </option>
<option> B </option>
<option> C </option>
</enumeration>
</datatype>
<elementtype name="Manufacturer">
<model>
<sequence>
<element name="MnName" type="string"/>
<element name="Year" type="int"/>
<element name="Model" type="string"/>
<element name="Rank" type="string" occurs="0,1"/>
</sequence>
</model>
</elementtype>
<elementtype name="Vehicle">
<model>
<sequence>
<element name="Vendor" type="string"/>
<element name="Make" type="string"/>
<element name="Model" type="string"/>
<element name="Color" type="string" occurs="0,1"/>
</sequence>
</model>
</elementtype>
2.1.5 Schematron
<rule context="CarDealer">
<assert test="count(*) = count(Address | Manufacturer | Vehicle)">
Unexpected element(s) are found.</assert>
<assert test="count(@*) = count(@name)">Unexpected attribute(s) are found.</assert>
<assert test="Vehicle">CarDealer element should contain at least 1 Vehicle element.</assert>
<assert test="@name">CarDealer element should contain an attribute called name.</assert>
</rule>
<rule context="@name">
<assert test=". = 'A' or . = 'B' or . = 'C'">Attribute name should be A, B, or C.</assert>
</rule>
<rule context="Manufacturer">
<assert test="count(*) = count(MnName | Year | Model | Rank)">
Unexpected element(s) are found.</assert>
<assert test="count(@*) = 0">Unexpected attribute(s) are found.</assert>
<assert test="MnName">Manufacturer element should contain at least 1 MnName element.</assert>
<assert test="Year">Manufacturer element should contain at least 1 Year element.</assert>
<assert test="Model">Manufacturer element should contain at least 1 Model element.</assert>
</rule>
<rule context="Year">
<assert test="floor(.) = number(.)"> Year is integer type.</assert>
</rule>
<rule context="Vehicle">
<assert test="count(*) = count(Vendor | Make | Model | Color)">
Unexpected element(s) are found.</assert>
<assert test="count(@*) = 0">Unexpected attribute(s) are found.</assert>
<assert test="Vendor">Vehicle element should contain at least 1 Vendor element.</assert>
<assert test="Make">Vehicle element should contain at least 1 Make element.</assert>
<assert test="Model">Vehicle element should contain at least 1 Model element.</assert>
</rule>
2.1.6 DSD
<ElementDef ID="CarDealer">
<AttributeDecl Name="name" Optional="no">
<OneOf>
<String value="A"/>
<String value="B"/>
<String value="C"/>
</OneOf>
</AttributeDecl>
<Sequence>
<Optional>
<Element Name="Address">
<StringType IDRef="str"/>
</Element>
</Optional>
<ZeroOrMore> <Element IDRef="Manufacturer"/> </ZeroOrMore>
<OneOrMore> <Element IDRef="Vehicle"/> </OneOrMore>
</Sequence>
</ElementDef>
<ElementDef ID="Manufacturer">
<Sequence>
<Element Name="MnName"> <StringType IDRef="str"/> </Element>
<Element Name="Year"> <StringType IDRef="int"/> </Element>
<Element Name="Model"> <StringType IDRef="str"/> </Element>
<Optional> <Element Name="Rank"> <StringType IDRef="str"/> </Element> </Optional>
</Sequence>
</ElementDef>
<ElementDef ID="Vehicle">
<Sequence>
<Element Name="Vendor"> <StringType IDRef="str"/> </Element>
<Element Name="Make"> <StringType IDRef="str"/> </Element>
<Element Name="Model"> <StringType IDRef="str"/> </Element>
<Optional> <Element Name="Color"> <StringType IDRef="str"/> </Element> </Optional>
</Sequence>
</ElementDef>
<StringTypeDef ID="str">
<ZeroOrMore> <AnyChar/> </ZeroOrMore>
</StringTypeDef>
<StringTypeDef ID="int">
<ZeroOrMore> <CharRange Start="0" End="9"/> </ZeroOrMore>
</StringTypeDef>
2.1.7 Comments
- DTD cannot express datatype constraints other than ``string'' types; in the example, Year can therefore only be
declared as #PCDATA rather than as an integer.
- XML-Schema fully covers the given content model. Within the CarDealer element,
sub-elements are defined through references, and their definitions are postponed. The string type is expressed by means of
the ``mixed'' content type.
- The given content model can also be expressed in RELAX.
- SOX defines three <elementtype> declarations using sequence models. Valid values for the attribute name of CarDealer are defined separately by enumeration.
- In Schematron, the structural constraints are defined by asserting the existence of certain elements or attributes.
- DSD uses <ElementDef> to define the three main elements CarDealer, Manufacturer,
and Vehicle. Since DSD does not support built-in types, basic string and integer types are simulated
through <StringTypeDef> and regular expression constructs.
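To make the last point concrete, the two simulated DSD types of Section 2.1.6 can be read as ordinary regular expressions. The Python sketch below is only an analogy under that reading; the names STR_TYPE, INT_TYPE, and has_type are invented for the example and are not DSD constructs.

```python
import re

# "str": zero or more arbitrary characters (ZeroOrMore over AnyChar).
STR_TYPE = re.compile(r".*", re.DOTALL)
# "int": zero or more digits (ZeroOrMore over CharRange 0..9).
INT_TYPE = re.compile(r"[0-9]*")

def has_type(pattern, value):
    """True if the whole string belongs to the simulated type."""
    return pattern.fullmatch(value) is not None

print(has_type(INT_TYPE, "1999"))  # prints True
print(has_type(INT_TYPE, "19x9"))  # prints False
```

Read this way, a pattern built on ZeroOrMore also accepts the empty string, so a stricter integer type would have to require at least one digit.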
2.2 Schema 2: Being Unique and Key
We now add to Schema 1 the following uniqueness and key constraints:
- Each Manufacturer has a unique MnName sub-element value.
- Each Vehicle has a unique VIN attribute value.
- Every Make value of a Vehicle element must appear as the MnName value
of some Manufacturer element.
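The three constraints amount to two uniqueness checks and one foreign-key check. A minimal Python sketch of these checks follows; the function name check_keys and the message texts are assumptions of this example, and the code operates on a parsed CarDealer element rather than through any of the schema languages.

```python
import xml.etree.ElementTree as ET
from collections import Counter

def check_keys(root):
    """Check uniqueness of MnName and VIN, and that every Make is a known MnName."""
    errors = []
    names = [m.findtext("MnName", "").strip() for m in root.findall("Manufacturer")]
    vins = [v.get("VIN") for v in root.findall("Vehicle")]
    for name, n in Counter(names).items():
        if n > 1:
            errors.append(f"duplicate MnName: {name}")
    for vin, n in Counter(vins).items():
        if vin is not None and n > 1:
            errors.append(f"duplicate VIN: {vin}")
    known = set(names)  # the key that Make values must refer to
    for v in root.findall("Vehicle"):
        make = v.findtext("Make", "").strip()
        if make not in known:
            errors.append(f"Make without matching MnName: {make}")
    return errors
```

The third loop is the part that corresponds to a key reference: it needs the set of all MnName values before any single Vehicle can be judged, which is why purely local content models cannot express it.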
For the sake of brevity, only relevant portions are shown below.
2.2.1 DTD
<!ELEMENT Manufacturer (MnName, Year, Model, Rank?)>
<!ELEMENT Vehicle (Vendor, Make, Model, Color?)>
<!ATTLIST Vehicle VIN ID #REQUIRED>
2.2.2 XML-Schema
<element name='Manufacturer'>
<complexType content='elementOnly'>
<element ref='t:MnName'/>
<element ref='t:Year'/>
<element ref='t:Model'/>
<element ref='t:Rank' minOccurs='0' maxOccurs='1'/>
</complexType>
</element>
<unique>
<selector>CarDealer/Manufacturer</selector>
<field>MnName</field>
</unique>
<element name='Vehicle'>
<complexType content='elementOnly'>
<element ref='t:Vendor'/>
<element ref='t:Make'/>
<element ref='t:Model'/>
<element ref='t:Color' minOccurs='0' maxOccurs='1'/>
<attribute name='VIN' type='string'> </attribute>
</complexType>
</element>
<unique>
<selector>CarDealer/Vehicle</selector>
<field>@VIN</field>
</unique>
<key name='HERE'>
<selector>CarDealer/Manufacturer</selector>
<field>MnName</field>
</key>
<keyref refer='HERE'>
<selector>CarDealer/Vehicle</selector>
<field>Make</field>
</keyref>
2.2.3 RELAX
<tag name="Vehicle">
<attribute name="VIN" type="ID"/>
</tag>
<elementRule role="Vehicle">
<sequence>
<ref label="Vendor"/>
<ref label="Make"/>
<ref label="Model"/>
<ref label="Color" occurs="?"/>
</sequence>
</elementRule>
2.2.4 SOX
<elementtype name="Vehicle">
<model>
<sequence>
<element name="Vendor" type="string"/>
<element name="Make" type="string"/>
<element name="Model" type="string"/>
<element name="Color" type="string" occurs="0,1"/>
</sequence>
<attdef name="VIN" datatype="ID"> </attdef>
</model>
</elementtype>
2.2.5 Schematron
<rule context="Manufacturer">
<assert test="MnName and count(../Manufacturer[MnName = current()/MnName]) = 1"> MnName element must be unique. </assert>
</rule>
<rule context="Vehicle">
<assert test="@VIN and count(../Vehicle[@VIN = current()/@VIN]) = 1"> VIN attribute must be unique. </assert>
</rule>
2.2.6 DSD
<ElementDef ID="Vehicle">
<AttributeDecl Name="VIN" IDType="ID"/>
<Sequence>
<Element Name="Vendor"> <StringType IDRef="str"/> </Element>
<Element Name="Make">
<StringType IDRef="str"/>
<PointsTo>
<Context> <Element Name="MnName"/> </Context>
</PointsTo>
</Element>
<Element Name="Model"> <StringType IDRef="str"/> </Element>
<Optional> <Element Name="Color"> <StringType IDRef="str"/> </Element> </Optional>
</Sequence>
</ElementDef>
2.2.7 Comments
- DTD can express the given constraints only partially: it can only describe the uniqueness of the
VIN attribute of Vehicle. The other values must be declared simply as strings, leaving an application program
responsible for enforcing the constraints. In addition, foreign key constraints on element content
(as opposed to attributes) cannot be expressed in DTD.
- Thanks to its expressive <unique> and <key> constructs,
XML-Schema can fully implement the given constraints.
- RELAX can express only the unique attribute VIN for Vehicle.
- SOX, too, can express only the unique attribute VIN for Vehicle.
- In Schematron, the unique element MnName and attribute VIN are rendered by using the
count() function.
- DSD supports the unique VIN attribute and the foreign key constraint
by applying the <PointsTo> construct.
2.3 Schema 3: Context Sensitive Definition
In this example we add context-sensitive rules to Schema 2. Suppose we need two
Model elements whose definitions differ depending on the context:
- A Model element contained in a Manufacturer element has AnotherModelName sub-elements, but
- a Model element contained in a Vehicle element is of string type.
For instance, the following XML snippet illustrates the different usage of Model elements.
<Manufacturer>
<MnName> Mercury </MnName>
<Year> 1999 </Year>
<Model>
<AnotherModelName> Sable LT </AnotherModelName>
</Model>
</Manufacturer>
<Vehicle>
<Vendor> Scott Thomason </Vendor>
<Make> Mercury </Make>
<Model> Sable LT </Model>
<Color> Blue </Color>
</Vehicle>
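The rule illustrated by this snippet depends on the parent of each Model element. A small Python sketch of such a parent-dependent check is given below; the function name check_model_contexts and its messages are assumptions of this example, not constructs of any schema language.

```python
import xml.etree.ElementTree as ET

def check_model_contexts(root):
    """Validate each Model element according to the element that contains it."""
    errors = []
    for parent in root.iter():
        for model in parent.findall("Model"):
            child_tags = [child.tag for child in model]
            if parent.tag == "Manufacturer":
                # Only AnotherModelName sub-elements, and at least one of them.
                if not child_tags or set(child_tags) != {"AnotherModelName"}:
                    errors.append("Manufacturer/Model must contain AnotherModelName")
            elif parent.tag == "Vehicle" and child_tags:
                errors.append("Vehicle/Model must contain text only")
    return errors
```

The same element name is thus validated by two different rules, selected by looking one level up in the tree; this is the capability that the languages below either provide or lack.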
2.3.1 DTD
Cannot be expressed.
2.3.2 XML-Schema
Cannot be expressed.
2.3.3 RELAX
<elementRule role='Model' label='ManufacturerModel'>
<mixed>
<element name='AnotherModelName'/>
</mixed>
</elementRule>
<elementRule role='Model' label='VehicleModel'>
<mixed>
<empty/>
</mixed>
</elementRule>
<tag name='Model'/>
<elementRule role='Manufacturer'>
<ref label='ManufacturerModel'/>
</elementRule>
<elementRule role='Vehicle'>
<ref label='VehicleModel'/>
</elementRule>
2.3.4 SOX
Cannot be expressed.
2.3.5 Schematron
<rule context='Vehicle/Model'>
<report test='AnotherModelName'>
Element AnotherModelName cannot appear inside of Model of Vehicle.
</report>
</rule>
2.3.6 DSD
<ElementDef ID="Model">
<If>
<Context>
<Element Name="Manufacturer"/>
</Context>
<Then>
<Element Name="AnotherModelName"> <StringType IDRef="str"/> </Element>
</Then>
</If>
<StringType IDRef="str"/>
</ElementDef>
2.3.7 Comments
- DTD cannot express context-sensitive constraints. The only way to simulate a ``similar'' behavior is
to define two differently named Model elements, one per context, as in the following:
<!ELEMENT Manufacturer (MnName, Year, Model_A, Rank?)>
<!ELEMENT Vehicle (Vendor, Make, Model_B, Color?)>
<!ELEMENT Model_A (AnotherModelName)>
<!ELEMENT Model_B (#PCDATA)>
<!ELEMENT AnotherModelName (#PCDATA)>
- Likewise, XML-Schema cannot express context-sensitive constraints.
- In RELAX, one can define two Model roles with different labels -- ManufacturerModel and VehicleModel.
- SOX does not provide context-sensitive constraints.
- Schematron obtains the constraint by reporting an error whenever AnotherModelName appears within
the Model element of a Vehicle.
- In DSD, if Model occurs within Manufacturer, then it contains a sub-element AnotherModelName
of string type. Otherwise, Model is simply a string-typed element without additional embedded elements.
3 Six Query Languages
In this section, we present comparative examples of six XML query languages
on the basis of some query examples. The first four
queries were originally proposed by David
Maier in the position paper cited above [18];
a preliminary version of the XML-QL examples was
originally presented by Peter Fankhauser in a message to the XML
Query language mailing list (message of Dec 22, 1998).
We believe that these queries are a good benchmark for testing the expressive power
of XML query languages.
The original documents did not contain XML attributes and IDREFs,
which we have incorporated for the sake of completeness. In
addition, we have augmented the set of Maier's queries in order to show
the capabilities of the languages on all XML components,
including XML attributes.
The original
manufacturer documents list the manufacturer's name, year, and
models with their names, front rating, side rating, and rank;
we have added an ID attribute to each manufacturer, as well as a list
of its competitors, cross-referenced through an IDREFS attribute.
The vehicle documents list the vendor, make, model, year, color, and price.
We consider the following instances of XML data:
<manufacturer ID="67878">
<mn_name>Mercury</mn_name>
<year>1999</year>
<competitors refs="67898 56676 89898"/>
<model> <mo_name>Sable LT</mo_name>
<front_rating>3.84</front_rating>
<side_rating>2.14</side_rating>
<rank>9</rank>
</model>
....
</manufacturer>
<vehicle>
<vendor>Scott Thomason</vendor>
<make>Mercury</make>
<model>Sable LT</model>
<year>1999</year>
<color>metallic blue</color>
....
<price>26800</price>
</vehicle>
3.1 Query 1: Selection and Extraction
We want to select and extract <manufacturer> elements
where some <model> has a <rank> less than or equal to 10.
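For readers who prefer a procedural reading, the intent of this query can be sketched in Python with the standard library. The sketch is not one of the six languages under comparison; the wrapper element nhsc, the second manufacturer, and the function name select_manufacturers are invented for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical input: the <nhsc> wrapper and the manufacturer "Acme"
# are assumptions made for this illustration.
SAMPLE = """
<nhsc>
  <manufacturer ID="67878">
    <mn_name>Mercury</mn_name>
    <model><mo_name>Sable LT</mo_name><rank>9</rank></model>
  </manufacturer>
  <manufacturer ID="67898">
    <mn_name>Acme</mn_name>
    <model><mo_name>Roadster</mo_name><rank>15</rank></model>
  </manufacturer>
</nhsc>
"""

def select_manufacturers(root, max_rank=10):
    """Return the manufacturer elements having some model ranked <= max_rank."""
    return [
        m for m in root.iter("manufacturer")
        if any(int(r.text) <= max_rank for r in m.findall("model/rank"))
    ]

selected = select_manufacturers(ET.fromstring(SAMPLE))
print([m.findtext("mn_name") for m in selected])  # prints ['Mercury']
```

The selected elements are returned whole, with all their sub-elements, which is the "extraction" half of the query.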
3.1.1 Lorel
select M
from nhsc.manufacturer M
where M.model.rank <=10
3.1.2 XML-QL
WHERE <manufacturer>
<model>
<rank>$r</rank>
</model>
</manufacturer> ELEMENT_AS $m IN
www.nhsc\manufacturers.xml,
$r<=10
CONSTRUCT $m
3.1.3 XML-GL
See Figure 1:
Figure 1: Query 1
3.1.4 XSLT
<xsl:template match="/">
<xsl:for-each select="manufacturer[model/rank &lt;= 10]">
<xsl:copy-of select="." />
</xsl:for-each>
</xsl:template>
3.1.5 XQL
manufacturer[model/rank<=10]
3.1.6 Quilt
FOR $m IN document("manufacturer.xml")//manufacturer,
$r IN $m/model/rank
WHERE $r LEQ 10
RETURN $m
3.1.7 Comments
- In Lorel, the result is a collection of manufacturer object
identifiers.
- In XML-QL, the query applies to the XML
document www.nhsc\manufacturers.xml. It matches every
<manufacturer> in the XML document that has at least one
<model> whose <rank> is less than or equal to 10. The result is
presented as a piece of XML document.
- In XML-GL, the query applies to the same XML document;
it extracts all the occurrences of the
manufacturer elements satisfying the conditions stated in the
left-hand side (LHS). The elements used in the right-hand side (RHS) to construct
the result are exactly those manufacturer objects retrieved in the LHS,
with all their sub-elements as they appear in the input XML documents (but
without the elements pointed to by IDREF links). The result is a
new XML document enclosed within the standard element result.
- In XSLT, the rule applies to the root node and
the xsl:for-each directive is
instantiated for each manufacturer node having at least one
model whose rank is less than or equal to 10. Through the
xsl:copy-of instruction, a node set is included in the result tree
for each selected manufacturer element.
- XQL does the job quite concisely, using a navigation pattern
with a filter condition on the <rank>. The filter is
existentially quantified. The result is conventionally enclosed within
a standard element named XQL:result.
- Quilt implements the query by using two binding variables, one
($m) on the manufacturers to be returned, and the other ($r) on the
rank, which is checked in the predicate. The result is an ordered list
of tuples of bound variables, which is calculated as a Cartesian product
of the node lists of each expression in the FOR clause.
- Summary: All languages cover the proposed example.
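The difference between an existentially quantified filter (as in XQL) and a tuple-based evaluation (as described for Quilt) can be made visible with a small Python sketch. It follows the tuple-based reading literally and is only an analogy: the function names are invented, and whether an actual Quilt implementation would eliminate duplicate results is not addressed here.

```python
import xml.etree.ElementTree as ET

def tuple_bindings(root, max_rank=10):
    """One result per ($m, $r) binding that passes the WHERE test."""
    return [
        m
        for m in root.iter("manufacturer")   # FOR $m IN ...//manufacturer,
        for r in m.findall("model/rank")     #     $r IN $m/model/rank
        if int(r.text) <= max_rank           # WHERE $r LEQ 10
    ]

def existential_filter(root, max_rank=10):
    """One result per manufacturer with at least one qualifying rank."""
    return [
        m for m in root.iter("manufacturer")
        if any(int(r.text) <= max_rank for r in m.findall("model/rank"))
    ]
```

Under this reading, a manufacturer with two models ranked within the limit is produced twice by tuple_bindings but only once by existential_filter.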
3.2 Query 2: Reduction
From the <manufacturer> elements, we want to drop those
<model>
sub-elements whose <rank> is greater than 10. We also want to elide
the
<front_rating> and <side_rating> elements from the remaining
models.
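One way to read this query is as the construction of a new manufacturer element that copies only what should remain. The Python sketch below follows that reading; the helper names rank_of and reduce_manufacturer are assumptions of this example, and models without a numeric rank are simply skipped.

```python
import xml.etree.ElementTree as ET

def rank_of(model):
    """Return the rank of a model as an int, or None if absent or non-numeric."""
    text = (model.findtext("rank") or "").strip()
    return int(text) if text.isdigit() else None

def reduce_manufacturer(m, max_rank=10):
    """Build a new <manufacturer> keeping name, year, and well-ranked models."""
    out = ET.Element("manufacturer")
    for tag in ("mn_name", "year"):
        src = m.find(tag)
        if src is not None:
            ET.SubElement(out, tag).text = src.text
    for model in m.findall("model"):
        rank = rank_of(model)
        if rank is not None and rank <= max_rank:
            new_model = ET.SubElement(out, "model")
            # Only mo_name and rank are copied; the ratings are left out.
            for tag in ("mo_name", "rank"):
                ET.SubElement(new_model, tag).text = model.findtext(tag)
    return out
```

Note that the function must name every element it wants to keep, so it presupposes knowledge of the structure of the input.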
3.2.1 Lorel
select Z.mn_name, Z.year,
(select Z.model.mo_name, Z.model.rank
where Z.model.rank <= 10)
from nhsc.manufacturer Z
3.2.2 XML-QL
WHERE <manufacturer>
<mn_name>$mn</mn_name>
<year>$y</year>
</manufacturer> CONTENT_AS $m IN
www.nhsc\manufacturers.xml
CONSTRUCT
<manufacturer>
<mn_name>$mn</mn_name>
<year>$y</year>
{ WHERE <model>
<mo_name>$mon</mo_name>
<rank>$r</rank>
</model> IN $m,
$r<=10
CONSTRUCT<model>
<mo_name>$mon</mo_name>
<rank>$r</rank>
</model>
}
</manufacturer>
3.2.3 XML-GL
See Figure 2:
Figure 2: Query 2
3.2.4 XSLT
<xsl:template match="manufacturer">
<manufacturer>
<xsl:copy-of select="mn_name"/>
<xsl:copy-of select="year"/>
<xsl:for-each select="model[rank &lt;= 10]">
<model>
<xsl:copy-of select="mo_name"/>
<xsl:copy-of select="rank"/>
</model>
</xsl:for-each>
</manufacturer>
</xsl:template>
3.2.5 XQL
Cannot be expressed.
3.2.6 Quilt
FOR $m IN document("manufacturer.xml")//manufacturer
LET $mos :=
FOR $moe IN $m/model
WHERE $moe/rank LEQ 10
RETURN
<model>
$moe/mo_name,
$moe/rank
</model>
RETURN
<manufacturer>
$m/mn_name,
$m/year,
$mos
</manufacturer>
3.2.7 Comments
- In Lorel, the query consists of two nested subqueries;
both are existentially quantified.
- In XML-QL, too, the job is performed by nesting two subqueries;
nesting occurs within the CONSTRUCT clause of the first query.
- XML-GL has no nesting, so the query
first selects the <manufacturer> elements that do not have a
<model> sub-element with <rank> less than or equal to 10 and puts them
in the result; then, it selects those
elements having at least one <model> element with a suitable
<rank> value and inserts them in the result, including only the selected
models.
- In XSLT, the template rule matches the manufacturer elements
and, for each model satisfying the condition, constructs a new model element
with the sub-elements mo_name and rank.
- In XQL, the query can be expressed neither as reduction nor as
construction, because XQL does not allow restructuring.
- In Quilt, the query is obtained by nesting two FLWR (``flower'') expressions.
The inner query computes the set of models to be returned and binds it to a
variable declared in the LET clause. The outer query is responsible for
iterating over all the manufacturer elements.
- Summary: This example indicates that current query languages
lack one relevant feature: reduction. All languages must resort to a
solution based on the construction of the ``remainder'' of the
document, rather than on eliding some elements. This is possible only if
the DTD of the document is known in advance. XQL does not support
construction, so it cannot use this solution.
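To illustrate what a genuine reduction primitive would look like, as opposed to construction of the remainder, the following Python sketch deletes the unwanted parts from a copy of the input. It is only a contrast drawn in a general-purpose language, not a feature of any of the six query languages; the function name elide and its parameters are assumptions of this example.

```python
import copy
import xml.etree.ElementTree as ET

def elide(manufacturer, max_rank=10, dropped=("front_rating", "side_rating")):
    """Return a copy of the element with low-ranked models and ratings removed."""
    result = copy.deepcopy(manufacturer)  # leave the input untouched
    for model in list(result.findall("model")):
        rank = (model.findtext("rank") or "").strip()
        if rank.isdigit() and int(rank) > max_rank:
            result.remove(model)
            continue
        for child in list(model):
            if child.tag in dropped:
                model.remove(child)
    return result
```

Because only the elements to be dropped are named, everything else (attributes, competitors, any sub-element not foreseen) survives without the code having to know the DTD.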
3.3 Query 3: Joins
We want our query to generate pairs of <manufacturer> and
<vehicle> elements where <mn_name> = <make>,
<mo_name> = <model>, and <year> = <year>.
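The three equalities define an ordinary value-based join between the two documents. A nested-loop version in Python, given only as a sketch, makes the conditions explicit; the function names text and join are invented for the example, and leading or trailing whitespace in element content is ignored by assumption.

```python
import xml.etree.ElementTree as ET

def text(elem, tag):
    """Whitespace-trimmed text of the first child named tag ('' if absent)."""
    return (elem.findtext(tag) or "").strip()

def join(manufacturers, vehicles):
    """Return (manufacturer, model, vehicle) triples satisfying the join."""
    triples = []
    for m in manufacturers.iter("manufacturer"):
        for mo in m.findall("model"):
            for v in vehicles.iter("vehicle"):
                if (text(m, "mn_name") == text(v, "make")
                        and text(mo, "mo_name") == text(v, "model")
                        and text(m, "year") == text(v, "year")):
                    triples.append((m, mo, v))
    return triples
```

The matching model is returned together with the pair, since the model-level condition is what ties a particular vehicle to a particular entry of the manufacturer.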
3.3.1 Lorel
temp:= select (M,V) as pair
from nhsc.manufacturer M, nhs.vehicle V
where M.mn_name = V.make
and M.model.mo_name = V.model
and M.year = V.year
3.3.2 XML-QL
WHERE <manufacturer>
<mn_name>$mn</mn_name>
<year>$y</year>
<model>
<mo_name>$mon</mo_name>
</model> CONTENT_AS $mo
</manufacturer> CONTENT_AS $m IN
www.nhsc\manufacturers.xml,
<vehicle>
<model>$mon</model>
<year>$y</year>
<make>$mn</make>
</vehicle> CONTENT_AS $v IN www.nhsc\vehicles.xml
CONSTRUCT
<manufacturer>
<mn_name>$mn</mn_name>
<year>$y</year>
<vehiclemodel>
$mo,$v
</vehiclemodel>
</manufacturer>
3.3.3 XML-GL
See Figure 3:
Figure 3: Query 3
3.3.4 XSLT
<xsl:variable name="v" select="document('foo2.xml')//vehicle"/>
<xsl:template match="//manufacturer">
<xsl:variable name="m" select="."/>
<xsl:for-each select="$v">
<xsl:if test="(make = $m/mn_name) and (year = $m/year) and (model = $m/model/mo_name)">
<manufacturer>
<xsl:copy-of select="$m/mn_name"/>
<xsl:copy-of select="$m/year"/>
<vehiclemodel>
<xsl:copy-of select="."/>
</vehiclemodel>
</manufacturer>
</xsl:if>
</xsl:for-each>
</xsl:template>
3.3.5 XQL
ref("www.nhsc")//manufacturer[$mn:=mn_name] [$y:=year] [$mon:=model/mo_name]
{$mn | $y | ref("www.nhs")//vehicle[make=$mn] [year=$y] [model=$mon]}
3.3.6 Quilt
FOR $m IN document("manufacturer.xml")//manufacturer,
$v IN document("vehicle.xml")//vehicle,
$mo IN $m/model
WHERE $m/mn_name=$v/make AND
$m/year=$v/year AND
$mo/mo_name=$v/model
RETURN
<manufacturer>
$m/mn_name,
$m/year,
$mo,
$v
</manufacturer>
3.3.7 Comments
- In Lorel, the join builds pairs of OIDs of the matching elements
from the two documents. The joined elements are accessed by creating a new
entry point temp.
- In XML-QL, a new piece of XML document is created and wrapped in
<vehiclemodel> tags, with the content of the <model> and
<vehicle> elements that match on the join conditions.
- In XML-GL, the pairs of <model> and <vehicle> are
extracted and made sub-elements of a new element named <vehiclemodel>,
which is placed inside the <manufacturer> element.
- In XSLT, the join is expressed by binding the vehicle nodes of the second
document to a variable through the xsl:variable directive; for each manufacturer,
an xsl:for-each iterates over these vehicles and an xsl:if tests the join conditions.
- The new version of XQL supports joins in a preliminary implementation,
and this query is now feasible. Nevertheless, some refinements are needed
to obtain the full versatility of joins: for example, one would like to bind
a variable to the root of a document, and this is undefined in the current
definition.
- Quilt builds the query by using binding variables on the elements to be
joined.
- Summary: Join is supported well by five of the languages. XQL
can implement joins only partially.
3.4 Query 4: Restructuring
We want our query to collect <car> elements listing their make,
model, vendor, rank, and price, in this order.
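The restructuring step can be pictured as filling a fixed layout from a matched model and vehicle, such as a pair produced by the join of Query 3. The following Python sketch does this; the names CAR_LAYOUT and make_car are invented for the example.

```python
import xml.etree.ElementTree as ET

# (sub-element of <car>, element it is taken from); the order of this
# table is the order of the sub-elements in the result.
CAR_LAYOUT = (
    ("make", "vehicle"),
    ("model", "vehicle"),
    ("vendor", "vehicle"),
    ("rank", "model"),
    ("price", "vehicle"),
)

def make_car(model, vehicle):
    """Build a <car> element from a manufacturer's model and a vehicle."""
    sources = {"model": model, "vehicle": vehicle}
    car = ET.Element("car")
    for tag, origin in CAR_LAYOUT:
        value = sources[origin].findtext(tag) or ""
        ET.SubElement(car, tag).text = value.strip()
    return car
```

Keeping the layout in a table separates the required output order from the order in which the values happen to occur in the sources.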
3.4.1 Lorel
select xml(car: (select X.vehicle.make, X.vehicle.model,
X.vehicle.vendor, X.manufacturer.rank,
X.vehicle.price
from temp.pair X))
3.4.2 XML-QL
WHERE <manufacturer>
<mn_name>$mn</mn_name>
<vehiclemodel>
<model>
<mo_name>$mon</mo_name>
<rank>$r</rank>
</model>
<vehicle>
<price>$p</price>
<vendor>$v</vendor>
</vehicle>
</vehiclemodel>
</manufacturer> IN www.nhsc\queryresult3.xml
CONSTRUCT
<car>
<make>$mn</make>
<mo_name>$mon</mo_name>
<vendor>$v</vendor>
<rank>$r</rank>
<price>$p</price>
</car>
3.4.3 XML-GL
See Figure 4:
Figure 4: Query 4
3.4.4 XSLT
<xsl:template match="manufacturer">
<car>
<xsl:copy-of select="vehiclemodel/vehicle/make"/>
<xsl:copy-of select="vehiclemodel/model/mo_name"/>
<xsl:copy-of select="vehiclemodel/vehicle/vendor"/>
<xsl:copy-of select="vehiclemodel/model/rank"/>
<xsl:copy-of select="vehiclemodel/vehicle/price"/>
</car>
</xsl:template>
3.4.5 XQL
Cannot be expressed.
3.4.6 Quilt
FOR $m IN document("manufacturer.xml")//manufacturer,
$v IN document("vehicle.xml")//vehicle,
$mo IN $m/model
WHERE $m/year=$v/year AND
$mo/mo_name=$v/model AND
$m/mn_name=$v/make
RETURN
<car>
$v/make,
$v/model,
$v/vendor,
$mo/rank,
$v/price
</car>
3.4.7 Comments
- In Lorel, the elements are extracted in the order in which they
appear in the query. The query result is associated with an element
named car by invoking the function xml(car: query string).
- XML-QL deals with the ordering of the result explicitly
in the CONSTRUCT clause.
- In XML-GL, a new <car> element is introduced in the result and
enriched with the corresponding sub-elements, which are ordered
counterclockwise by means of the graphic notation of marking one of
the edges.
- In XSLT, the template rule matches the manufacturer elements and
constructs in the result the new car tags filled with the appropriate
sub-elements.
- In XQL, the query cannot be written, as it involves the
construction of a new element.
- In Quilt, a new tag <car> is added to the result and binding
variables are used to compose the new element.
- Summary: this is a less interesting query; again, it shows that join,
ordering, and construction
are feasible with Lorel, XML-QL, XML-GL, XSLT, and Quilt. By contrast, XQL
has no construction mechanism.
3.5 Query 5: Multiple Cross Reference
We want our query to collect <manufacturer> elements, keeping only
their ID attributes and names, and listing the ID and name of each of their competitors.
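Resolving the IDREFS attribute amounts to a lookup in an index of manufacturers by ID. The Python sketch below shows one way to do this; the function name competitors_of, the wrapper element result, and the output element names competitor and name are assumptions of this example, and references to unknown IDs are silently skipped.

```python
import xml.etree.ElementTree as ET

def competitors_of(root):
    """List each manufacturer with the ID and name of its competitors."""
    by_id = {m.get("ID"): m for m in root.iter("manufacturer")}
    out = ET.Element("result")
    for m in root.iter("manufacturer"):
        new_m = ET.SubElement(out, "manufacturer", {"ID": m.get("ID", "")})
        ET.SubElement(new_m, "mn_name").text = m.findtext("mn_name")
        comp = m.find("competitors")
        refs = comp.get("refs", "").split() if comp is not None else []
        for ref in refs:  # an IDREFS value is a whitespace-separated list
            target = by_id.get(ref)
            if target is not None:  # dangling references are skipped
                c = ET.SubElement(new_m, "competitor", {"ID": ref})
                ET.SubElement(c, "name").text = target.findtext("mn_name")
    return out
```

Building the index once and then dereferencing each token of refs plays the role that the id() function or the arrow operator plays in the languages below.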
3.5.1 Lorel
select Z.@ID, Z.mn_name, xml(competitors: select M.@ID, M.mn_name
from manufacturer M
where M.@ID=Z.competitors.@refs)
from manufacturer Z
3.5.2 XML-QL
WHERE <manufacturer ID=$i>
<mn_name> </mn_name> ELEMENT_AS $mn
<competitors refs=$r>
WHERE <manufacturer ID=$r>
<mn_name> </mn_name> ELEMENT_AS $nam
</manufacturer>
CONSTRUCT
<manufacturer ID=$r>
$nam
</manufacturer>
</competitors> ELEMENT_AS $c
</manufacturer> IN www.nhsc\manufacturer.xml
CONSTRUCT
<manufacturer ID=$i>
$mn,
$c
</manufacturer>
3.5.3 XML-GL
See Figure 5:
Figure 5: Query 5
3.5.4 XSLT
<xsl:for-each select="manufacturer">
<manufacturer ID="{@ID}">
<xsl:copy-of select="./mn_name"/>
<xsl:for-each select="id(competitors/@refs)">
<competitor>
<xsl:attribute name="ID">
<xsl:value-of select="./@ID"/>
</xsl:attribute>
<name>
<xsl:value-of select="./mn_name"/>
</name>
</competitor>
</xsl:for-each>
</manufacturer>
</xsl:for-each>
3.5.5 XQL
manufacturer {@ID | mn_name | competitors[$i:=id(@refs)/@ID] {//manufacturer[@ID=$i]
{ @ID | mn_name}}}
3.5.6 Quilt
document("manufacturer.xml")/manufacturer
(@ID | mn_name | (competitors/id(@refs)->(/@ID | /mn_name)))
3.5.7 Comments
- In Lorel, a nested query is needed. Each competitor is extracted
by matching existentially the ID of each manufacturer with the set
of IDREFs of the @refs attribute.
- XML-QL exploits the nesting of queries too, with heavy use of
binding variables.
- In XML-GL, a new <competitor> element is introduced under each
manufacturer
and, to avoid ambiguity, the corresponding sub-elements are explicitly
bound to their homonyms from the left-hand side of the query.
- In XSLT, the template rule matches the manufacturer elements and
constructs in the result the new manufacturer tags filled with the
appropriate
sub-elements. Two nested <xsl:for-each/> constructs are used to
implement the query.
- In XQL, the query is written using variables to join the first and
the second instance of manufacturer. The curly brackets implement the
grouping functionality.
- In Quilt, a compact path expression produces the result. The
explicit arrow operator dereferences the pointed
element.
- Summary: this query shows how all the languages handle
IDREFS attributes.
References
- [1] S. Abiteboul, R. Goldman, J. McHugh, V. Vassalos, and Y. Zhuge.
``Views for Semistructured Data''.
In Proc. of the Workshop on Management of Semistructured Data,
Tucson, Arizona, May 1997.
- [2] S. Abiteboul, D. Quass, J. McHugh, J. Widom, and J. Wiener.
``The Lorel Query Language for Semistructured Data''.
International Journal on Digital Libraries, 1(1):68--88, Apr. 1997.
- [3] P. V. Biron and A. Malhotra (Eds).
``XML Schema Part 2: Datatypes'', Oct. 2000.
http://www.w3.org/TR/xmlschema-2/.
- [4] T. Bray, J. Paoli, and C. M. Sperberg-McQueen (Eds).
``Extensible Markup Language (XML) 1.0''.
2nd Edition, Oct. 2000.
http://www.w3.org/TR/2000/REC-xml-20001006.
- [5] S. Ceri, S. Comai, E. Damiani, P. Fraternali, S. Paraboschi, and L. Tanca.
``XML-GL: a Graphical Language for Querying and Restructuring WWW Data''.
In Int'l World Wide Web Conf. (WWW), Toronto, Canada, May 1999.
- [6] D. Chamberlin, J. Robie, and D. Florescu.
``Quilt: An XML Query Language for Heterogeneous Data Sources''.
In Int'l Workshop on the Web and Databases (WebDB), Dallas, TX,
May 2000.
- [7] J. Clark (Ed.).
``XSL Transformations (XSLT) Version 1.0'', Nov. 1999.
http://www.w3.org/TR/xslt.
- [8] A. Davidson, M. Fuchs, M. Hedin, M. Jain, J. Koistinen, C. Lloyd, M. Maloney,
and K. Schwarzhof.
``Schema for Object-Oriented XML 2.0'', Jul. 1999.
http://www.w3.org/TR/NOTE-SOX.
- [9] A. Deutsch, M. F. Fernandez, D. Florescu, A. Y. Levy, and D. Suciu.
``XML-QL: A Query Language for XML''.
In WWW The Query Language Workshop (QL), Cambridge, MA, Dec. 1998.
http://www.w3.org/TR/1998/NOTE-xml-ql-19980819/.
- [10] A. Deutsch, M. F. Fernandez, D. Florescu, A. Y. Levy, and D. Suciu.
``A Query Language for XML''.
In Int'l World Wide Web Conf. (WWW), Toronto, Canada, May 1999.
http://www8.org/fullpaper.html.
- [11] J. Robie (Ed.).
``XQL (XML Query Language)'', Aug. 1999.
http://www.ibiblio.org/xql/xql-proposal.html.
- [12] R. Goldman, J. McHugh, and J. Widom.
``From Semistructured Data to XML: Migrating the Lore Data Model and
Query Language''.
In Int'l Workshop on the Web and Databases (WebDB),
Philadelphia, PA, June 1999.
- [13] W3C XSL Working Group.
``The Query Language Position Paper of the XSLT Working Group''.
In WWW The Query Language Workshop (QL), Cambridge, MA, Dec. 1998.
- [14] ISO/IEC.
``Information Technology -- Text and Office Systems -- Regular
Language Description for XML (RELAX) -- Part 1: RELAX Core'', 2000.
DIS 22250-1.
- [15] R. Jelliffe.
``Schematron'', Oct. 2000.
http://www.ascc.net/xml/resource/schematron/.
- [16] N. Klarlund, A. Møller, and M. I. Schwartzbach.
``Document Structure Description 1.0'', 1999.
http://www.brics.dk/DSD/.
- [17] N. Klarlund, A. Møller, and M. I. Schwartzbach.
``DSD: A Schema Language for XML''.
In ACM SIGSOFT Workshop on Formal Methods in Software Practice,
Portland, OR, Aug. 2000.
- [18] D. Maier.
``Database Desiderata for an XML Query Language''.
In WWW The Query Language Workshop (QL), Cambridge, MA, Dec. 1998.
- [19] M. Nic.
``Schematron Tutorial'', May 2000.
http://www.zvon.org/HTMLonly/SchematronTutorial/General/contents.html.
- [20] M. Murata.
``RELAX (REgular LAnguage description for XML)'', Aug. 2000.
http://www.xml.gr.jp/relax/.
- [21] J. Robie.
``The design of XQL'', 1999.
http://www.w3.org/Style/XSLT/Group/1998/09/XQL-design.html.
- [22] J. Robie.
``XQL FAQ'', 1999.
http://metalab.unc.edu/xql/.
- [23] J. Robie.
``XQL Tutorial'', 1999.
http://metalab.unc.edu/xql/xql-tutorial.html.
- [24] J. Robie, J. Lapp, and D. Schach.
``XML Query Language (XQL)''.
In WWW The Query Language Workshop (QL), Cambridge, MA, Dec. 1998.
- [25] D. Schach, J. Lapp, and J. Robie.
``Querying and Transforming XML''.
In WWW The Query Language Workshop (QL), Cambridge, MA, Dec. 1998.
- [26] H. S. Thompson, D. Beech, M. Maloney, and N. Mendelsohn (Eds).
``XML Schema Part 1: Structures'', Oct. 2000.
http://www.w3.org/TR/xmlschema-1/.