Java Programming Language

Monday, May 31, 2004

XQuery Intro

Declarative vs. Procedural
The difference between query languages and programming languages is often
summarized by saying that query languages are declarative (stating what you
want), while programming languages are procedural (stating how you want it
done). The difference is subtle, but significant.

XPath 1.0 introduced a convenient syntax for addressing parts of an XML
document. If you need to select a node out of an existing XML document or
database, XPath is the perfect choice, and XQuery doesn't change that.
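
Since this is a Java blog, here is a minimal sketch of selecting a node with XPath 1.0 from Java, using the JDK's javax.xml.xpath package. The class name and the sample document are made up for illustration; they are not from any real data set.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

class XPathSelectDemo {
    // Hypothetical sample data; element and attribute names are invented.
    static final String XML =
        "<team>"
      + "<Employee id='1'><Name>Ann Lee</Name></Employee>"
      + "<Employee id='2'><Name>Bob Ray</Name></Employee>"
      + "</team>";

    // Evaluates an XPath 1.0 expression against XML and returns the string
    // value of the result (for a node-set, the string value of its first node).
    static String select(String expr) {
        XPath xpath = XPathFactory.newInstance().newXPath();
        try {
            return xpath.evaluate(expr, new InputSource(new StringReader(XML)));
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException("bad XPath: " + expr, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(select("/team/Employee[@id='2']/Name")); // Bob Ray
        System.out.println(select("count(//Employee)"));            // 2
    }
}
```

The same path expressions should carry over to XQuery unchanged, which is the point the paragraph above makes.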

XSLT 1.0 (which was developed at the same time as XPath) takes XML querying
a step further, including XPath 1.0 as a subset to address parts of an XML
document and then adding many other features. XSLT is fantastic for
recursively processing an XML document or translating XML into HTML and
text. XSLT can create new XML or copy parts of existing nodes, and it can
introduce variables and namespaces.

However, XSLT 1.0 encourages and often requires users to solve problems in
unnatural ways. XSLT is inherently recursive, but most programmers today
think procedurally; we think of calling functions directly ourselves, not
having functions called for us in an event-driven fashion whenever a match
occurs. Many people write large XSLT queries using only a single template
rule, apparently unaware that XSLT's recursive matching capabilities would
cut their query size in half and make it much easier to maintain.

XQuery also supports a really important feature that was purposely disabled
in XSLT 1.0, something commonly known as composition. Composition allows
users to construct temporary XML results in the middle of a query, and then
navigate into that. This is such an important feature that many vendors
added extension functions such as nodeset() to XSLT 1.0 to support it
anyway; XQuery makes it a first-class operation.

XQuery uses XML Schema 1.0 as the basis for its type system. Consequently,
these two standards share some terminology and definitions. XQuery also
provides some operators such as import schema and validate to support
working with XML schemas.

Every XQuery expression has a static type (compile-time) and a dynamic type
(run-time). The dynamic type applies to the actual value that results when
the expression is evaluated; the value is an instance of that dynamic type.
The static type applies to the expression itself, and can be used to perform
type checking during compilation. All XQuery implementations perform dynamic
type checking, but only some perform static type checking.

Every XQuery value is a sequence containing zero or more items. Each
individual item in a sequence is a singleton, and is the same as a sequence
of length one containing just that item. Consequently, sequences are never
nested.
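
Java programmers used to nested lists may find the "never nested" rule surprising. The following is only a rough model of it in Java, not how any XQuery engine works: a hypothetical seq() helper that splices sequences into one flat list, so that something like (1, (2, 3), ()) ends up as the three items 1, 2, 3.

```java
import java.util.ArrayList;
import java.util.List;

class SequenceModel {
    // Builds a "sequence" from items and other sequences, flattening as it goes.
    static List<Object> seq(Object... parts) {
        List<Object> out = new ArrayList<>();
        for (Object part : parts) {
            if (part instanceof List) {
                out.addAll((List<?>) part); // splice a sequence in, never nest it
            } else {
                out.add(part);              // a single item
            }
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(seq(1, seq(2, 3), seq()));   // [1, 2, 3]
        System.out.println(seq(seq(1)).equals(seq(1))); // true
    }
}
```

The second line of output mirrors the statement that an item and a sequence of length one containing that item are the same thing.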

Every singleton item in XQuery has a type derived from item(). The item()
type is similar to the object type in Java and C#, except that it is
abstract: you can't create an instance of item(). (It's written with
parentheses in part to avoid confusion with user-defined types with the same
name and in part to be consistent with the XPath node tests.)

Items are classified into two kinds: XML nodes and atomic values. Nodes
derive from the type node(), and atomic values derive from
xdt:anyAtomicType. Like item(), the node() and xdt:anyAtomicType types are
abstract.

All of the atomic type names are in one of two namespaces: The XML Schema
type names are in the XML Schema namespace http://www.w3.org/2001/XMLSchema,
which is bound to the prefix xs. The XQuery type names are in the XQuery
type namespace http://www.w3.org/2003/11/xpath-datatypes, which is bound to
the prefix xdt. These prefixes are built in to XQuery.

Every XQuery expression evaluates to a sequence (a single item is equivalent
to a sequence of length one containing that item). Items in a sequence can
be atomic values or nodes. Collectively, these make up the XQuery Data
Model.

XQuery comments begin with the two characters (: and end with the two
characters :)

Every query begins with an optional section called the prolog. The prolog
sets up the compile-time context for the rest of the query, including things
like default namespaces, in-scope namespaces, user-defined functions,
imported schema types, and even external variables and functions (if the
implementation supports them). Each prolog statement must end with a
semicolon (;).

Each function definition starts with the keywords declare function, followed
by the name of the function, the names of its parameters (if any) and
optionally their types, optionally the return type of the function, and
finally the body of the function (enclosed in curly braces). For example:

declare function my:fact($n as xs:integer) as xs:integer
{
  if ($n < 2)
  then 1
  else $n * my:fact($n - 1)
};
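
For comparison, the same recursion might look like this in Java. This is a sketch, not a translation any tool would produce; BigInteger is used on the assumption that xs:integer is not limited to 64 bits (XML Schema defines it as unbounded, though an implementation may impose its own limit).

```java
import java.math.BigInteger;

class Factorial {
    // Mirrors my:fact: any n below 2 yields 1, otherwise n * fact(n - 1).
    static BigInteger fact(BigInteger n) {
        if (n.compareTo(BigInteger.valueOf(2)) < 0) {
            return BigInteger.ONE;
        }
        return n.multiply(fact(n.subtract(BigInteger.ONE)));
    }

    public static void main(String[] args) {
        System.out.println(fact(BigInteger.valueOf(5)));  // 120
        System.out.println(fact(BigInteger.valueOf(25))); // too large for a long
    }
}
```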

Queries may be divided into separate modules. Each module is a
self-contained unit, analogous to a file containing code. Modules are most
commonly used to define function libraries, which can then be shared by many
queries using the import module statement in the prolog. Note that not every
implementation supports modules.

XQuery expressions may be embedded in XML constructors

It is { true() or false() } that this is an example.

=>
It is true that this is an example.

Sequence content is flattened before inserting into XML

All of the built-in functions (except type constructors) belong to the
namespace http://www.w3.org/2003/11/xpath-functions, which is bound to the
prefix fn. This is also the default namespace for functions, which means
that unqualified function names are matched against the built-in functions.
For example, true() is the same as fn:true(), provided that you haven't
changed the default function namespace or the namespace binding for fn.

Operators
true() and false() => false()
true() or false() => true()
not(false()) => true()

if (expr < 0)
  then "negative"
else if (expr > 0)
  then "positive"
else "zero"

string-length("abcde") => 5
substring("abcde", 3) => "cde"
substring("abcde", 2, 3) => "bcd"
concat("ab", "cd", "", "e") => "abcde"
string-join(("ab","cd","","e"), "") => "abcde"
string-join(("ab","cd","","e"), "x") => "abxcdxxe"
contains("abcde", "e") => true
replace("abcde", "a.*d", "x") => "xe"
replace("abcde", "([ab][cd])+", "x") => "axde"
normalize-space(" a b cd e ") => "a b cd e"
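
Several of these string functions already exist in XPath 1.0, so they can be tried from Java without an XQuery engine. The sketch below uses the JDK's javax.xml.xpath package, which implements XPath 1.0 only; string-join() and replace() are therefore left out. The class name is made up, and string literals use single quotes simply to avoid escaping in Java.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;

class StringFunctionDemo {
    // Evaluates an XPath 1.0 expression that needs no input document
    // and returns the string value of the result.
    static String eval(String expr) {
        try {
            Document empty = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder().newDocument();
            return XPathFactory.newInstance().newXPath().evaluate(expr, empty);
        } catch (ParserConfigurationException | XPathExpressionException e) {
            throw new IllegalArgumentException("cannot evaluate: " + expr, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(eval("string-length('abcde')"));           // 5
        System.out.println(eval("substring('abcde', 3)"));            // cde
        System.out.println(eval("substring('abcde', 2, 3)"));         // bcd
        System.out.println(eval("concat('ab', 'cd', '', 'e')"));      // abcde
        System.out.println(eval("contains('abcde', 'e')"));           // true
        System.out.println(eval("normalize-space('  a  b cd   e ')")); // a b cd e
    }
}
```

Note that substring() counts characters from 1, not from 0 as Java's String.substring() does.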

1 eq 1 => true
1 eq 2 => false
1 ne 2 => true
1 gt 2 => false
1 lt 2 => true

Finally, there are three node comparison operators: <<, >>, and is. The node
comparison operators depend on node identity and document order. The is
operator returns true if two nodes are the same node by identity. The <<
operator is pronounced "before" and tests whether a node occurs before
another one in document order. Similarly, the >> operator is pronounced
"after" and tests whether a node occurs after another one in document order.

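The W3C DOM API in the JDK has close counterparts to these three operators, which may help Java readers picture them. In this sketch the helper names is(), before() and after() are made up; isSameNode() and compareDocumentPosition() are the actual DOM Level 3 methods.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

class NodeComparisonDemo {
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalArgumentException("cannot parse XML", e);
        }
    }

    // Rough analogue of "$a is $b": same node by identity, not by content.
    static boolean is(Node a, Node b) {
        return a.isSameNode(b);
    }

    // Rough analogue of "$a << $b": b follows a in document order.
    static boolean before(Node a, Node b) {
        return (a.compareDocumentPosition(b) & Node.DOCUMENT_POSITION_FOLLOWING) != 0;
    }

    // Rough analogue of "$a >> $b": b precedes a in document order.
    static boolean after(Node a, Node b) {
        return (a.compareDocumentPosition(b) & Node.DOCUMENT_POSITION_PRECEDING) != 0;
    }

    public static void main(String[] args) {
        Document doc = parse("<pond><fish/><fish/></pond>");
        Node first = doc.getElementsByTagName("fish").item(0);
        Node second = doc.getElementsByTagName("fish").item(1);
        System.out.println(is(first, first));      // true
        System.out.println(is(first, second));     // false: equal content, different nodes
        System.out.println(before(first, second)); // true
        System.out.println(after(first, second));  // false
    }
}
```
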
Variables in XQuery are written using a dollar sign symbol in front of a
name, like so: $variable. The variable name may consist of only a local-name
like this one, or it may be a qualified name consisting of a prefix and
local-name, like $prefix:local. In this case, it behaves like any other XML
qualified name. (The prefix must be bound to a namespace in scope, and it is
the namespace value that matters, not the prefix.)
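
The rule that the namespace matters and the prefix does not can be observed with XPath 1.0 variables from Java. This is a sketch under assumptions: the namespace URI, the prefixes p and q, and the variable name greeting are all invented for the example. Both prefixes are bound to the same URI, and the resolver is only ever given the URI and the local name.

```java
import java.util.Collections;
import java.util.Iterator;
import javax.xml.XMLConstants;
import javax.xml.namespace.NamespaceContext;
import javax.xml.namespace.QName;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;

class QualifiedVariableDemo {
    // Hypothetical namespace URI and variable name, chosen for illustration.
    static final String NS = "urn:example:vars";
    static final QName GREETING = new QName(NS, "greeting");

    static String eval(String expr) {
        XPath xpath = XPathFactory.newInstance().newXPath();
        // Bind two different prefixes, p and q, to the same namespace URI.
        xpath.setNamespaceContext(new NamespaceContext() {
            @Override public String getNamespaceURI(String prefix) {
                boolean known = "p".equals(prefix) || "q".equals(prefix);
                return known ? NS : XMLConstants.NULL_NS_URI;
            }
            @Override public String getPrefix(String uri) {
                return null; // reverse lookup is not needed here
            }
            @Override public Iterator<String> getPrefixes(String uri) {
                return Collections.<String>emptyList().iterator();
            }
        });
        // QName.equals compares namespace URI and local name, never the prefix.
        xpath.setXPathVariableResolver(name -> GREETING.equals(name) ? "hello" : null);
        try {
            return xpath.evaluate(expr, DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder().newDocument());
        } catch (ParserConfigurationException | XPathExpressionException e) {
            throw new IllegalArgumentException("cannot evaluate: " + expr, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(eval("$p:greeting")); // hello
        System.out.println(eval("$q:greeting")); // hello, same variable
    }
}
```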

The central expression in XQuery is the so-called "flower expression," named
after the first letters of its clauses (for, let, where, order by, return):
FLWOR.

A typical FLWOR expression

for $i in doc("orders.xml")//Customer
let $name := concat($i/@FirstName, $i/@LastName)
where $i/@ZipCode = 91126
order by $i/@LastName
return

{ $i//Order }

FLWOR can introduce variables into scope
FLWOR is also useful for filtering sequences

Sort employee names by last name

for $e in doc("team.xml")//Employee
let $name := $e/Name
order by tokenize($name, "\s+")[2] (: Extract the last name :)
return $name
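
Java readers may find it helpful to read a FLWOR as something like a stream pipeline: for supplies the items, order by sorts them, and return produces the result. The sketch below is only an analogy with invented names and sample data. It also assumes plain String ordering, whereas XQuery's order by compares strings using a collation.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;

class SortByLastName {
    // Second whitespace-separated token of the name, or "" if there is none.
    static String lastName(String fullName) {
        String[] parts = fullName.trim().split("\\s+");
        return parts.length > 1 ? parts[1] : "";
    }

    static List<String> sortByLastName(List<String> names) {
        return names.stream()                                           // for
                .sorted(Comparator.comparing(SortByLastName::lastName)) // order by
                .collect(Collectors.toList());                          // return
    }

    public static void main(String[] args) {
        // Hypothetical employee names.
        List<String> names = Arrays.asList("Ann Zed", "Bob Abel", "Cy Moss");
        System.out.println(sortByLastName(names)); // [Bob Abel, Cy Moss, Ann Zed]
    }
}
```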

Joining two documents together

for $i in doc("one.xml")//fish,
$j in doc("two.xml")//fish
where $i/red = $j/blue
return { $i, $j }

XQuery distinguishes between static errors that may occur when compiling a
query and dynamic errors that may occur when evaluating a query. Dynamic
errors may be reported statically if they are detected during compilation
(for example, xs:decimal("X") may result in either a dynamic or a static
error, depending on the implementation).

Most XQuery expressions perform extensive type checking. For example, the
addition $a + $b results in an error if either $a or $b is a sequence
containing more than one item, or if the two values cannot be added
together. For example, "1" + 2 is an error. This is very different from
XPath and XSLT 1.0, in which "1" + 2 converted the string to a number, and
then performed the addition without error.
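
The lenient XPath 1.0 behavior is easy to see from Java, because the JDK's javax.xml.xpath package implements XPath 1.0. A small sketch (the class name is made up); an XQuery processor would be expected to reject the same expressions with a type error instead.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;

class LooseArithmeticDemo {
    // Evaluates an XPath 1.0 expression and returns the result as a number.
    static double eval(String expr) {
        try {
            Document empty = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder().newDocument();
            return (Double) XPathFactory.newInstance().newXPath()
                    .evaluate(expr, empty, XPathConstants.NUMBER);
        } catch (ParserConfigurationException | XPathExpressionException e) {
            throw new IllegalArgumentException("cannot evaluate: " + expr, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(eval("'1' + 2"));   // 3.0: the string is converted silently
        System.out.println(eval("'abc' + 2")); // NaN: still no error
    }
}
```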

XQuery also defines a built-in error() function that takes an optional
argument (the error value) and raises a dynamic error. In addition, some
implementations support the trace() function, which allows you to generate a
message without terminating query execution.

Many other XQuery operations may cause dynamic errors, such as type
conversion errors. Implementations are often allowed to evaluate expressions
in any order or to optimize out certain temporary expressions. Consequently,
an implementation may optimize out some dynamic errors. For example,
error() and false() might raise an error, or might return false. The only
expressions that guarantee a particular order of evaluation are if/then/else
and typeswitch.

# posted by Manu Anand : 4:50 PM
