Summary of XPath Syntax

Updated: Oct 12, 2017 10:52:15 Author: Soul Light
This article gives a fairly comprehensive summary of XPath syntax. Corrections and additions are welcome.

I use XPath regularly at work, but I keep forgetting or misunderstanding some key points and end up looking up the same scattered details every time, which is tedious and wastes time. So I wrote this summary of XPath.

In this article you will learn:

Introduction to XPath

Detailed description of the XPath path expression

XPath in DOM, XSLT, and XQuery

Introduction to XPath

XPath is a W3C standard. Its primary purpose is to locate nodes in the node tree of an XML 1.0 or XML 1.1 document. This article covers two versions, XPath 1.0 and XPath 2.0. XPath 1.0 became a W3C standard in 1999, and XPath 2.0 in 2007. For the full W3C XPath specification (in English), see: http://www.w3.org/TR/xpath20/.

XPath is an expression language. An expression can return a node, a collection of nodes, atomic values, or a mixture of nodes and atomic values. XPath 2.0 extends XPath 1.0 with a richer set of data types and stays largely backward compatible with it: almost every XPath 1.0 expression gives the same result in XPath 2.0. XPath 2.0 is also the primary expression language for querying and locating nodes in XSLT 2.0 and XQuery 1.0, and XQuery 1.0 is an extension of XPath 2.0. How XPath expressions locate nodes in XSLT and XQuery is covered in later examples.

Before learning XPath, you should be familiar with XML nodes, elements, attributes, atomic values (text), processing instructions, comments, root nodes (document nodes), and namespaces, as well as the relationships between nodes: parent, children, sibling, ancestor, descendant, and so on. These concepts are not explained here.

XPath path expression

In the following sections you will learn:

Path expression syntax

Relative/absolute path

Expression context

Concepts of predicates (filter expressions) and axes

Operators and special characters

Common expression examples

Functions and descriptions

An example XML file is given here. The following instructions and examples are based on this XML file.

<?xml version="1.0" encoding="UTF-8"?>
<!-- edited with XMLSpy v2008 rel. 2 sp2 (http://www.altova.com) by Administrator -->
<?xml-stylesheet type="text/xsl" href="messages.xsl"?>
<messages>
  <message id="1">
    <sender>gukaitong@gmail.com</sender>
    <to>anonymous@gmail.com
      <group name="IT">
        <address>111@gmail.com</address>
        <address>222@gmail.com</address>
        <address>aaa@gmail.com</address>
        <address>bbb@gmail.com</address>
        <address>ccc@gmail.com</address>
      </group>
    </to>
    <subject>This is a sample</subject>
    <datetime date="2008-12-11" time="12:00:00" formatted="12/11/2008 12:00AM">2008-12-11T12:00:00Z</datetime>
    <body>
      Are you interested in?
      <attachments>
        <attachment id="1">
          <message id="0">
            <sender>anonymous@gmail.com</sender>
            <to>gukaitong@gmail.com</to>
            <body>
              We strongly recommend the following books
              <books xmlns:amazon="http://www.amazon.com/books/schema">
                <amazon:book>
                  <name>Professional C# 2008 </name>
                  <country>USA</country>
                  <price>37.79</price>
                  <year>2007</year>
                </amazon:book>
                <amazon:book>
                  <name>Microsoft Visual C# 2008 Step by Step </name>
                  <country>USA</country>
                  <price>26.39 </price>
                  <year>2008</year>
                </amazon:book>
                <amazon:book>
                  <name>C# in Depth</name>
                  <country>USA</country>
                  <price>29.69 </price>
                  <year>2006</year>
                </amazon:book>
                <amazon:book>
                  <name>Thinking in Java</name>
                  <country>USA</country>
                  <price>23.69 </price>
                  <year>2004</year>
                </amazon:book>
              </books>
            </body>
          </message>
        </attachment>
      </attachments>
    </body>
  </message>
  <message id="2">
    <sender>333@gmail.com</sender>
    <to>444@gmail.com</to>
    <subject>No title</subject>
    <body/>
  </message>
</messages>

Path expression syntax:

Path = relative path | absolute path

Relative path = step expression | relative path "/" step expression

Step expression = axis::node-test[predicate]

Instructions:

The axis gives the tree (hierarchical) relationship between the nodes selected by the step expression and the current context node. The node test specifies the name or kind of node that the step selects. The predicate is a filter expression that further narrows the selected node set.

A step may have zero or more predicates. Within a predicate, conditions are combined with the logical operators and and or; negation is written with the not() function.

Consider a typical XPath query expression: /messages/message//child::node()[@id=0]. Here /messages/message is a path (an absolute path begins with "/"), child:: is the axis and selects child nodes, node() is a node test that matches nodes of any kind, and [@id=0] is a predicate that keeps only the nodes that have an id attribute whose value is 0.
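To make the axis / node test / predicate split concrete, here is a minimal Python sketch that takes a single step apart with a regular expression. The names STEP_PATTERN and split_step are made up for this illustration, and the pattern is a simplification, not a real XPath parser: it assumes the step either names its axis or relies on the default child axis, and it does not handle nested brackets or abbreviations such as @, . and ..

```python
import re

# Simplified pattern for one step: optional "axis::", a node test,
# then zero or more "[predicate]" groups (nested brackets are not handled).
STEP_PATTERN = re.compile(
    r"^(?:(?P<axis>[a-z-]+)::)?"           # axis, e.g. child, ancestor-or-self
    r"(?P<test>[^\[\]]+)"                  # node test, e.g. node(), message, *
    r"(?P<predicates>(?:\[[^\[\]]*\])*)$"  # e.g. [@id=0] or [1][datetime]
)


def split_step(step):
    """Return (axis, node_test, predicates) for one simple step expression."""
    match = STEP_PATTERN.match(step)
    if match is None:
        raise ValueError(f"not a simple step expression: {step!r}")
    axis = match.group("axis") or "child"  # child is the default axis in XPath
    predicates = re.findall(r"\[([^\[\]]*)\]", match.group("predicates"))
    return axis, match.group("test"), predicates


print(split_step("child::node()[@id=0]"))
print(split_step("message[1][datetime]"))
```

The second call shows that a step can carry several predicates in a row, each applied to the result of the previous one.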

Relative and absolute paths:

If "/" appears at the beginning of an XPath expression, it represents the document root (the document node). In the middle of an expression, "/" is a separator between step expressions.

For example, /messages/message/subject is an absolute path: the search starts from the document root. If the current node is the first message node (/messages/message[1]), the path expression subject (with no leading "/") is a relative path: the search starts from the current node. See "Expression Context" below for details.
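The difference can be tried out with Python's standard xml.etree.ElementTree module. This is only a sketch under a few assumptions: ElementTree implements a limited subset of XPath, its paths are always evaluated relative to the element the method is called on (so the absolute path is approximated by starting at the root element), and the XML below is a trimmed copy of the sample document.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the sample document (only what this example needs).
XML = """
<messages>
  <message id="1">
    <sender>gukaitong@gmail.com</sender>
    <subject>This is a sample</subject>
  </message>
  <message id="2">
    <sender>333@gmail.com</sender>
    <subject>No title</subject>
  </message>
</messages>
"""

root = ET.fromstring(XML)  # the <messages> root element

# Counterpart of /messages/message/subject: start at the root element.
from_root = [s.text for s in root.findall("message/subject")]

# Counterpart of the relative path "subject" with message[1] as context.
first_message = root.find("message[1]")
from_first = [s.text for s in first_message.findall("subject")]

print(from_root)
print(from_first)
```

The relative path finds only the subject of the message it starts from, while the path from the root finds the subject of every message.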

Expression Context:

The context is the environment in which an XPath path expression is evaluated. The same path expression can produce completely different results when evaluated against the root node than when evaluated against a particular child node. In other words, the result of an XPath path expression depends on its context.

There are several basic XPath contexts:

Current node (./):

For example, ./sender selects the set of sender nodes under the current node (equivalent to the "specific element" form described below, i.e. sender).

Parent node (../):

For example, ../sender selects the set of sender nodes under the parent of the current node.

Document root (/):

For example, /messages selects the messages nodes directly under the document root node.

Root element (/*):

Here * matches any element. A document has only one root element, so /* selects the root element. The result of /* is the same as that of /messages: the messages node.

Recursive descent (//) :

For example, taking first the messages node and then the first message node as the starting point, //sender returns the following results:

/messages//sender:
<sender>gukaitong@gmail.com</sender>
<sender>anonymous@gmail.com</sender>
<sender>333@gmail.com</sender>
/messages/message[1]//sender:
<sender>gukaitong@gmail.com</sender>
<sender>anonymous@gmail.com</sender>

As the results show, the expression starts at the given node and searches recursively through all of its descendants for the nodes that satisfy the condition.
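The same two queries can be reproduced with Python's standard xml.etree.ElementTree module, which supports the // step in the form .// (relative to the element it is called on). This is a sketch on a trimmed copy of the sample document; ElementTree covers only a subset of XPath.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the sample document, keeping the nested message.
XML = """
<messages>
  <message id="1">
    <sender>gukaitong@gmail.com</sender>
    <body>
      <attachments>
        <attachment id="1">
          <message id="0">
            <sender>anonymous@gmail.com</sender>
          </message>
        </attachment>
      </attachments>
    </body>
  </message>
  <message id="2">
    <sender>333@gmail.com</sender>
  </message>
</messages>
"""

root = ET.fromstring(XML)

# Counterpart of /messages//sender: every sender at any depth.
all_senders = [s.text for s in root.findall(".//sender")]

# Counterpart of /messages/message[1]//sender: search below the first message only.
first_message = root.find("message[1]")
nested_senders = [s.text for s in first_message.findall(".//sender")]

# Without //, only direct children of the context node are searched.
direct_senders = root.findall("sender")

print(all_senders)
print(nested_senders)
print(direct_senders)
```

The last query returns an empty list because the messages node has no sender child of its own; recursive descent is what reaches the nested ones.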

Specific element

For example, sender selects the set of sender nodes under the current node; it is equivalent to ./sender.

Note: Always be aware of the context when evaluating XPath, that is, the node against which the expression is evaluated. This matters in XMLDOM: selectNodes and selectSingleNode take an XPath expression as their argument, and the context of that expression is the node on which the method is called. For more information, see: http://www.w3.org/TR/xpath20/

Concepts of predicates (filter expressions) and axes:

The predicate of XPath is a filter expression, similar to the where clause of SQL.

Axis name            Result
ancestor             Selects all ancestors of the current node (parent, grandparent, etc.).
ancestor-or-self     Selects all ancestors of the current node (parent, grandparent, etc.) and the current node itself.
attribute            Selects all attributes of the current node.
child                Selects all children of the current node.
descendant           Selects all descendants of the current node (children, grandchildren, etc.).
descendant-or-self   Selects all descendants of the current node (children, grandchildren, etc.) and the current node itself.
following            Selects all nodes in the document after the end tag of the current node.
following-sibling    Selects all sibling nodes after the current node.
namespace            Selects all namespace nodes of the current node.
parent               Selects the parent of the current node.
preceding            Selects all nodes in the document that end before the start tag of the current node (ancestors are therefore not included).
preceding-sibling    Selects all sibling nodes before the current node.
self                 Selects the current node.
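Python's standard xml.etree.ElementTree module does not implement axes, but two of them are easy to emulate, which can help in seeing what they select. The sketch below is an emulation under stated assumptions, not real XPath evaluation: the helpers ancestors and following_siblings are invented for this example, they look at element nodes only, and the XML is a trimmed copy of the sample document.

```python
import xml.etree.ElementTree as ET

XML = """
<messages>
  <message id="1">
    <sender>gukaitong@gmail.com</sender>
    <to>anonymous@gmail.com</to>
    <subject>This is a sample</subject>
    <body>
      <attachments>
        <attachment id="1">
          <message id="0">
            <sender>anonymous@gmail.com</sender>
          </message>
        </attachment>
      </attachments>
    </body>
  </message>
</messages>
"""

root = ET.fromstring(XML)
# ElementTree elements do not store their parent, so build a child -> parent map.
parent_of = {child: parent for parent in root.iter() for child in parent}


def ancestors(element):
    """Emulate ancestor::* (nearest ancestor first)."""
    found = []
    while element in parent_of:
        element = parent_of[element]
        found.append(element)
    return found


def following_siblings(element):
    """Emulate following-sibling::* (element siblings only)."""
    parent = parent_of.get(element)
    if parent is None:
        return []
    siblings = list(parent)
    return siblings[siblings.index(element) + 1:]


nested = root.find(".//message[@id='0']")
ancestor_tags = [e.tag for e in ancestors(nested)]

sender = root.find("message[@id='1']/sender")
sibling_tags = [e.tag for e in following_siblings(sender)]

print(ancestor_tags)
print(sibling_tags)
```

The ancestor walk climbs from the nested message up to the root element, while the sibling lookup stays on one level of the tree.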

Operators and special characters:

Operator/special character   Instructions
/          At the beginning of a pattern, this path operator indicates that selection starts from the root node.
//         Recursive descent from the current node; at the beginning of a pattern, it indicates recursive descent from the root node.
.          The current context node.
..         The parent of the current context node.
*          Wildcard; selects all element nodes regardless of element name. (Text, comment, and processing-instruction nodes are not included; to include them, use the node() node test.)
@          Prefix for an attribute name.
@*         Selects all attributes, regardless of name.
:          Namespace separator; separates the namespace prefix from the element or attribute name.
()         Parentheses (highest precedence); used to force the order of evaluation.
[]         Applies a filter pattern (i.e. a predicate, which may contain filter expressions and forward or reverse axes).
[]         Subscript operator; used to index into a collection.
|          Union of two node sets, such as //messages/message/to | //messages/message/cc
-          Subtraction.
div        Floating-point division.
and, or    Logical operations.
mod        Modulus (remainder of a division).
not()      Logical negation.
=          Equal to.
!=         Not equal to.

Special comparison operators:

<  or &lt;
<= or &lt;=
>  or &gt;
>= or &gt;=

The escaped forms must be used where the expression is embedded in XML, as in XSLT; escaping is not required in XMLDOM scripting.
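The escaping rule comes from XML itself rather than from XPath. A small Python sketch with the standard library shows the round trip; the expression and the `<if test="...">` fragment are made-up illustrations, not part of the sample document.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

# An example expression containing comparison operators (no quote characters).
xpath = "price < 30 and year >= 2006"

# Escape the expression before placing it in an XML attribute.
escaped = escape(xpath)
print(escaped)

# Hypothetical stylesheet-like fragment; the XML parser turns the entities
# back into the original characters when the attribute is read.
fragment = f'<if test="{escaped}"/>'
element = ET.fromstring(fragment)
print(element.get("test"))
```

Script code that passes the expression as an ordinary string never goes through an XML parser, which is why no escaping is needed there.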

Examples of common expressions:

/
    The document root.

/*
    Selects all element nodes directly under the document root, i.e. the root element (an XML document has only one root element).

/node()
    All nodes directly under the document root (the root element plus any comments and processing instructions).

/text()
    All text nodes directly under the document root.

/messages/message
    All message nodes under the messages node.

/messages/message[1]
    The first message node under the messages node.

/messages/message[1]/self::node()
    The first message node (the self axis selects the node itself; node() matches any kind of node).

/messages/message[1]/node()
    All child nodes of the first message node.

/messages/message[1]/*[last()]
    The last child element of the first message node.

/messages/message[1]/[last()]
    Error: a predicate must be preceded by a node or node set.

/messages/message[1]/node()[last()]
    The last child node of the first message node.

/messages/message[1]/text()
    All text-node children of the first message node.

/messages/message[1]//text()
    All text nodes found by recursive descent from the first message node (unlimited depth).

/messages/message[1]/child::node()
/messages/message[1]/node()
/messages/message[position()=1]/node()
//message[@id=1]/node()
    All child nodes of the first message node.

//message[@id=1]//child::node()
    All descendant nodes, found recursively (unlimited depth).

//message[position()=1]/node()
    The child nodes of the message node with id=1 and of the message node with id=0 (each of them is the first message child of its own parent).
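The positional predicates can be checked with Python's standard xml.etree.ElementTree module, which supports [n] and [last()] after a tag name. This is a sketch on a trimmed copy of the sample document; it reports the id of each selected message rather than its child nodes, to keep the output short.

```python
import xml.etree.ElementTree as ET

XML = """
<messages>
  <message id="1">
    <body>
      <attachments>
        <attachment id="1">
          <message id="0"/>
        </attachment>
      </attachments>
    </body>
  </message>
  <message id="2"/>
</messages>
"""

root = ET.fromstring(XML)

# message[1]: XPath positions start at 1, not 0.
first = [m.get("id") for m in root.findall("message[1]")]

# message[last()]: the last message child of the root element.
last = [m.get("id") for m in root.findall("message[last()]")]

# .//message[1]: every message that is the first message child of its parent.
first_anywhere = [m.get("id") for m in root.findall(".//message[1]")]

print(first, last, first_anywhere)
```

The position is counted among the siblings under each parent, which is why the nested message with id=0 is also a "first" message.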

/messages/message[1]/parent::*
    The messages node.

/messages/message[1]/body/attachments/parent::node()
/messages/message[1]/body/attachments/parent::*
/messages/message[1]/body/attachments/..
    The parent of the attachments node. A node has only one parent, so node() returns the same result as *. (.. also denotes the parent node; . denotes the node itself.)

//message[@id=0]/ancestor::*
    The ancestor axis selects all ancestors (parent, grandparent, etc.) by recursing upward.

//message[@id=0]/ancestor-or-self::*
    Recurses upward, including the node itself.

//message[@id=0]/ancestor::node()
    Returns one node more than ancestor::*, namely the document root node.

/messages/message[1]/descendant::node()
//messages/message[1]//node()
    All nodes under the first message node, found by recursive descent.

/messages/message[1]/sender/following::*
    All elements after the sender node of the first message node in document order: its following siblings and their descendants, and also everything after the first message node (here, the second message node and its children).

//message[@id=1]/sender/following-sibling::*
    All following sibling elements of the sender node of the message node with id=1.

//message[@id=1]/datetime/@date
    The date attribute of the datetime node of the message node with id=1.

//message[@id=1]/datetime[@date]
//message/datetime[attribute::date]
    All datetime nodes that have a date attribute (the first form only under the message node with id=1, the second under any message node).

//message[datetime]
    All message nodes that contain a datetime node.

//message/datetime/attribute::*
//message/datetime/attribute::node()
//message/datetime/@*
    All attribute nodes of the datetime nodes under message nodes.

//message/datetime[attribute::*]
//message/datetime[attribute::node()]
//message/datetime[@*]
//message/datetime[@node()]
    All datetime nodes that have at least one attribute.
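The attribute-related expressions have close counterparts in Python's standard xml.etree.ElementTree module. The sketch below works under two assumptions worth noting: ElementTree paths always end on an element, so attributes are read with get() or attrib afterwards, and attribute values are compared as strings, hence the quotes around 1. The XML is a trimmed copy of the sample document.

```python
import xml.etree.ElementTree as ET

XML = """
<messages>
  <message id="1">
    <sender>gukaitong@gmail.com</sender>
    <datetime date="2008-12-11" time="12:00:00">2008-12-11T12:00:00Z</datetime>
  </message>
  <message id="2">
    <sender>333@gmail.com</sender>
  </message>
</messages>
"""

root = ET.fromstring(XML)

# Counterpart of //message[@id=1]/datetime/@date.
datetime_node = root.find(".//message[@id='1']/datetime")
date = datetime_node.get("date")

# Counterpart of //message[datetime]: messages that have a datetime child.
with_datetime = [m.get("id") for m in root.findall(".//message[datetime]")]

# Counterpart of //message/datetime[@date]: datetime nodes with a date attribute.
dated = root.findall(".//message/datetime[@date]")

# Counterpart of //message/datetime/@*: all attributes of that node.
all_attributes = dict(datetime_node.attrib)

print(date, with_datetime, len(dated), all_attributes)
```

Note the difference between the first and third queries: a trailing /@date returns the attribute, while [@date] only filters and still returns the datetime node.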

//attribute::*
    All attribute nodes in the document.

//message[@id=0]/body/preceding::node()
    All nodes that come before the body node in document order, excluding its ancestors. (Seen from the top down: start at the top-level ancestor of the body node and take every sibling that precedes it, then move one level closer to the body node and take every sibling that precedes that ancestor, and so on down to the body node itself.)

    Note: the preceding siblings on each level are collected in order, one level after another.

//message[@id=0]/body/preceding-sibling::node()
    All sibling nodes that come before the body node, in order. The main difference from the previous example is that it does not walk through the ancestors of the body node: only the siblings that precede the current node are returned.

//message[@id=1]//*[namespace::amazon]
    All elements under the message node with id=1 that have the amazon namespace in scope.

//namespace::*
    All namespace nodes in the document (including the implicit xml namespace).

//message[@id=0]//books/*[local-name()='book']
    All book nodes under books.

    Note: the book elements carry a namespace prefix (<amazon:book>), so //message[@id=0]//books/book finds no nodes.

//message[@id=0]//books/*[local-name()='book' and namespace-uri()='http://www.amazon.com/books/schema']
    All book nodes under books (both the local name and the namespace must match).
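The namespace pitfall is easy to reproduce with Python's standard xml.etree.ElementTree module. ElementTree does not provide local-name() or namespace-uri(), so this sketch uses the module's own two ways of naming a namespaced element instead; the bk prefix is an arbitrary choice for the example, and the XML is a trimmed copy of the books element from the sample document.

```python
import xml.etree.ElementTree as ET

XML = """
<books xmlns:amazon="http://www.amazon.com/books/schema">
  <amazon:book><name>C# in Depth</name><year>2006</year></amazon:book>
  <amazon:book><name>Thinking in Java</name><year>2004</year></amazon:book>
</books>
"""

AMAZON_NS = "http://www.amazon.com/books/schema"
books = ET.fromstring(XML)

# An unqualified name matches only elements that are in no namespace.
unqualified = books.findall("book")

# Bind a prefix to the namespace URI for this query (the prefix used here
# does not have to match the one in the document).
by_prefix = books.findall("bk:book", {"bk": AMAZON_NS})

# Equivalent: spell out the URI in {uri}local-name form.
by_uri = books.findall(f"{{{AMAZON_NS}}}book")

names = [b.findtext("name") for b in by_prefix]
print(len(unqualified), names)
```

What identifies the element is the namespace URI, not the prefix, which is the same reason the article's second expression compares namespace-uri() rather than the prefix text.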

//message[@id=0]//books/*[local-name()='book'][year>2006]
    The book nodes whose year value is greater than 2006.

//message[@id=0]//books/*[local-name()='book'][1]/year>2006
    Tests whether the year value of the first book node is greater than 2006.
    Returns xs:boolean: true
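Python's standard xml.etree.ElementTree module has no numeric comparisons in predicates, so a rough equivalent of the two expressions selects the book nodes first and compares in Python. This is a sketch of that workaround, not of XPath evaluation itself, using a trimmed copy of the books element from the sample document.

```python
import xml.etree.ElementTree as ET

XML = """
<books xmlns:amazon="http://www.amazon.com/books/schema">
  <amazon:book><name>Professional C# 2008 </name><year>2007</year></amazon:book>
  <amazon:book><name>Microsoft Visual C# 2008 Step by Step </name><year>2008</year></amazon:book>
  <amazon:book><name>C# in Depth</name><year>2006</year></amazon:book>
  <amazon:book><name>Thinking in Java</name><year>2004</year></amazon:book>
</books>
"""

NS = {"amazon": "http://www.amazon.com/books/schema"}
all_books = ET.fromstring(XML).findall("amazon:book", NS)

# Counterpart of [year>2006]: convert the text to a number and filter in Python.
recent = [b.findtext("name").strip() for b in all_books
          if int(b.findtext("year")) > 2006]

# Counterpart of ...[1]/year>2006: a boolean about the first book only.
first_is_recent = int(all_books[0].findtext("year")) > 2006

print(recent)
print(first_is_recent)
```

The two results mirror the two expressions: the predicate form yields a node set, while the trailing comparison yields a single boolean.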

Functions and descriptions:

XPath shares its function library with XSLT and XQuery. The library offers a rich set of functions, and custom functions can also be defined. The functions are not described one by one here; a Chinese-language reference is available at https://www.jb51.net/w3school/xpath/index.htm

XPath in DOM, XSLT, and XQuery

DOM:

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
  <title>XPath Test</title>
</head>
<body>
  <script language="javascript" type="text/javascript">
    // MSXML through ActiveX: this runs only in Internet Explorer.
    var xmlDoc = new ActiveXObject("Microsoft.XMLDOM");
    xmlDoc.async = false; // load synchronously so the document is ready below
    xmlDoc.load("messages.xml");
    // Switch the query language to XPath.
    xmlDoc.setProperty("SelectionLanguage", "XPath");
    var sPath = "/messages/message[1]//books/*[local-name()='book']";
    var bookNodes = xmlDoc.selectNodes(sPath);
    document.write("<ul>");
    for (var i = 0; i < bookNodes.length; i++) {
      // The first child of each book node is its name element.
      document.write("<li>" + bookNodes[i].childNodes[0].text + "</li>");
    }
    document.write("</ul>");
  </script>
</body>
</html>

Attention:

When using new ActiveXObject("Microsoft.XMLDOM"), note that in earlier versions of XMLDOM the SelectionLanguage property defaults to the older XSLPattern syntax rather than XPath. A statement such as xmlDoc.setProperty("SelectionLanguage", "XPath"); is therefore needed to enable XPath query expressions.

If SelectionLanguage is not set to XPath, note the following:

Subscripts start at 0 (in XPath query expressions they start at 1).

XPath functions cannot be used in query expressions.

Summary

That is all for this summary of XPath syntax; I hope it helps. Questions are welcome at any time, and so is discussion.
