std.xml
Classes and functions for creating and parsing XML
The basic architecture of this module is that there are standalone functions,
classes for constructing an XML document from scratch (Tag, Element and
Document), and also classes for parsing a pre-existing XML file (ElementParser
and DocumentParser). The parsing classes may be used to build a
Document, but that is not their primary purpose. The handling capabilities of
DocumentParser and ElementParser are sufficiently customizable that you can
make them do pretty much whatever you want.
Example:
This example creates a DOM (Document Object Model) tree
from an XML file.
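import std.file;
import std.stdio;
import std.string;
import std.xml;

void main()
{
    // Read the whole file into memory as a string
    string s = cast(string) std.file.read("books.xml");

    // Throws a CheckException if the text is not well-formed XML
    check(s);

    // Build the DOM tree and print it
    auto doc = new Document(s);
    writeln(doc);
}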
Example:
This example does much the same thing, except that the file is
deconstructed and reconstructed by hand. This is more work, but the
techniques involved offer vastly more power.
import std.file;
import std.stdio;
import std.string;
import std.xml;

struct Book
{
    string id;
    string author;
    string title;
    string genre;
    string price;
    string pubDate;
    string description;
}

void main()
{
    string s = cast(string) std.file.read("books.xml");

    // Check for well-formedness
    check(s);

    // Take it apart
    Book[] books;

    auto xml = new DocumentParser(s);
    xml.onStartTag["book"] = (ElementParser xml)
    {
        Book book;
        book.id = xml.tag.attr["id"];

        xml.onEndTag["author"] = (in Element e) { book.author = e.text(); };
        xml.onEndTag["title"] = (in Element e) { book.title = e.text(); };
        xml.onEndTag["genre"] = (in Element e) { book.genre = e.text(); };
        xml.onEndTag["price"] = (in Element e) { book.price = e.text(); };
        xml.onEndTag["publish-date"] = (in Element e) { book.pubDate = e.text(); };
        xml.onEndTag["description"] = (in Element e) { book.description = e.text(); };

        // Parse the interior of this <book> element
        xml.parse();

        books ~= book;
    };
    xml.parse();

    // Put it back together again
    auto doc = new Document(new Tag("catalog"));
    foreach (book; books)
    {
        auto element = new Element("book");
        element.tag.attr["id"] = book.id;

        element ~= new Element("author", book.author);
        element ~= new Element("title", book.title);
        element ~= new Element("genre", book.genre);
        element ~= new Element("price", book.price);
        element ~= new Element("publish-date", book.pubDate);
        element ~= new Element("description", book.description);

        doc ~= element;
    }

    // Pretty-print it with an indent of 3 spaces
    writeln(join(doc.pretty(3), "\n"));
}
License: Boost License 1.0.
Authors: Janice Caron
Source: std/xml.d
bool isChar(dchar c);
- Returns true if the character is a character according to the XML standard
Standards:
XML 1.0
Parameters:dchar c |
the character to be tested |
bool isSpace(dchar c);
- Returns true if the character is whitespace according to the XML standard.
Only the following characters are considered whitespace in XML: space, tab,
carriage return and linefeed.
Standards:
XML 1.0
Parameters:dchar c |
the character to be tested |
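As a quick illustration of the whitespace rule above, the following sketch calls isSpace on a few characters. It assumes a compiler release whose standard library still includes std.xml; the helper name countXmlSpaces is hypothetical and not part of the module.

```d
import std.xml : isSpace;

// Hypothetical helper: counts the characters that XML 1.0 treats as whitespace.
size_t countXmlSpaces(string s)
{
    size_t n;
    foreach (dchar c; s)
        if (isSpace(c)) ++n;
    return n;
}

void main()
{
    // The four XML whitespace characters
    assert(isSpace(' ') && isSpace('\t') && isSpace('\r') && isSpace('\n'));

    // A no-break space is not one of the four, so it does not count
    assert(!isSpace('\u00A0'));
    assert(countXmlSpaces("a b\tc\u00A0d") == 2);
}
```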
bool isDigit(dchar c);
- Returns true if the character is a digit according to the XML standard
Standards:
XML 1.0
Parameters:dchar c |
the character to be tested |
bool isLetter(dchar c);
- Returns true if the character is a letter according to the XML standard
Standards:
XML 1.0
Parameters:dchar c |
the character to be tested |
bool isIdeographic(dchar c);
- Returns true if the character is an ideographic character according to the
XML standard
Standards:
XML 1.0
Parameters:dchar c |
the character to be tested |
bool isBaseChar(dchar c);
- Returns true if the character is a base character according to the XML
standard
Standards:
XML 1.0
Parameters:dchar c |
the character to be tested |
bool isCombiningChar(dchar c);
- Returns true if the character is a combining character according to the
XML standard
Standards:
XML 1.0
Parameters:dchar c |
the character to be tested |
bool isExtender(dchar c);
- Returns true if the character is an extender according to the XML standard
Standards:
XML 1.0
Parameters:dchar c |
the character to be tested |
S encode(S)(S s);
- Encodes a string by replacing all characters which need to be escaped with
appropriate predefined XML entities.
encode() escapes certain characters (ampersand, quote, apostrophe, less-than
and greater-than), and similarly, decode() unescapes them. These functions
are provided for convenience only. You do not need to use them when using
the std.xml classes, because then all the encoding and decoding will be done
for you automatically.
If the string is not modified, the original will be returned.
Standards:
XML 1.0
Parameters:s |
The string to be encoded |
Returns:
The encoded string
Examples:
writeln(encode("a > b")); // writes "a &gt; b"
enum DecodeMode: int;
- Mode to use for decoding.
NONE
Do not decode
LOOSE
Decode, but ignore errors
STRICT
Decode, and throw exception on error
string decode(string s, DecodeMode mode = DecodeMode.LOOSE);
- Decodes a string by unescaping all predefined XML entities.
encode() escapes certain characters (ampersand, quote, apostrophe, less-than
and greater-than), and similarly, decode() unescapes them. These functions
are provided for convenience only. You do not need to use them when using
the std.xml classes, because then all the encoding and decoding will be done
for you automatically.
This function decodes the entities &amp;, &quot;, &apos;,
&lt; and &gt;,
as well as decimal and hexadecimal entities such as &#x20AC;.
If the string does not contain an ampersand, the original will be returned.
Note that the "mode" parameter can be one of DecodeMode.NONE (do not
decode), DecodeMode.LOOSE (decode, but ignore errors), or DecodeMode.STRICT
(decode, and throw a DecodeException in the event of an error).
Standards:
XML 1.0
Parameters:string s |
The string to be decoded |
DecodeMode mode |
(optional) Mode to use for decoding. (Defaults to LOOSE). |
Throws:
DecodeException if mode == DecodeMode.STRICT and decode fails
Returns:
The decoded string
Examples:
writeln(decode("a &gt; b")); // writes "a > b"
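The three decode modes differ only in how they treat input that is not correctly escaped. A minimal sketch of that difference follows, assuming a compiler release whose standard library still includes std.xml; the helper strictRejects is hypothetical.

```d
import std.xml;

// Hypothetical helper: reports whether STRICT decoding rejects the input.
bool strictRejects(string s)
{
    try
    {
        cast(void) decode(s, DecodeMode.STRICT);
        return false;
    }
    catch (DecodeException e)
    {
        return true;
    }
}

void main()
{
    // Round trip through encode() and decode()
    assert(encode("a > b") == "a &gt; b");
    assert(decode("a &gt; b") == "a > b");

    // NONE returns the input untouched
    assert(decode("a &gt; b", DecodeMode.NONE) == "a &gt; b");

    // A bare ampersand is an error: LOOSE passes it through, STRICT throws
    assert(decode("cat & dog", DecodeMode.LOOSE) == "cat & dog");
    assert(strictRejects("cat & dog"));
    assert(!strictRejects("cat &amp; dog"));
}
```

LOOSE is the default, so callers who want malformed entities reported have to ask for STRICT explicitly.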
class Document: std.xml.Element;
- Class representing an XML document.
Standards:
XML 1.0
string prolog;
- Contains all text which occurs before the root element.
Defaults to <?xml version="1.0"?>
string epilog;
- Contains all text which occurs after the root element.
Defaults to the empty string
this(string s);
- Constructs a Document by parsing XML text.
This function creates a complete DOM (Document Object Model) tree.
The input to this function MUST be valid XML.
This is enforced by DocumentParser's in contract.
Parameters:
string s |
the complete XML text. |
this(const(Tag) tag);
- Constructs a Document from a Tag.
Parameters:
const(Tag) tag |
the start tag of the document. |
const bool opEquals(Object o);
- Compares two Documents for equality
Examples:
Document d1,d2;
if (d1 == d2) { }
const int opCmp(Object o);
- Compares two Documents
You should rarely need to call this function. It exists so that
Documents can be used as associative array keys.
Examples:
Document d1,d2;
if (d1 < d2) { }
const nothrow @trusted hash_t toHash();
- Returns the hash of a Document
You should rarely need to call this function. It exists so that
Documents can be used as associative array keys.
const string toString();
- Returns the string representation of a Document. (That is, the
complete XML of a document).
class Element: std.xml.Item;
- Class representing an XML element.
Standards:
XML 1.0
Tag tag;
- The start tag of the element
Item[] items;
- The element's items
Text[] texts;
- The element's text items
CData[] cdatas;
- The element's CData items
Comment[] comments;
- The element's comments
ProcessingInstruction[] pis;
- The element's processing instructions
Element[] elements;
- The element's child elements
this(string name, string interior = null);
- Constructs an Element given a name and a string to be used as a Text
interior.
Parameters:
string name |
the name of the element. |
string interior |
(optional) the string interior. |
Examples:
auto element = new Element("title","Serenity");
this(const(Tag) tag_);
- Constructs an Element from a Tag.
Parameters:
const(Tag) tag_ |
the start or empty tag of the element. |
void opCatAssign(Text item);
- Append a text item to the interior of this element
Parameters:
Text item |
the item you wish to append. |
Examples:
auto element = new Element("p"); // any constructed Element will do
element ~= new Text("hello");
void opCatAssign(CData item);
- Append a CData item to the interior of this element
Parameters:
CData item |
the item you wish to append. |
Examples:
auto element = new Element("p"); // any constructed Element will do
element ~= new CData("hello");
void opCatAssign(Comment item);
- Append a comment to the interior of this element
Parameters:
Comment item |
the item you wish to append. |
Examples:
auto element = new Element("p"); // any constructed Element will do
element ~= new Comment("hello");
void opCatAssign(ProcessingInstruction item);
- Append a processing instruction to the interior of this element
Parameters:
ProcessingInstruction item |
the item you wish to append. |
Examples:
auto element = new Element("p"); // any constructed Element will do
element ~= new ProcessingInstruction("hello");
void opCatAssign(Element item);
- Append a complete element to the interior of this element
Parameters:
Element item |
the item you wish to append. |
Examples:
auto element = new Element("p"); // any constructed Element will do
Element other = new Element("br");
element ~= other;
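The append operators above can be combined to build a small tree by hand. The following is a minimal sketch, assuming a compiler release whose standard library still includes std.xml; the helper makeBook and its argument values are hypothetical.

```d
import std.xml;

// Hypothetical helper: builds a <book> element with one child and one comment.
Element makeBook(string id, string title)
{
    auto book = new Element("book");       // must be constructed before use
    book.tag.attr["id"] = id;              // attributes live on the start tag
    book ~= new Element("title", title);   // child element with a text interior
    book ~= new Comment("added by hand");  // comment item
    return book;
}

void main()
{
    auto book = makeBook("bk101", "Serenity");

    // Each append is recorded in items and in the matching typed array
    assert(book.items.length == 2);
    assert(book.elements.length == 1);
    assert(book.comments.length == 1);
    assert(book.elements[0].text() == "Serenity");
}
```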
bool opEquals(Object o);
- Compares two Elements for equality
Examples:
Element e1,e2;
if (e1 == e2) { }
int opCmp(Object o);
- Compares two Elements
You should rarely need to call this function. It exists so that Elements
can be used as associative array keys.
Examples:
Element e1,e2;
if (e1 < e2) { }
nothrow @safe hash_t toHash();
- Returns the hash of an Element
You should rarely need to call this function. It exists so that Elements
can be used as associative array keys.
const string text(DecodeMode mode = DecodeMode.LOOSE);
- Returns the decoded interior of an element.
The element is assumed to contain text only. So, for
example, given XML such as "<title>Good &amp;
Bad</title>", this function will return "Good & Bad".
Parameters:
DecodeMode mode |
(optional) Mode to use for decoding. (Defaults to LOOSE). |
Throws:
DecodeException if decode fails
const string[] pretty(uint indent = 2);
- Returns an indented string representation of this item
Parameters:
uint indent |
(optional) number of spaces by which to indent this
element. Defaults to 2. |
const string toString();
- Returns the string representation of an Element
Examples:
auto element = new Element("br");
writeln(element.toString()); // writes "<br />"
enum TagType: int;
- Tag types.
START
Used for start tags
END
Used for end tags
EMPTY
Used for empty tags
class Tag;
- Class representing an XML tag.
Standards:
XML 1.0
The class invariant guarantees
- that type is a valid enum TagType value
- that name consists of valid characters
- that each attribute name consists of valid characters
TagType type;
- Type of tag
string name;
- Tag name
string[string] attr;
- Associative array of attributes
this(string name, TagType type = TagType.START);
- Constructs an instance of Tag with a specified name and type
The constructor does not initialize the attributes. To initialize the
attributes, you access the attr member variable.
Parameters:
string name |
the Tag's name |
TagType type |
(optional) the Tag's type. If omitted, defaults to
TagType.START. |
Examples:
auto tag = new Tag("img",TagType.EMPTY);
tag.attr["src"] = "http://example.com/example.jpg";
const bool opEquals(Object o);
- Compares two Tags for equality
You should rarely need to call this function. It exists so that Tags
can be used as associative array keys.
Examples:
Tag tag1,tag2;
if (tag1 == tag2) { }
const int opCmp(Object o);
- Compares two Tags
Examples:
Tag tag1,tag2;
if (tag1 < tag2) { }
const nothrow @safe hash_t toHash();
- Returns the hash of a Tag
You should rarely need to call this function. It exists so that Tags
can be used as associative array keys.
const string toString();
- Returns the string representation of a Tag
Examples:
auto tag = new Tag("book",TagType.START);
writeln(tag.toString()); // writes "<book>"
const @property bool isStart();
- Returns true if the Tag is a start tag
Examples:
if (tag.isStart) { }
const @property bool isEnd();
- Returns true if the Tag is an end tag
Examples:
if (tag.isEnd) { }
const @property bool isEmpty();
- Returns true if the Tag is an empty tag
Examples:
if (tag.isEmpty) { }
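To tie the three tag types to their textual form, the sketch below constructs one Tag of each kind and checks its predicates and string representation. It assumes a compiler release whose standard library still includes std.xml; the helper render and the tag names are hypothetical.

```d
import std.xml;

// Hypothetical helper: renders a tag of the given name and type as text.
string render(string name, TagType type)
{
    auto tag = new Tag(name, type);
    return tag.toString();
}

void main()
{
    auto img = new Tag("img", TagType.EMPTY);
    img.attr["src"] = "http://example.com/example.jpg";
    assert(img.isEmpty && !img.isStart && !img.isEnd);
    assert(img.toString() == `<img src="http://example.com/example.jpg" />`);

    auto open = new Tag("book");   // type defaults to TagType.START
    assert(open.isStart);

    assert(render("book", TagType.START) == "<book>");
    assert(render("book", TagType.END) == "</book>");
}
```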
class Comment: std.xml.Item;
- Class representing a comment
this(string content);
- Construct a comment
Parameters:
string content |
the body of the comment |
Throws:
CommentException if the comment body is illegal (contains "--"
or exactly equals "-")
Examples:
auto item = new Comment("This is a comment");
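The constructor's validity rule can be exercised directly. This is a minimal sketch, assuming a compiler release whose standard library still includes std.xml; the helper tryComment is hypothetical.

```d
import std.xml;

// Hypothetical helper: returns null when the body is not a legal comment.
Comment tryComment(string content)
{
    try
        return new Comment(content);
    catch (CommentException e)
        return null;
}

void main()
{
    assert(tryComment("This is a comment") !is null);
    assert(tryComment("a -- b") is null);   // contains "--"
    assert(tryComment("-") is null);        // exactly equals "-"

    auto ok = tryComment("ok");
    assert(ok.toString() == "<!--ok-->");
}
```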
bool opEquals(Object o);
- Compares two comments for equality
Examples:
Comment item1,item2;
if (item1 == item2) { }
int opCmp(Object o);
- Compares two comments
You should rarely need to call this function. It exists so that Comments
can be used as associative array keys.
Examples:
Comment item1,item2;
if (item1 < item2) { }
nothrow @safe hash_t toHash();
- Returns the hash of a Comment
You should rarely need to call this function. It exists so that Comments
can be used as associative array keys.
const string toString();
- Returns a string representation of this comment
const @property bool isEmptyXML();
- Returns false always
class CData: std.xml.Item;
- Class representing a Character Data section
this(string content);
- Construct a character data section
Parameters:
string content |
the body of the character data segment |
Throws:
CDataException if the segment body is illegal (contains "]]>")
Examples:
auto item = new CData("<b>hello</b>");
bool opEquals(Object o);
- Compares two CDatas for equality
Examples:
CData item1,item2;
if (item1 == item2) { }
int opCmp(Object o);
- Compares two CDatas
You should rarely need to call this function. It exists so that CDatas
can be used as associative array keys.
Examples:
CData item1,item2;
if (item1 < item2) { }
nothrow @safe hash_t toHash();
- Returns the hash of a CData
You should rarely need to call this function. It exists so that CDatas
can be used as associative array keys.
const string toString();
- Returns a string representation of this CData section
const @property bool isEmptyXML();
- Returns false always
class Text: std.xml.Item;
- Class representing a text (aka Parsed Character Data) section
this(string content);
- Construct a text (aka PCData) section
Parameters:
string content |
the text. This function encodes the text before
insertion, so it is safe to insert any text |
Examples:
auto item = new Text("a < b");
bool opEquals(Object o);
- Compares two text sections for equality
Examples:
Text item1,item2;
if (item1 == item2) { }
int opCmp(Object o);
- Compares two text sections
You should rarely need to call this function. It exists so that Texts
can be used as associative array keys.
Examples:
Text item1,item2;
if (item1 < item2) { }
nothrow @safe hash_t toHash();
- Returns the hash of a text section
You should rarely need to call this function. It exists so that Texts
can be used as associative array keys.
const string toString();
- Returns a string representation of this Text section
const @property bool isEmptyXML();
- Returns true if the content is the empty string
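Because the Text constructor encodes its argument, the stored form and the decoded form differ for text containing special characters. The sketch below shows both, assuming a compiler release whose standard library still includes std.xml; the helper storedForm and the element name "expr" are hypothetical.

```d
import std.xml;

// Hypothetical helper: returns the encoded form that a Text item holds.
string storedForm(string s)
{
    auto item = new Text(s);
    return item.toString();
}

void main()
{
    // The constructor encodes, so toString() yields escaped XML
    assert(storedForm("a < b") == "a &lt; b");

    auto empty = new Text("");
    assert(empty.isEmptyXML);

    // Element.text() decodes again, giving back the original string
    auto e = new Element("expr", "a < b");
    assert(e.text() == "a < b");
}
```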
class XMLInstruction: std.xml.Item;
- Class representing an XML Instruction section
this(string content);
- Construct an XML Instruction section
Parameters:
string content |
the body of the instruction segment |
Throws:
XIException if the segment body is illegal (contains ">")
Examples:
auto item = new XMLInstruction("ATTLIST");
bool opEquals(Object o);
- Compares two XML instructions for equality
Examples:
XMLInstruction item1,item2;
if (item1 == item2) { }
int opCmp(Object o);
- Compares two XML instructions
You should rarely need to call this function. It exists so that
XMLInstructions can be used as associative array keys.
Examples:
XMLInstruction item1,item2;
if (item1 < item2) { }
nothrow @safe hash_t toHash();
- Returns the hash of an XMLInstruction
You should rarely need to call this function. It exists so that
XMLInstructions can be used as associative array keys.
const string toString();
- Returns a string representation of this XMLInstruction
const @property bool isEmptyXML();
- Returns false always
class ProcessingInstruction: std.xml.Item;
- Class representing a Processing Instruction section
this(string content);
- Construct a Processing Instruction section
Parameters:
string content |
the body of the instruction segment |
Throws:
PIException if the segment body is illegal (contains "?>")
Examples:
auto item = new ProcessingInstruction("php");
bool opEquals(Object o);
- Compares two processing instructions for equality
Examples:
ProcessingInstruction item1,item2;
if (item1 == item2) { }
int opCmp(Object o);
- Compares two processing instructions
You should rarely need to call this function. It exists so that
ProcessingInstructions can be used as associative array keys.
Examples:
ProcessingInstruction item1,item2;
if (item1 < item2) { }
nothrow @safe hash_t toHash();
- Returns the hash of a ProcessingInstruction
You should rarely need to call this function. It exists so that
ProcessingInstructions can be used as associative array keys.
const string toString();
- Returns a string representation of this ProcessingInstruction
const @property bool isEmptyXML();
- Returns false always
abstract class Item;
- Abstract base class for XML items
abstract bool opEquals(Object o);
- Compares with another Item of same type for equality
abstract int opCmp(Object o);
- Compares with another Item of same type
abstract nothrow @safe hash_t toHash();
- Returns the hash of this item
abstract const string toString();
- Returns a string representation of this item
const string[] pretty(uint indent);
- Returns an indented string representation of this item
Parameters:
uint indent |
number of spaces by which to indent child elements |
abstract const @property bool isEmptyXML();
- Returns true if the item represents empty XML text
class DocumentParser: std.xml.ElementParser;
- Class for parsing an XML Document.
This is a subclass of ElementParser. Most of the useful functions are
documented there.
Standards:
XML 1.0
BUGS:
Currently only supports UTF documents.
If there is an encoding attribute in the prolog, it is ignored.
this(string xmlText_);
- Constructs a DocumentParser.
The input to this function MUST be valid XML.
This is enforced by the function's in contract.
Parameters:
string xmlText_ |
the entire XML document as text |
class ElementParser;
- Class for parsing an XML element.
Standards:
XML 1.0
Note that you cannot construct instances of this class directly. You can
construct a DocumentParser (which is a subclass of ElementParser), but
otherwise, instances of ElementParser will be created for you by the
library, and passed your way via onStartTag handlers.
const @property const(Tag) tag();
- The Tag at the start of the element being parsed. You can read this to
determine the tag's name and attributes.
void delegate(ElementParser parser)[string] onStartTag;
- Register a handler which will be called whenever a start tag is
encountered which matches the specified name. You can also pass null as
the name, in which case the handler will be called for any unmatched
start tag.
Examples:
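// Call this delegate whenever a <podcast> start tag is encountered
onStartTag["podcast"] = (ElementParser xml)
{
    // Your code here
};
// Call myEpisodeStartHandler (defined elsewhere) for every <episode> start tag
onStartTag["episode"] = &myEpisodeStartHandler;
// Call delegate dg for all other start tags
onStartTag[null] = dg;
This library will supply your function with a new instance of
ElementParser, which may be used to parse inside the element whose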
start tag was just found, or to identify the tag attributes of the
element, etc.
Note that your function will be called for both start tags and empty
tags. That is, we make no distinction between <br></br>
and <br/>.
void delegate(const(Element) element)[string] onEndTag;
- Register a handler which will be called whenever an end tag is
encountered which matches the specified name. You can also pass null as
the name, in which case the handler will be called for any unmatched
end tag.
Examples:
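// Call this delegate whenever a </podcast> end tag is encountered
onEndTag["podcast"] = (in Element e)
{
    // Your code here
};
// Call myEpisodeEndHandler (defined elsewhere) for every </episode> end tag
onEndTag["episode"] = &myEpisodeEndHandler;
// Call delegate dg for all other end tags
onEndTag[null] = dg;
Note that your function will be called for both end tags and empty
tags. That is, we make no distinction between <br></br>
and <br/>.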
@property void onText(Handler handler);
- Register a handler which will be called whenever text is encountered.
Examples:
onText = (string s)
{
};
void onTextRaw(Handler handler);
- Register an alternative handler which will be called whenever text
is encountered. This differs from onText in that onText will decode
the text, whereas onTextRaw will not. This allows you to make design
choices, since onText will be more accurate, but slower, while
onTextRaw will be faster, but less accurate. Of course, you can
still call decode() within your handler, if you want, but you'd
probably want to use onTextRaw only in circumstances where you
know that decoding is unnecessary.
Examples:
onTextRaw = (string s)
{
};
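The difference between the two text handlers is easiest to see on text that contains an entity. The following sketch parses the same one-element document twice, assuming a compiler release whose standard library still includes std.xml; the helper rootText and the element name "msg" are hypothetical.

```d
import std.xml;

// Hypothetical helper: returns the text of the root element as delivered to
// onText (raw == false) or to onTextRaw (raw == true).
string rootText(string xmlText, bool raw)
{
    string got;
    auto parser = new DocumentParser(xmlText);
    if (raw)
        parser.onTextRaw((string s) { got ~= s; });
    else
        parser.onText = (string s) { got ~= s; };
    parser.parse();
    return got;
}

void main()
{
    enum doc = "<msg>a &lt; b</msg>";
    assert(rootText(doc, false) == "a < b");     // decoded before delivery
    assert(rootText(doc, true) == "a &lt; b");   // delivered as written
}
```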
@property void onCData(Handler handler);
- Register a handler which will be called whenever a character data
segment is encountered.
Examples:
onCData = (string s)
{
};
@property void onComment(Handler handler);
- Register a handler which will be called whenever a comment is
encountered.
Examples:
onComment = (string s)
{
};
@property void onPI(Handler handler);
- Register a handler which will be called whenever a processing
instruction is encountered.
Examples:
onPI = (string s)
{
};
@property void onXI(Handler handler);
- Register a handler which will be called whenever an XML instruction is
encountered.
Examples:
onXI = (string s)
{
};
void parse();
- Parse an XML element.
Parsing will continue until the end of the current element. Any items
encountered for which a handler has been registered will invoke that
handler.
Throws:
various kinds of XMLException
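As a compact illustration of how parse() interacts with registered handlers, the sketch below follows the same nested pattern as the module's own example, but on an in-memory string. It assumes a compiler release whose standard library still includes std.xml; the helper listBooks and the sample data are hypothetical.

```d
import std.xml;

// Hypothetical helper: collects "id: title" for every <book> element.
string[] listBooks(string xmlText)
{
    check(xmlText);   // throws CheckException if the text is not well formed

    string[] result;
    auto doc = new DocumentParser(xmlText);
    doc.onStartTag["book"] = (ElementParser book)
    {
        // Read the attribute before parsing the interior, as the module's
        // own example does
        string id = book.tag.attr["id"];
        string title;
        book.onEndTag["title"] = (in Element e) { title = e.text(); };
        book.parse();   // runs until the end of this <book> element
        result ~= id ~ ": " ~ title;
    };
    doc.parse();        // runs until the end of the root element
    return result;
}

void main()
{
    enum xmlText =
        `<catalog>` ~
        `<book id="bk101"><title>Midnight Rain</title></book>` ~
        `<book id="bk102"><title>Maeve Ascendant</title></book>` ~
        `</catalog>`;
    assert(listBooks(xmlText) == ["bk101: Midnight Rain", "bk102: Maeve Ascendant"]);
}
```

Each call to parse() only consumes the element its parser was created for, which is why the handler for "book" calls parse() on the parser it receives.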
const string toString();
- Returns that part of the element which has already been parsed
void check(string s);
- Check an entire XML document for well-formedness
Parameters:
string s |
the document to be checked, passed as a string |
Throws:
CheckException if the document is not well formed
CheckException's toString() method will yield the complete hierarchy of
parse failure (the XML equivalent of a stack trace), giving the line and
column number of every failure at every level.
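A typical way to use check() is to wrap it in a try/catch and keep the failure report for diagnostics. This is a minimal sketch, assuming a compiler release whose standard library still includes std.xml; the helper wellFormednessReport is hypothetical.

```d
import std.xml;

// Hypothetical helper: returns an empty string for well-formed input,
// otherwise the failure report from CheckException.toString().
string wellFormednessReport(string xmlText)
{
    try
    {
        check(xmlText);
        return "";
    }
    catch (CheckException e)
    {
        return e.toString();
    }
}

void main()
{
    assert(wellFormednessReport("<a><b/></a>") == "");

    // <b> is never closed, so the end tag </a> does not match
    assert(wellFormednessReport("<a><b></a>").length != 0);
}
```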
class XMLException: object.Exception;
- The base class for exceptions thrown by this module
class CommentException: std.xml.XMLException;
- Thrown during Comment constructor
class CDataException: std.xml.XMLException;
- Thrown during CData constructor
class XIException: std.xml.XMLException;
- Thrown during XMLInstruction constructor
class PIException: std.xml.XMLException;
- Thrown during ProcessingInstruction constructor
class TextException: std.xml.XMLException;
- Thrown during Text constructor
class DecodeException: std.xml.XMLException;
- Thrown during decode()
class InvalidTypeException: std.xml.XMLException;
- Thrown if comparing with wrong type
class TagException: std.xml.XMLException;
- Thrown when parsing for Tags
class CheckException: std.xml.XMLException;
- Thrown during check()
- Parent in hierarchy
- Name of production rule which failed to parse,
or specific error message
- Line number at which parse failure occurred
- Column number at which parse failure occurred