WriteStartDocument() writes the XML declaration (i.e. <?xml version="1.0"?>), and WriteEndDocument() closes all open attributes and elements and returns the IXmlWriter to its initial state. Adding support for these functions is straightforward. Note that I have introduced a new IXmlWriter state called WriteState_Prolog; this will be important later.
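To make the document-level calls concrete, here is a minimal sketch of how they might fit together. It is not the post's actual implementation: the class name DocumentSketchWriter, the State enum, and the choice to keep the buffered text after WriteEndDocument() are all assumptions made for illustration, and escaping is left out.

```cpp
#include <string>
#include <vector>

// Hypothetical sketch, not the StringXmlWriter from this series.
class DocumentSketchWriter
{
public:
    // Writes the XML declaration; only allowed before anything else is written.
    bool WriteStartDocument()
    {
        if (m_state != State::Start) return false;
        m_xml += "<?xml version=\"1.0\"?>";
        m_state = State::Prolog;
        return true;
    }

    void WriteStartElement(const std::string& name)
    {
        CloseStartTag();                 // the parent's start tag needs its '>'
        m_xml += "<" + name;
        m_openElements.push_back(name);
        m_startTagOpen = true;
        m_state = State::Content;
    }

    void WriteEndElement()
    {
        if (m_openElements.empty()) return;
        if (m_startTagOpen)
        {
            m_xml += "/>";               // nothing was written inside: collapse
            m_startTagOpen = false;
        }
        else
        {
            m_xml += "</" + m_openElements.back() + ">";
        }
        m_openElements.pop_back();
    }

    // Closes every element that is still open and resets the state.
    // Assumption: the text written so far stays in the buffer.
    void WriteEndDocument()
    {
        while (!m_openElements.empty()) WriteEndElement();
        m_state = State::Start;
    }

    const std::string& GetXmlString() const { return m_xml; }

private:
    enum class State { Start, Prolog, Content };

    void CloseStartTag()
    {
        if (m_startTagOpen) { m_xml += ">"; m_startTagOpen = false; }
    }

    std::string m_xml;
    std::vector<std::string> m_openElements;
    bool m_startTagOpen = false;
    State m_state = State::Start;
};
```

With this sketch, calling WriteStartDocument(), opening "root" and "child", and then calling WriteEndDocument() closes both elements without any explicit WriteEndElement() calls.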
WriteStartAttribute() and WriteEndAttribute() are (obviously) used to denote the start and end of an attribute; the attribute value is written using WriteString() (this usage is analogous to WriteStartElement() and WriteEndElement()). Because WriteString() must now be aware of whether it is writing an attribute value or element content, I must keep track of the state the IXmlWriter is in, a change that affects nearly every function.
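The state tracking described above could look roughly like the following. This is a sketch under stated assumptions rather than the code from the post: StatefulSketchWriter, its State enum, and the Escape() helper are hypothetical names, and invalid calls are simply ignored instead of reported.

```cpp
#include <string>
#include <vector>

// Hypothetical sketch of a writer whose WriteString() depends on its state.
class StatefulSketchWriter
{
public:
    void WriteStartElement(const std::string& name)
    {
        FinishPending();
        m_xml += "<" + name;
        m_openElements.push_back(name);
        m_state = State::StartTag;
    }

    void WriteStartAttribute(const std::string& name)
    {
        if (m_state != State::StartTag) return;   // only valid inside a start tag
        m_xml += " " + name + "=\"";
        m_state = State::Attribute;
    }

    void WriteEndAttribute()
    {
        if (m_state != State::Attribute) return;
        m_xml += "\"";
        m_state = State::StartTag;
    }

    // Writes an attribute value or element content, depending on the state.
    void WriteString(const std::string& text)
    {
        if (m_state == State::Attribute)
        {
            m_xml += Escape(text, true);
            return;
        }
        FinishPending();
        m_xml += Escape(text, false);
    }

    void WriteEndElement()
    {
        if (m_openElements.empty()) return;
        if (m_state == State::Attribute) WriteEndAttribute();
        if (m_state == State::StartTag) m_xml += "/>";
        else m_xml += "</" + m_openElements.back() + ">";
        m_openElements.pop_back();
        m_state = State::Content;
    }

    const std::string& GetXmlString() const { return m_xml; }

private:
    enum class State { Content, StartTag, Attribute };

    // Closes an unfinished attribute and start tag before content is written.
    void FinishPending()
    {
        if (m_state == State::Attribute) WriteEndAttribute();
        if (m_state == State::StartTag) { m_xml += ">"; m_state = State::Content; }
    }

    static std::string Escape(const std::string& text, bool inAttribute)
    {
        std::string escaped;
        for (char ch : text)
        {
            if (ch == '&') escaped += "&amp;";
            else if (ch == '<') escaped += "&lt;";
            else if (ch == '"' && inAttribute) escaped += "&quot;";
            else escaped += ch;
        }
        return escaped;
    }

    std::string m_xml;
    std::vector<std::string> m_openElements;
    State m_state = State::Content;
};
```

The same WriteString() call escapes a double quote inside an attribute value but leaves it alone in element content, which is the reason the writer has to know where it is.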
This is part 7/14 of my Implementing IXmlWriter post series.
Wow, I can’t believe that it’s been over a month already since my last IXmlWriter post. I guess my vacation ruined my exercise plan and my blogging habits. It’s well past time to get back into both.
Rather than introduce a new test case, I’m going to spend today “cleaning up” the previous version of IXmlWriter.
The first cleanup method is trivial but overdue — I will separate the implementation of IXmlWriter from its interface, as a user of IXmlWriter shouldn’t be particularly concerned about its implementation. In other words, I will separate it into a .h and a .cpp file. For now, IXmlWriter will continue to expose some implementation details (e.g. its private members), but these details should change relatively infrequently. If I ever need to completely separate its implementation from its interface, I will consider making it into a COM-like object or using the Pimpl idiom.
The previous version of IXmlWriter will generate the XML string <root><element att="""/></root>, which is not well-formed and will be rejected by an XML parser. The rules for XML attribute escaping are given by Section 2.3 of the XML 1.0 spec, specifically the AttValue literal:
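For a double-quoted attribute value, the AttValue production allows any character except <, & and the double quote itself, plus references. A minimal sketch of an escaping helper that follows from this is shown below; EscapeAttributeValue is a hypothetical name, not a function from the post, and it assumes the writer always delimits attribute values with double quotes.

```cpp
#include <string>

// Hypothetical helper: escapes a value destined for a double-quoted attribute.
std::string EscapeAttributeValue(const std::string& value)
{
    std::string escaped;
    escaped.reserve(value.size());
    for (char ch : value)
    {
        switch (ch)
        {
        case '&': escaped += "&amp;";  break;
        case '<': escaped += "&lt;";   break;
        case '"': escaped += "&quot;"; break;  // would otherwise end the value early
        default:  escaped += ch;       break;  // apostrophes are fine inside "..."
        }
    }
    return escaped;
}
```

With such a helper, writing an attribute whose value is a single double quote would produce att="&quot;" instead of att=""".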
StringXmlWriter xmlWriter;
xmlWriter.WriteStartElement("root");
xmlWriter.WriteStartElement("element");
xmlWriter.WriteAttributeString("att", "value");
xmlWriter.WriteEndElement();
xmlWriter.WriteEndElement();
std::string strXML = xmlWriter.GetXmlString();
// strXML should be <root><element att="value"/></root>
Because the changes in Implementing IXmlWriter Part 4 keep start elements unclosed until another function is called which requires them to be closed (e.g. WriteString() and WriteEndElement()), adding support for writing attributes is very simple. Here’s the version I came up with to pass all test cases:
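As an illustration of why the unclosed start tag makes this easy, here is a minimal sketch (not the version from the post). AttributeSketchWriter and its members are hypothetical names, escaping of the attribute value is omitted, and returning false for a misplaced attribute is an assumption about error handling.

```cpp
#include <string>
#include <vector>

// Hypothetical sketch: attributes are appended while the start tag is still open.
class AttributeSketchWriter
{
public:
    void WriteStartElement(const std::string& name)
    {
        CloseStartTag();
        m_xml += "<" + name;             // note: no '>' yet
        m_openElements.push_back(name);
        m_startTagOpen = true;
    }

    // Returns false if there is no open start tag to attach the attribute to.
    bool WriteAttributeString(const std::string& name, const std::string& value)
    {
        if (!m_startTagOpen) return false;
        m_xml += " " + name + "=\"" + value + "\"";   // escaping omitted here
        return true;
    }

    void WriteString(const std::string& text)
    {
        CloseStartTag();
        m_xml += text;                   // escaping omitted here
    }

    void WriteEndElement()
    {
        if (m_openElements.empty()) return;
        if (m_startTagOpen)
        {
            m_xml += "/>";
            m_startTagOpen = false;
        }
        else
        {
            m_xml += "</" + m_openElements.back() + ">";
        }
        m_openElements.pop_back();
    }

    const std::string& GetXmlString() const { return m_xml; }

private:
    void CloseStartTag()
    {
        if (m_startTagOpen) { m_xml += ">"; m_startTagOpen = false; }
    }

    std::string m_xml;
    std::vector<std::string> m_openElements;
    bool m_startTagOpen = false;
};
```

Because the '>' is deferred, WriteAttributeString() only has to append to the text already written; nothing needs to be inserted into the middle of the buffer.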
This is part 4/14 of my Implementing IXmlWriter post series.
One of the enhancements that XML introduced over SGML was a shorthand for specifying an element with no content by adding a trailing slash at the end of an open element. For example, <br/> is equivalent to <br></br>. Let’s add this functionality to the previous version of IXmlWriter.
Here’s the test case:
StringXmlWriter xmlWriter;
xmlWriter.WriteStartElement("root");
xmlWriter.WriteStartElement("emptyElement");
xmlWriter.WriteEndElement();
xmlWriter.WriteEndElement();
std::string strXML = xmlWriter.GetXmlString();
// strXML should be <root><emptyElement/></root>
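One way to pass this test case is to leave each start tag unclosed until the writer knows whether anything follows it. The sketch below shows the idea under stated assumptions: EmptyElementSketchWriter and its members are hypothetical names rather than the post's code, and escaping is omitted.

```cpp
#include <string>
#include <vector>

// Hypothetical sketch: start tags stay open so empty elements can collapse.
class EmptyElementSketchWriter
{
public:
    void WriteStartElement(const std::string& name)
    {
        CloseStartTag();                 // a child begins, so the parent gets its '>'
        m_xml += "<" + name;
        m_openElements.push_back(name);
        m_startTagOpen = true;
    }

    void WriteString(const std::string& text)
    {
        CloseStartTag();                 // content follows, so the tag is not empty
        m_xml += text;                   // escaping omitted here
    }

    void WriteEndElement()
    {
        if (m_openElements.empty()) return;
        if (m_startTagOpen)
        {
            m_xml += "/>";               // nothing was written: use the shorthand
            m_startTagOpen = false;
        }
        else
        {
            m_xml += "</" + m_openElements.back() + ">";
        }
        m_openElements.pop_back();
    }

    const std::string& GetXmlString() const { return m_xml; }

private:
    void CloseStartTag()
    {
        if (m_startTagOpen) { m_xml += ">"; m_startTagOpen = false; }
    }

    std::string m_xml;
    std::vector<std::string> m_openElements;
    bool m_startTagOpen = false;
};
```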
StringXmlWriter xmlWriter;
xmlWriter.WriteStartElement("root");
xmlWriter.WriteElementString("element", "value");
xmlWriter.WriteEndElement();
std::string strXML = xmlWriter.GetXmlString();
// strXML should be <root><element>value</element></root>
Implementation is extremely simple because WriteElementString() is nothing but a convenience method which calls WriteStartElement(), WriteString(), and WriteEndElement(). Therefore, here’s the new StringXmlWriter:
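A minimal sketch of that convenience method might look like this. SimpleSketchWriter is a hypothetical stand-in for the post's class: it closes start tags immediately and skips escaping, so only the shape of WriteElementString() should be taken from it.

```cpp
#include <string>
#include <vector>

// Hypothetical sketch: WriteElementString() built from the three basic calls.
class SimpleSketchWriter
{
public:
    void WriteStartElement(const std::string& name)
    {
        m_xml += "<" + name + ">";
        m_openElements.push_back(name);
    }

    void WriteString(const std::string& text)
    {
        m_xml += text;                   // escaping omitted here
    }

    void WriteEndElement()
    {
        if (m_openElements.empty()) return;
        m_xml += "</" + m_openElements.back() + ">";
        m_openElements.pop_back();
    }

    // Convenience wrapper: start tag, content, end tag in one call.
    void WriteElementString(const std::string& name, const std::string& value)
    {
        WriteStartElement(name);
        WriteString(value);
        WriteEndElement();
    }

    const std::string& GetXmlString() const { return m_xml; }

private:
    std::string m_xml;
    std::vector<std::string> m_openElements;
};
```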
This is part 2/14 of my Implementing IXmlWriter post series.
In the previous post of this series, we ended up with a simple class which could write XML elements and element content to a std::string. However, this code has a common, serious problem that was mentioned in my post Don’t Form XML Using String Concatenation: it doesn’t properly escape XML special characters such as & and <. This means that if you call WriteString() with one of these characters, the generated XML will not be well-formed and an XML parser will reject it.
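To show what the missing escaping amounts to, here is a minimal sketch of a helper that WriteString() could apply to element content. EscapeElementContent is a hypothetical name, not a function from the post.

```cpp
#include <string>

// Hypothetical helper: escapes the characters that are unsafe in element content.
std::string EscapeElementContent(const std::string& text)
{
    std::string escaped;
    escaped.reserve(text.size());
    for (char ch : text)
    {
        switch (ch)
        {
        case '&': escaped += "&amp;"; break;   // would otherwise start a reference
        case '<': escaped += "&lt;";  break;   // would otherwise start a tag
        case '>': escaped += "&gt;";  break;   // required in "]]>", harmless elsewhere
        default:  escaped += ch;      break;
        }
    }
    return escaped;
}
```

The ampersand is handled in the same pass as the other characters, so an & that is emitted as part of &lt; is never escaped a second time.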
This is part 1/14 of my Implementing IXmlWriter post series.
After writing my blog post Don’t Form XML Using String Concatenation, I realized that writing a C++ System.Xml.XmlWriter workalike involves some interesting challenges. Therefore, I’ve decided to write a series of blog posts about building a streaming C++ XML generator, a.k.a. IXmlWriter, step-by-step. For this series, I will follow the practice of test-driven development and write a test case followed by an implementation which passes the test case. Future posts’ test cases will be constructed to illustrate bugs in or new features desired from the previous post’s implementation. The test cases will be constructed with the goal of having IXmlWriter be as similar to System.Xml.XmlWriter as possible.
As any experienced XML developer knows, this code has a bug: strValue must be escaped (& must be converted to &amp;, < must be converted to &lt;, etc.) or the XML that is generated will not be well-formed and cannot be parsed by an XML parser. One could write a simple function to handle this escaping, but there are so many other issues that can crop up with XML generation (string encoding, illegal element and attribute names, collapsing empty elements, making sure all opened elements are closed, the performance of string concatenation, etc.) that I recommend putting all this logic into a single class. In fact, I suggest modelling the class after .NET’s streaming XML generator, System.Xml.XmlWriter.