Blog

Implementing IXmlWriter Part 6: Escaping Attribute Content
Implementing IXmlWriter c++ ixmlwriter xml
Published: 2005-10-12

This is part 6/14 of my Implementing IXmlWriter post series.

Last time’s IXmlWriter has a serious bug: it doesn’t escape attribute values, so it can produce malformed XML.

Consider the following test case:

StringXmlWriter xmlWriter;
xmlWriter.WriteStartElement("root");
  xmlWriter.WriteStartElement("element");
    xmlWriter.WriteAttributeString("att", "\"");
  xmlWriter.WriteEndElement();
xmlWriter.WriteEndElement();

std::string strXML = xmlWriter.GetXmlString();

The previous version of IXmlWriter will generate the XML string <root><element att="""/></root>, which is not well-formed and will be rejected by an XML parser. The rules for XML attribute escaping are given by Section 2.3 of the XML 1.0 spec, specifically the AttValue literal:
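As a rough illustration of what the AttValue production implies: inside a double-quoted attribute value, the characters <, & and " may not appear literally and have to be written as references instead. The helper below is a minimal sketch under that reading. EscapeAttributeValue is a hypothetical name, not a function from the series, and the sketch assumes the writer always delimits attribute values with double quotes.

```cpp
#include <string>

// Hypothetical helper (not part of the series' code): escapes a value so it
// can be placed between double quotes in an XML attribute.
// Assumption: the caller always uses " as the attribute delimiter, so a
// literal apostrophe is left alone.
std::string EscapeAttributeValue(const std::string& value)
{
    std::string escaped;
    escaped.reserve(value.size());
    for (std::string::size_type i = 0; i < value.size(); ++i) {
        switch (value[i]) {
        case '&':  escaped += "&amp;";  break;  // would otherwise start a reference
        case '<':  escaped += "&lt;";   break;  // never allowed in an attribute value
        case '"':  escaped += "&quot;"; break;  // would otherwise end the value
        default:   escaped += value[i]; break;
        }
    }
    return escaped;
}
```

With this in place, the test case's att value of a single double quote would be written as att="&quot;". The sketch does nothing about tabs and line breaks, which an XML parser normalizes inside attribute values; preserving those would need character references as well.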

Read more...
Implementing IXmlWriter Part 5: Supporting WriteAttributeString()
Implementing IXmlWriter c++ ixmlwriter xml
Published: 2005-10-11

This is part 5/14 of my Implementing IXmlWriter post series.

Today I will add support for writing attributes to yesterday’s version of IXmlWriter.

To learn more about attributes, see the W3C XML 1.0 Recommendation. Writing attributes will be supported with the function WriteAttributeString().

Here’s today’s test case:

StringXmlWriter xmlWriter;
xmlWriter.WriteStartElement("root");
  xmlWriter.WriteStartElement("element");
    xmlWriter.WriteAttributeString("att", "value");
  xmlWriter.WriteEndElement();
xmlWriter.WriteEndElement();

std::string strXML = xmlWriter.GetXmlString();
// strXML should be <root><element att="value"/></root>

Because the changes in Implementing IXmlWriter Part 4 leave a start tag unclosed until a call arrives that requires it to be closed (e.g. WriteString() or WriteEndElement()), adding support for writing attributes is very simple. Here’s the version I came up with to pass all test cases:
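The full implementation is behind the link. To make the idea concrete in the meantime, here is a minimal sketch of a writer that defers closing its start tag, which is what lets an attribute be appended afterwards. SketchXmlWriter and its member names are hypothetical and not the series' StringXmlWriter; the sketch omits WriteString() and, like the version described in this post, does not escape attribute values.

```cpp
#include <string>
#include <vector>

// Hypothetical sketch, not the series' StringXmlWriter.
class SketchXmlWriter {
public:
    SketchXmlWriter() : m_startTagOpen(false) {}

    void WriteStartElement(const std::string& name)
    {
        CloseStartTag();               // a parent's pending start tag must end first
        m_xml += '<';
        m_xml += name;                 // note: no '>' yet
        m_openElements.push_back(name);
        m_startTagOpen = true;
    }

    // Only meaningful while a start tag is still open. The value is written
    // unescaped, which is the bug that Part 6 goes on to address.
    void WriteAttributeString(const std::string& name, const std::string& value)
    {
        if (!m_startTagOpen)
            return;                    // sketch: silently ignore misuse
        m_xml += ' ';
        m_xml += name;
        m_xml += "=\"";
        m_xml += value;
        m_xml += '"';
    }

    void WriteEndElement()
    {
        if (m_openElements.empty())
            return;                    // sketch: nothing to close
        if (m_startTagOpen) {
            m_xml += "/>";             // nothing was written inside: collapse
            m_startTagOpen = false;
        } else {
            m_xml += "</";
            m_xml += m_openElements.back();
            m_xml += '>';
        }
        m_openElements.pop_back();
    }

    const std::string& GetXmlString() const { return m_xml; }

private:
    void CloseStartTag()
    {
        if (m_startTagOpen) {
            m_xml += '>';
            m_startTagOpen = false;
        }
    }

    std::string m_xml;
    std::vector<std::string> m_openElements;  // names of elements still open
    bool m_startTagOpen;                      // true between "<name" and its '>'
};
```

The single flag carries the whole design: WriteAttributeString() can append to the tag only because nothing has committed the closing angle bracket yet, and the same flag tells WriteEndElement() whether the element can be collapsed.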

Read more...
Implementing IXmlWriter Part 4: Collapsing Empty Elements
Implementing IXmlWriter c++ ixmlwriter xml
Published: 2005-10-10

This is part 4/14 of my Implementing IXmlWriter post series.

One of the enhancements that XML introduced over SGML was a shorthand for specifying an element with no content: put a slash just before the closing angle bracket of the start tag. For example, <br/> is equivalent to <br></br>. Let’s add this functionality to the previous version of IXmlWriter.

Here’s the test case:

StringXmlWriter xmlWriter;
xmlWriter.WriteStartElement("root");
  xmlWriter.WriteStartElement("emptyElement");
  xmlWriter.WriteEndElement();
xmlWriter.WriteEndElement();

std::string strXML = xmlWriter.GetXmlString();
// strXML should be <root><emptyElement/></root>

How does this affect our previous implementation?

Read more...
Implementing IXmlWriter Part 3: Supporting WriteElementString()
Implementing IXmlWriter c++ ixmlwriter xml
Published: 2005-10-07

This is part 3/14 of my Implementing IXmlWriter post series.

Today’s addition to the previous iteration of IXmlWriter is quite trivial: supporting the WriteElementString() method.

Here’s the test case:

StringXmlWriter xmlWriter;
xmlWriter.WriteStartElement("root");
  xmlWriter.WriteElementString("element", "value");
xmlWriter.WriteEndElement();

std::string strXML = xmlWriter.GetXmlString();
// strXML should be <root><element>value</element></root>

Implementation is extremely simple because WriteElementString() is nothing but a convenience method which calls WriteStartElement(), WriteString(), and WriteEndElement(). Therefore, here’s the new StringXmlWriter:
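The author's class is behind the link. The delegation itself can be sketched independently of any particular writer. The free function template and the CallRecorder stand-in below are hypothetical names introduced only for illustration; the series implements WriteElementString() as a method.

```cpp
#include <string>

// Hypothetical free-function form of the convenience method. It works with
// any writer type that provides the three calls used below.
template <typename Writer>
void WriteElementString(Writer& writer,
                        const std::string& name,
                        const std::string& value)
{
    writer.WriteStartElement(name);
    writer.WriteString(value);
    writer.WriteEndElement();
}

// Hypothetical stand-in writer that only records which calls it received,
// so the order of the delegation can be inspected.
struct CallRecorder {
    std::string calls;

    void WriteStartElement(const std::string& name) { calls += "start(" + name + ");"; }
    void WriteString(const std::string& text)       { calls += "string(" + text + ");"; }
    void WriteEndElement()                          { calls += "end;"; }
};
```

Because the function adds no state of its own, any escaping or tag bookkeeping done by the three underlying methods applies to it automatically.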

Read more...
Implementing IXmlWriter Part 2: Escaping Element Content
Implementing IXmlWriter c++ ixmlwriter xml
Published: 2005-10-06

This is part 2/14 of my Implementing IXmlWriter post series.

In the previous post of this series, we ended up with a simple class which could write XML elements and element content to a std::string. However, this code has a common, serious problem that was mentioned in my post Don’t Form XML Using String Concatenation: it doesn’t properly escape XML special characters such as & and <. This means that if you call WriteString() with one of these characters, the generated XML will not be well-formed and an XML parser will reject it.

Read more...
Implementing IXmlWriter Part 1: The Basics
Implementing IXmlWriter c++ ixmlwriter xml
Published: 2005-09-30

This is part 1/14 of my Implementing IXmlWriter post series.

After writing my blog post Don’t Form XML Using String Concatenation, I realized that writing a C++ System.Xml.XmlWriter workalike involves some interesting challenges. Therefore, I’ve decided to write a series of blog posts about building a streaming C++ XML generator, a.k.a. IXmlWriter, step-by-step. For this series, I will follow the practice of test-driven development and write a test case followed by an implementation which passes the test case. Future posts’ test cases will be constructed to illustrate bugs in or new features desired from the previous post’s implementation. The test cases will be constructed with the goal of having IXmlWriter be as similar to System.Xml.XmlWriter as possible.

Read more...
Don’t Form XML Using String Concatenation
XML / XPath / XSLT c++ xml
Published: 2005-09-16

It seems very common for developers to create XML using string concatenation, as in:

std::string CreateXML
    (
    const std::string& strValue
    )
{
    std::string strXML("<tag>");
    strXML += strValue;
    strXML += "</tag>";
    return strXML;
}

As any experienced XML developer knows, this code has a bug: strValue must be escaped (& must be converted to &amp;, < must be converted to &lt;, etc.) or the generated XML will not be well-formed and an XML parser will reject it. One could write a simple function to handle this escaping, but many other issues can crop up with XML generation (string encoding, illegal element and attribute names, collapsing empty elements, making sure all opened elements are closed, the performance of string concatenation, etc.), so I recommend putting all this logic into a single class. In fact, I suggest modelling the class after .NET’s streaming XML generator, System.Xml.XmlWriter.
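For reference, the "simple function" route might look like the sketch below. EscapeText and CreateEscapedXML are hypothetical names, and the sketch covers only character escaping in element content; none of the other issues listed above are addressed.

```cpp
#include <string>

// Hypothetical helper: escapes the characters that are unsafe in XML
// element content.
std::string EscapeText(const std::string& text)
{
    std::string escaped;
    escaped.reserve(text.size());
    for (std::string::size_type i = 0; i < text.size(); ++i) {
        switch (text[i]) {
        case '&': escaped += "&amp;"; break;
        case '<': escaped += "&lt;";  break;
        case '>': escaped += "&gt;";  break;  // keeps "]]>" from appearing literally
        default:  escaped += text[i]; break;
        }
    }
    return escaped;
}

// The concatenation example again, with the value escaped before it is
// inserted. The element name "tag" is still hard-coded and unchecked.
std::string CreateEscapedXML(const std::string& strValue)
{
    std::string strXML("<tag>");
    strXML += EscapeText(strValue);
    strXML += "</tag>";
    return strXML;
}
```

This removes the one bug called out above, but every other call site that builds XML by hand would need the same care, which is the argument for a dedicated writer class.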

Read more...
Be Careful With Doubles And C++ Streams
C++ c++
Published: 2005-09-12

I ran across a piece of code recently that was using ostrstream to convert a double to a string. The code looked something like:

std::string DoubleToString(double d) {
    std::ostrstream ostr;
    ostr << d << std::ends;
    std::string str(ostr.str());
    ostr.freeze(false);
    return str;
}

This function was used to convert doubles to strings for insertion into an XML document, which were eventually parsed in an XSLT stylesheet by the XPath number() function. Most of the time it worked fine, but for really large numbers the number() function failed and returned NaN. Why?
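One likely culprit, offered here as an assumption rather than as the linked post's answer: with default formatting, a C++ stream switches to scientific notation for large magnitudes (1e20 is written as 1e+20), and the XPath 1.0 number grammar has no exponent form, so number() yields NaN for such strings. A minimal sketch of a conversion that always uses fixed notation follows. DoubleToFixedString is a hypothetical name, and the precision of six fractional digits is an arbitrary choice for illustration.

```cpp
#include <iomanip>
#include <locale>
#include <sstream>
#include <string>

// Hypothetical replacement for DoubleToString: always writes fixed notation,
// never an exponent. Infinity and NaN are not handled specially here.
std::string DoubleToFixedString(double d)
{
    std::ostringstream ostr;
    ostr.imbue(std::locale::classic());               // '.' as decimal point, no digit grouping
    ostr << std::fixed << std::setprecision(6) << d;  // 6 fractional digits: arbitrary choice
    return ostr.str();
}
```

Using std::ostringstream also removes the need for the std::ends and freeze(false) bookkeeping that ostrstream requires.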

Read more...
Use RAII
C++ c++ win32
Published: 2005-09-09

This is covered by any halfway-decent C++ book, but I believe it deserves reiteration: Use the RAII idiom. I don’t think I could explain RAII any better than HackCraft does in The RAII Programming Idiom.
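For readers who want the shape of the idiom before the Win32 example, here is a minimal, portable sketch: the constructor takes ownership of a handle and the destructor releases it exactly once, on every path out of the scope. ScopedResource and its parameters are hypothetical names for illustration, not the wrapper class developed in this post.

```cpp
// Hypothetical generic RAII holder. For an HMODULE, the handle would come
// from LoadLibrary(), the invalid value would be NULL, and the release
// function would be a small adapter that calls FreeLibrary().
template <typename Handle>
class ScopedResource {
public:
    typedef void (*ReleaseFn)(Handle);

    ScopedResource(Handle handle, Handle invalidValue, ReleaseFn release)
        : m_handle(handle), m_invalid(invalidValue), m_release(release) {}

    // Runs on normal exit, early return, and exception unwinding alike.
    ~ScopedResource()
    {
        if (m_handle != m_invalid)
            m_release(m_handle);
    }

    Handle get() const { return m_handle; }
    bool valid() const { return m_handle != m_invalid; }

private:
    // Copying would lead to a double release, so it is declared private and
    // left undefined (the pre-C++11 way of disabling it).
    ScopedResource(const ScopedResource&);
    ScopedResource& operator=(const ScopedResource&);

    Handle m_handle;
    Handle m_invalid;
    ReleaseFn m_release;
};
```

The caller never writes the release call by hand, so there is no code path on which it can be forgotten.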

Let me demonstrate how to use RAII with a semi-contrived example. Here’s the pre-RAII code:

HMODULE hm = LoadLibrary(_T("user32.dll"));
if (hm != NULL) {
    FARPROC proc = GetProcAddress(hm, "MessageBoxW");
    if (proc != NULL) {
        typedef int (WINAPI *FnMessageBoxW)(HWND, LPCWSTR, LPCWSTR, UINT);
        FnMessageBoxW fnMessageBoxW = (FnMessageBoxW) proc;
        fnMessageBoxW(NULL, L"Hello World!", L"Hello World", MB_OK);
    }
    FreeLibrary(hm);
}

In this case, the resource wrapped is HMODULE, the resource acquisition function is LoadLibrary(), and the resource release function is FreeLibrary(). Beware of resources which have multiple resource release functions, such as Win32’s HANDLE with FindClose() and CloseHandle(); for these cases you will typically have to write multiple RAII classes. I will call the wrapper class HModule. Here’s how its use will look:

Read more...