Implementing IXmlWriter Part 7: Cleaning Up

This is part 7 of my Implementing IXmlWriter post series.

Wow, I can’t believe that it’s been over a month already since my last IXmlWriter post. I guess my vacation ruined my exercise plan and my blogging habits. It’s well past time to get back into both.

Rather than introduce a new test case, I’m going to spend today “cleaning up” the previous version of IXmlWriter.

The first cleanup method is trivial but overdue — I will separate the implementation of IXmlWriter from its interface, as a user of IXmlWriter shouldn’t be particularly concerned about its implementation. In other words, I will separate it into a .h and a .cpp file. For now, IXmlWriter will continue to expose some implementation details (e.g. its private members), but these details should change relatively infrequently. If I ever need to completely separate its implementation from its interface, I will consider making it into a COM-like object or using the Pimpl idiom.

Secondly, I will disable the default, compiler-created copy constructor and assignment operator. While they would work under the current implementation, there isn’t a unit test ensuring they will continue to work as I change the implementation. I’m also not exactly sure how useful these abilities are — why would someone want to copy an instance of IXmlWriter? Finally, imagine the confusion if a user accidentally declared a function to use an IXmlWriter rather than an IXmlWriter&. In general, I think it is best to disable a class’s copy constructor and assignment operator by default and enable them only as required.

Finally, and perhaps more controversially, I didn’t like that the definitions of the translations required for character data and attribute values were stored as member variables of the class as they seemed to pollute its “private interface” somewhat. Even worse, I didn’t like that these translations were initialized every time the class is instantiated; the translations are constant data and this initialization seems wasteful. Finally, I’m not so sure a std::map object is the best choice to store these values as it seems unnecessarily complex. Therefore, I decided to store the translations as an array of structs and modify the TranslateString() function accordingly.

Here’s the resulting header file:

// StringXmlWriter.h

class StringXmlWriter
{
private:
    std::stack<std::string> m_openedElements;
    std::string m_xmlStr;
    bool m_unclosedStartElement;

public:
    StringXmlWriter();

    std::string GetXmlString() const;
    void WriteAttributeString(const std::string& localName,
                              const std::string& value);
    void WriteElementString(const std::string& localName,
                            const std::string& value);
    void WriteEndElement();
    void WriteStartElement(const std::string& localName);
    void WriteString(const std::string& value);

private:
    // Disable copy construction and assignment
    StringXmlWriter(const StringXmlWriter&);
    StringXmlWriter& operator=(const StringXmlWriter&);
};

Here’s the implementation file:

// StringXmlWriter.cpp

#include "StringXmlWriter.h"

#define ARRAYSIZE(x) ( sizeof(x) / sizeof(x[0]) )

struct CharTranslation
{
    char OriginalChar;
    const char* ReplacementString;
};

static const CharTranslation AttributeValueTranslations[] =
{
    { '"', "&quot;" },
    { '&', "&amp;" },
};

static const CharTranslation CharDataTranslations[] =
{
    { '&', "&amp;" },
    { '<', "&lt;" },
    { '>', "&gt;" },
};

struct OriginalCharEquals :
    public std::binary_function<CharTranslation, char, bool>
{
    bool operator() (const CharTranslation& translation, char ch) const
    {
        return (translation.OriginalChar == ch);
    }
};

static std::string TranslateString(const std::string& originalStr,
                                   const CharTranslation* translations,
                                   int numTranslations)
{
    // Actually one past end, needed for proper std::find_if semantics
    const CharTranslation* endTranslations = translations + numTranslations;

    std::string translatedStr;
    for (std::string::const_iterator stringIter = originalStr.begin();
         stringIter != originalStr.end();
         ++stringIter) {
        char ch = *stringIter;

        const CharTranslation* translation = std::find_if
            (
            translations,
            endTranslations,
            std::bind2nd(OriginalCharEquals(), ch)
            );
        if (translation != endTranslations) {
            translatedStr += translation->ReplacementString;
        } else {
            translatedStr += ch;
        }
    }

    return translatedStr;
}

StringXmlWriter::StringXmlWriter() : m_unclosedStartElement(false)
{
}

std::string StringXmlWriter::GetXmlString() const
{
    return m_xmlStr;
}

void StringXmlWriter::WriteAttributeString(const std::string& localName,
                                           const std::string& value)
{
    m_xmlStr += ' ';
    m_xmlStr += localName;
    m_xmlStr += "=\"";
    m_xmlStr += TranslateString
        (
        value,
        AttributeValueTranslations,
        ARRAYSIZE(AttributeValueTranslations)
        );
    m_xmlStr += '"';
}

void StringXmlWriter::WriteElementString(const std::string& localName,
                                         const std::string& value)
{
    WriteStartElement(localName);
    WriteString(value);
    WriteEndElement();
}

void StringXmlWriter::WriteEndElement()
{
    if (m_unclosedStartElement) {
        m_xmlStr += "/>";
        m_unclosedStartElement = false;
    } else {
        std::string lastOpenedElement = m_openedElements.top();
        m_xmlStr += "</";
        m_xmlStr += lastOpenedElement;
        m_xmlStr += '>';
    }
    m_openedElements.pop();
}

void StringXmlWriter::WriteStartElement(const std::string& localName)
{
    if (m_unclosedStartElement) {
        m_xmlStr += '>';
        m_unclosedStartElement = false;
    }

    m_openedElements.push(localName);
    m_xmlStr += '<';
    m_xmlStr += localName;
    m_unclosedStartElement = true;
}

void StringXmlWriter::WriteString(const std::string& value)
{
    if (m_unclosedStartElement) {
        m_xmlStr += '>';
        m_unclosedStartElement = false;
    }

    m_xmlStr += TranslateString
        (
        value,
        CharDataTranslations,
        ARRAYSIZE(CharDataTranslations)
        );
}

By the way, if you can figure out a more elegant way to perform the translations, I’d love to hear it. I looked at std::transform but that can only do T->T translations. I wish I could have figured out a way to avoid writing the OriginalCharEquals function object and used something like std::equal_to, but I couldn’t find one which could compare the char OriginalChar member of CharTranslation to a separate char variable. Finally, I think the character-by-character translation method is likely slow and produces a rather large number of reallocations.

About Steven Engelhardt, CFA, AIF
Adjunct Professor of Software Engineering at DePaul University • Software Engineering, Data & Analytics in FinTech • Lives in Chicago, IL

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google+ photo

You are commenting using your Google+ account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

w

Connecting to %s