STL Objects and Win32 Module Boundaries

Let’s say you have the following function:

void AppendChar(std::string& s, char ch)
{
    s += ch;
}

What happens if this function is exported as an ordinal function from a DLL (not an inlined piece of code inside a header) and you call it from an EXE?

It works most of the time. When it doesn’t, it corrupts your heap and causes a spectacular mess.

In Windows you must free memory with the same allocator that allocated it. However, your EXE may not share the same allocator as the DLL. Perhaps the two modules are linked against different versions of libc, or perhaps one of the modules is using a static version of libc. If your EXE and DLL do not share an allocator and if AppendChar resizes the string s, you will almost certainly cause a heap corruption.

The STL performs a lot of reallocations behind the scenes for you; this is one of its major benefits. Unfortunately, if you are writing a general-purpose DLL these behind-the-scene allocations are deadly. You cannot know or dictate what version of libc your clients will use.

Therefore, I reiterate my previous recommendation:

Avoid passing STL objects as parameters to DLLs.

Converting C++ Member Functions into Function Objects

Let’s say you have a C++ function that takes a function object as a parameter and calls it:

template <typename _Fn>
void call_functor(_Fn fn)
{
    fn();
}

Now let’s say you want to pass a class’s member function to call_functor() above, as in:

class C
{
    void foo() { std::cout << "foo()\n"; }
};
     
C c;
call_functor(/* What do I put here? c.foo and &C::foo don’t work */);

The STL has a pointer-to-member function adapter called std::mem_fun() which almost gets us there. Unfortunately, it doesn’t quite meet our needs because it requires us to pass a pointer to an instance of C, as in:

C c;
std::mem_fun(&C::foo)(&c); // The &c is the problem

However, we can use std::mem_fun() if we could figure out a way to create a new functor around std::mem_fun() with &c bound as its first parameter. Unfortunately, we cannot use the STL binders (std::bind1st and std::bind2nd) because they only work on binary functions, not unary functions.

In the general case, you should use Boost’s very powerful binding library bind. However, let’s write our own simple binder for expository purposes.

First, we need a function, bind(), that returns a function object which binds a parameter, p1, to a unary function object, func. We’ll call the returned function object binder.

template <typename _Func, typename _P1>
inline binder<_Func, _P1> bind(_Func func, _P1 p1)
{
    return binder<_Func, _P1>(func, p1);
}

The class binder should store func and p1 and have an operator() which calls func with p1 as its parameter. For simplicity we’ll assume func returns void:

template <typename _Func, typename _P1>
class binder
{
public:
    binder(_Func func, _P1 p1) :
        func_(func), p1_(p1) {}
    void operator()() const { return func_(p1_); }
     
private:
    _Func func_; // The functor to apply
    _P1 p1_; // The first paramter
};

We can now solve the initial problem by combining our bind() with std::mem_fun():

call_functor(bind(std::mem_fun(&C::foo), &c));

We can make usage a little more convenient by introducing a macro:

#define mem_fun_functor(c, memFn) bind(std::mem_fun(memFn), &c)
     
call_functor(mem_fun_functor(c, &C::foo));

There’s plenty of room for improvements, but it’s amazing what you can do with a little template trickery.

STL Map Use

What’s wrong with the following code?

template<typename T1, typename T2>
struct my_pair
{
    typedef T1 first_type;
    typedef T2 second_type;

    my_pair() : first(T1()), second(T2()) {}
    my_pair(const T1& v1, const T2& v2) : first(v1), second(v2) {}

    T1 first;
    T2 second;
};

template<typename T1, typename T2>
inline bool operator<
    (
    const my_pair<T1, T2>& x,
    const my_pair<T1, T2>& y
    )
{
    return (x.first < y.first || x.second < y.second);
}

void f()
{
    typedef my_pair<..., ...> key_type;
    typedef ... value_type;
    typedef std::map<key_type, value_type> map_type;

    map_type map;
    // Use map
}

Answer: my_pair cannot be used as a key for a STL map because the operator< violates the rule of strict weak ordering. More specifically, the operator is not antisymmetric. Consider the following:

typedef my_pair<int, int> P;
P p1(3, 5);
P p2(2, 6);

Both p1 < p2 and p2 < p1 evaluate to true.

If you use my_pair as a key into a map you will not get any compiler errors or warnings but you will likely see some bizarre runtime behavior. For example, keys which were successfully insert()ed may not be able to be found using find().

A correct version of operator< (which places minimal burden on T1 and T2 — they are only required to implement operator<) is:

template<typename T1, typename T2>
inline bool operator<
    (
    const my_pair<T1, T2>& x,
    const my_pair<T1, T2>& y
    )
{
    return
        (
        x.first < y.first ||
        (!(y.first < x.first) && x.second < y.second)
        );
}

STL Vector Use

I recently wrote a piece of code that looked something like the following:

static const int NUM_TOTAL_VALUES = ...;
typedef ... T;

// Create vec and reserve NUM_TOTAL_VALUES spaces for later insertion
std::vector<T> vec(NUM_TOTAL_VALUES);

// Insert values into vec
for (int i = 0; i != NUM_TOTAL_VALUES; ++i)
    vec.push_back(...);

// vec should now have NUM_TOTAL_VALUES values in it (but doesn't!)

What’s wrong with this code?

The constructor vector(size_type _Count); does more than just allocate enough space to store _Count items — it also inserts _Count (default constructed) items into the vector. To reserve space without actually inserting values, use reserve():

static const int NUM_TOTAL_VALUES = ...;
typedef ... T;

std::vector<T> vec;
vec.reserve(NUM_TOTAL_VALUES);

for (int i = 0; i != NUM_TOTAL_VALUES; ++i)
    vec.push_back(...);

// vec now has NUM_TOTAL_VALUES values in it, as intended.

Write Functions Which Take Iterators, Not Collections

If my experience is typical, this is a very common construct:

ReturnType Function
    (
    const std::vector<T>& container
    )
{
    typedef std::vector<T>::const_iterator iterator_t;
    for (iterator_t iter = container.begin();
         iter != container.end();
         ++iter) {
        // Work with *iter
    }
}

The problem with this construct is that you have forced a container choice upon the user of your function. Slightly better, and basically your only choice when interoping with C, is this:

ReturnType Function
    (
    T* array,
    int numItems
    )
{
    for (int i = 0; i < numItems; ++i) {
        // Work with array[numItems]
    }

    // Or perhaps:
    // for (T* pT = array; pT != array + numItems; ++pT) {
    //     Work with *pT
    // }
}

With the above construct you can pass in any container which uses contiguous storage, such as an array or a std::vector (yes, std::vectors are guaranteed to store the data contiguously). Passing a std::vector to the above function looks like:

std::vector<T> v;
ReturnType ret = Function(v.empty() ? 0 : &v[0], v.size());

However, in C++ its far better to do as the STL does and write your function to accept iterators, as in:

template <class InputIterator>
ReturnType Function
    (
    InputIterator first,
    InputIterator last
    )
{
    for (InputIterator iter = first; iter != last; ++iter) {
        // Work with *iter
    }
}

By using this construct, you allow vast usage flexibility. Try to limit yourself to input iterator expressions on first and last (basically preincrement, dereference, and comparison) to minimize the requirements the InputIterator class must fulfill.

Most (all?) STL containers can pass their contents to Function() by using the containers’ begin() and end() functions, as in:

std::vector<T> v;
ReturnType ret = Function(v.begin(), v.end());

As C pointers are random-access iterators, you can pass arrays to Function() as follows:

const int arraySize = ...;
T array[arraySize];

ReturnType ret = Function(array, array + arraySize);

By the way, this lesson also applies to C#: prefer writing functions which accept IEnumerables rather than collections such as arrays. (In C# 2.0, you should be able to regain the lost typesafety by accepting IEnumerable<T>)

Prefer Iteration To Indexing

I’ve seen the following STL construct countless times:

std::vector<T> container;

for (int i = 0; i < container.size(); ++i) {
    // Work with container[i]
}

Unless otherwise necessary, it is better to use an STL iterator because it enables you to more easily change the underlying container. You can isolate the code changes required to one line by using typedef, as in:

typedef std::vector<T> container_t;
container_t container;

// Or ::const_iterator as necessary
for (container_t::iterator iter = container.begin();
     iter != container.end();
     ++iter) {
    // Work with *iter
}

Note that I wrote iter != container.end() as opposed to iter < container.end(). The former only requires an input iterator, while the latter requires a random access iterator—a more complicated iterator type supported by fewer STL containers.

Also note that I wrote ++iter as opposed to iter++. In general, you should prefer the former expression because it is always at least as efficient as the latter, and often times more so.

Of course, for a code block like the one above, you really should consider using the STL algorithm for_each.