Deterministic Finalization and IDisposable Part 5: Useful IDisposable Class 3: AutoReleaseComObject

This is part 5 of my Deterministic Finalization and IDisposable post series.

This is the final example in my series on deterministic finalization in garbage-collected languages and the true motive behind the series: AutoReleaseComObject. The idea behind AutoReleaseComObject is simple: it is nothing but a wrapper around a COM object which calls Marshal.ReleaseComObject() upon Dispose() until the COM object’s reference count is 0 and the object is freed. Here’s the implementation:

using System;
using System.Diagnostics;
using System.Runtime.InteropServices;

public class AutoReleaseComObject : IDisposable
{
    private object m_comObject;
    private bool m_armed = true;
    private bool m_disposed = false;

    public AutoReleaseComObject(object comObject)
    {
        Debug.Assert(comObject != null);

        m_comObject = comObject;
    }

#if DEBUG
    ~AutoReleaseComObject()
    {
        // We should have been disposed using Dispose().
        Debug.Assert(false);
    }
#endif

    public object ComObject
    {
        get
        {
            Debug.Assert(!m_disposed);
            return m_comObject;
        }
    }

    public void Disarm()
    {
        Debug.Assert(!m_disposed);
        m_armed = false;
    }

    #region IDisposable Members

    public void Dispose()
    {
        Dispose(true);
#if DEBUG
        GC.SuppressFinalize(this);
#endif
    }

    #endregion

    protected virtual void Dispose(bool disposing)
    {
        if (!m_disposed)
        {
            if (m_armed)
            {
                int refcnt;
                do
                {
                    refcnt = Marshal.ReleaseComObject(m_comObject);
                } while (refcnt > 0);

                m_comObject = null;
            }

            m_disposed = true;
        }
    }
}

Why is this class so useful? It has to do with a topic I’ve discussed before: Excel interop. As I suggested in that post, users of the Excel object model often encounter either runaway Excel processes which never quit, or multiple Excel processes when one would suffice. Furthermore, the Excel processes tend to stay around much longer than they have to. For C++, my solution was either to call COleDispatchDriver::ReleaseDispatch() explicitly or to use the COleDispatchDriver::m_bAutoRelease flag on all Excel objects (not just the application, but any Excel object such as Range or Workbook).

In C#, you can run into the same problem — basically the Excel process will stay around as long as any Excel COM interop object has a non-zero reference count. While I suspect the .NET Excel interop objects include code in their finalizers to decrement their COM reference counts to zero, which should mean that in the worst case the Excel process will end at the same time your .NET process ends, I think we can and should do better. After all, consider the implications if your .NET process is very long-lived, or if you repeatedly, serially interact with Excel (the system will likely unnecessarily launch many Excel processes).

The solution to these problems is to call Marshal.ReleaseComObject() on all Excel objects as soon as possible. Once every object’s COM reference count reaches zero, the Excel process will terminate. Therefore, I decided to wrap this functionality into the AutoReleaseComObject class.

Unfortunately, this makes using the Excel object model quite a bit more tedious. The casting becomes annoying, but this is easily solved by writing a series of Excel object wrappers which inherit from AutoReleaseComObject and provide access to the wrapped object already cast to the appropriate type (I can’t wait for Whidbey’s generics). I called these objects ExcelApplicationWrapper, ExcelWorkbookWrapper, etc., and their implementation and use should be fairly obvious. However, consider what happens if you execute the following code:

using (ExcelApplicationWrapper excelAppWrapper =
           new ExcelApplicationWrapper(new Excel.Application()))
using (ExcelWorkbookWrapper workbookWrapper =
           new ExcelWorkbookWrapper(excelAppWrapper.Application.Workbooks.Add(Excel.XlWBATemplate.xlWBATWorksheet)))
{
    // ... Do work with workbook
}

Looks fine, doesn’t it? Wrong. excelAppWrapper.Application.Workbooks is itself an Excel object model object which also must be wrapped in AutoReleaseComObject in order for our desired behavior to happen. You need to be very careful to catch and wrap all Excel objects or you are back to square one in having near-immortal Excel processes. The above code should properly be written:

using (ExcelApplicationWrapper excelAppWrapper =
           new ExcelApplicationWrapper(new Excel.Application()))
using (ExcelWorkbooksWrapper workbooksWrapper =
           new ExcelWorkbooksWrapper(excelAppWrapper.Application.Workbooks))
using (ExcelWorkbookWrapper workbookWrapper =
           new ExcelWorkbookWrapper(workbooksWrapper.Workbooks.Add(Excel.XlWBATemplate.xlWBATWorksheet)))
{
    // ... Do work with workbook
}
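
Once generics arrive, the per-type wrappers could collapse into a single class. What follows is only a sketch under assumptions: the name TypedComObject<T> is invented here, and the Marshal.IsComObject() guard is my addition (it is not in AutoReleaseComObject above) so that the class can be tried out on a machine without Excel.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical generic variant of AutoReleaseComObject: exposes the wrapped
// object already typed as T, so callers do not need to cast.
public sealed class TypedComObject<T> : IDisposable where T : class
{
    private T m_comObject;
    private bool m_armed = true;
    private bool m_disposed = false;

    public TypedComObject(T comObject)
    {
        if (comObject == null)
            throw new ArgumentNullException("comObject");
        m_comObject = comObject;
    }

    public T ComObject
    {
        get
        {
            if (m_disposed)
                throw new ObjectDisposedException("TypedComObject");
            return m_comObject;
        }
    }

    public bool IsDisposed { get { return m_disposed; } }

    public void Disarm() { m_armed = false; }

    public void Dispose()
    {
        if (m_disposed)
            return;

        // Assumption: skip the release for non-COM objects so that the class
        // can be exercised without a COM server present.
        if (m_armed && Marshal.IsComObject(m_comObject))
        {
            // Keep releasing until the reported reference count reaches zero.
            while (Marshal.ReleaseComObject(m_comObject) > 0) { }
        }

        m_comObject = null;
        m_disposed = true;
    }
}
```

With the Excel interop assembly referenced, something like new TypedComObject<Excel.Workbooks>(excelApp.Workbooks) would presumably stand in for ExcelWorkbooksWrapper, with ComObject playing the role of the typed Workbooks property.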

Happy interop!


Deterministic Finalization and IDisposable Part 4: Useful IDisposable Class 2: AutoDeleteFile

This is part 4 of my Deterministic Finalization and IDisposable post series.

I guess my definition of tomorrow is much longer than I thought, but here’s another useful IDisposable class which I shall present without comment: AutoDeleteFile.

using System;
using System.Diagnostics;
using System.IO;

/// <summary>
/// A file wrapper which automatically deletes the file unless Disarm()
/// is called.
/// </summary>
public sealed class AutoDeleteFile : IDisposable
{
    private FileInfo m_underlyingFile;
    private bool m_armed = true;
    private bool m_disposed = false;

    public AutoDeleteFile(FileInfo underlyingFile)
    {
        Debug.Assert(underlyingFile != null);

        m_underlyingFile = underlyingFile;
    }

    ~AutoDeleteFile()
    {
        Dispose(false);
    }

    public FileInfo File
    {
        get { return m_underlyingFile; }
    }

    public void Disarm()
    {
        m_armed = false;
    }

    #region IDisposable Members

    public void Dispose()
    {
        Dispose(true);
        GC.SuppressFinalize(this);
    }

    #endregion

    private void Dispose(bool disposing)
    {
        if (!m_disposed)
        {
            if (m_armed)
            {
                try
                {
                    m_underlyingFile.Delete();
                }
                catch (Exception)
                {
                    // If we can't delete, oh well!
                }
            }

            m_disposed = true;
        }
    }
}

Deterministic Finalization and IDisposable Part 3: Useful IDisposable Class 1: TimedLock

This is part 3 of my Deterministic Finalization and IDisposable post series.

For the first example of a useful custom class which implements IDisposable, I will simply link to and reproduce Ian Griffiths’ TimedLock, an enhancement of the C# lock statement which allows the specification of a timeout period instead of blocking forever while trying to obtain the lock.

The code for TimedLock is reproduced below:

using System;
using System.Threading;

// Thanks to Eric Gunnerson for recommending this be a struct rather
// than a class - avoids a heap allocation.
// Thanks to Change Gillespie and Jocelyn Coulmance for pointing out
// the bugs that then crept in when I changed it to use struct...
// Thanks to John Sands for providing the necessary incentive to make
// me invent a way of using a struct in both release and debug builds
// without losing the debug leak tracking.
public struct TimedLock : IDisposable
{
    public static TimedLock Lock (object o)
    {
        return Lock (o, TimeSpan.FromSeconds (10));
    }

    public static TimedLock Lock (object o, TimeSpan timeout)
    {
        TimedLock tl = new TimedLock (o);
        if (!Monitor.TryEnter (o, timeout))
        {
#if DEBUG
            System.GC.SuppressFinalize(tl.leakDetector);
#endif
            throw new LockTimeoutException ();
        }

        return tl;
    }

    private TimedLock (object o)
    {
        target = o;
#if DEBUG
        leakDetector = new Sentinel();
#endif
    }

    private object target;

    public void Dispose ()
    {
        Monitor.Exit (target);

        // It's a bad error if someone forgets to call Dispose,
        // so in Debug builds, we put a finalizer in to detect
        // the error. If Dispose is called, we suppress the
        // finalizer.
#if DEBUG
        GC.SuppressFinalize(leakDetector);
#endif
    }

#if DEBUG
    // (In Debug mode, we make it a class so that we can add a finalizer
    // in order to detect when the object is not freed.)
    private class Sentinel
    {
        ~Sentinel()
        {
            // If this finalizer runs, someone somewhere failed to
            // call Dispose, which means we've failed to leave
            // a monitor!
            System.Diagnostics.Debug.Fail("Undisposed lock");
        }
    }
    private Sentinel leakDetector;
#endif
}

public class LockTimeoutException : ApplicationException
{
    public LockTimeoutException () : base("Timeout waiting for lock")
    {
    }
}

It is trivial to use TimedLock instead of lock in your applications. Simply change statements from:

lock (objectToLock)
{
    // ... Do work while holding lock
}

… to:

using (TimedLock.Lock(objectToLock))
{
    // ... Do work while holding lock
}

Others have enhanced TimedLock even further, such as by having it keep track of the stack trace of the thread which is holding the lock.
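
To see what the using form saves you from writing, here is roughly the hand-rolled equivalent built directly on Monitor.TryEnter(). This is a sketch and not Ian’s code: the LockDemo class and its TryRunLocked method are names invented for illustration, and it reports a timeout by returning false where TimedLock throws LockTimeoutException.

```csharp
using System;
using System.Threading;

public static class LockDemo
{
    // Runs 'work' while holding 'gate', giving up if the lock cannot be
    // acquired within 'timeout'. Returns true if the work ran.
    public static bool TryRunLocked(object gate, TimeSpan timeout, Action work)
    {
        if (!Monitor.TryEnter(gate, timeout))
            return false; // Timed out; TimedLock would throw here instead.

        try
        {
            work();
            return true;
        }
        finally
        {
            // Always leave the monitor, even if work() throws.
            Monitor.Exit(gate);
        }
    }
}
```

The try/finally is exactly the part that TimedLock’s Dispose() and the using statement take care of for you.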

Deterministic Finalization and IDisposable Part 1: The Basics

This is part 1 of my Deterministic Finalization and IDisposable post series.

This topic has been covered many times by many others (such as here and here), so if you are familiar with C#’s using statement and IDisposable interface, feel free to skip this post. I’m writing this introduction to provide the necessary background information to set up a series of subsequent posts.

Garbage collection, found in languages such as C# and Java (among many others), is a very useful feature: it largely alleviates the need for a programmer to manually handle resource management. The most commonly cited benefit is that garbage collection eliminates the need for the programmer to explicitly call heap memory management functions such as malloc and free; instead, the garbage collector automatically keeps track of whether objects are still in use and frees them when they are no longer needed.[1] However, in addition to handling memory management, garbage collection may also release other scarce resources upon cleanup, such as file locks or network connections.

An important point to note about most (all?) garbage collectors is that they are nondeterministic. This means that, in general, a programmer does not and should not know when the actual garbage collection phase happens.[2] In other words, a program could stop using an object but its underlying memory may not be freed for seconds, minutes, hours, days, or possibly ever. Usually this is a good thing; it can often be a large performance boost.

However, as I mentioned above, garbage collection manages more than just memory. Consider what happens when you call .NET’s File.Open() method, which returns a FileStream object with which you can read and write bytes to the file. Unless explicitly specified otherwise, the FileStream will create an exclusive lock on the underlying file; no other process (or thread) will be able to open the file for reading or writing while the FileStream is open. Usually this isn’t much of a problem, as once the process has ended the file will be closed and most processes are short-lived.

Consider, if you will, the case where the process isn’t short-lived. Perhaps the process opened up the file and wrote to it without explicitly closing it, expecting the garbage collector to eventually notice that the process was done with the file and to close it, releasing the lock. However, as the garbage collector is nondeterministic, we simply don’t know when (if ever) the garbage collector will close the file, and the process will keep a lock on the file for potentially a very long time.[4]

Another way to illustrate the above problem is to consider the following C# code which first writes to a file and then immediately reopens the file to read from it; the code as shown is virtually guaranteed to fail.

string filename = ...;

FileStream writeStream = File.Open(filename, FileMode.Create, FileAccess.Write);
writeStream.Write(...);

// The following line is virtually guaranteed to throw an IOException:
// writeStream still holds the file open exclusively, as it has been
// neither closed nor garbage collected yet.
FileStream readStream = File.Open(filename, FileMode.Open, FileAccess.Read);

Now, many developers will say “That’s easy to solve. Just call the FileStream.Close() method when you are done with the FileStream.” (A few may say to call GC.Collect(), but that’s a bad idea.[3]) OK, fine, let’s add the Close() to the above code:

string filename = ...;

FileStream writeStream = File.Open(filename, FileMode.Create, FileAccess.Write);
writeStream.Write(...);
writeStream.Close();

In the above code, what happens if writeStream.Write() throws an exception which is caught and handled at a higher level? That’s right: Close() is never called and once again you are dependent on the whims of the garbage collector to clean up the file.[5]

One common solution to the above problem is to wrap the code using a try {} finally {} block. For example:

string filename = ...;

FileStream writeStream = null;
try
{
    writeStream = File.Open(filename, FileMode.Create, FileAccess.Write);
    writeStream.Write(...);
}
finally
{
    if (writeStream != null)
        writeStream.Close();
}

The C# developers, being pretty bright people, recognized that the above situation is actually fairly common — that in addition to garbage collection’s nondeterministic finalization, programs also often need a form of deterministic finalization to free scarce resources as soon as possible. To this end, they invented two concepts: the IDisposable interface and the using statement.

The IDisposable interface contains exactly one method: Dispose(). It is nothing but a cleanup method which uses a slightly more generic name than Close(). Many diverse objects implement IDisposable, from AsymmetricAlgorithm to Image to SqlConnection. A list of direct implementers of IDisposable in the .NET Class Library is here, but please note that it doesn’t include classes which indirectly implement IDisposable by having a parent (or grandparent, or great-grandparent…) class which is a direct implementer.

The using statement is basically nothing but syntactic sugar, as

using (FileStream fs = File.Open(filename, FileMode.Create, FileAccess.Write))
{
    // ... do work with fs
}

… is more-or-less short for

FileStream fs = null;
try
{
    fs = File.Open(filename, FileMode.Create, FileAccess.Write);
    // ... do work with fs
}
finally
{
    if (fs != null)
    {
        ((IDisposable) fs).Dispose();
    }
}
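
A small experiment makes the guarantee concrete. The sketch below uses a hypothetical TrackedResource class (my invention, not part of .NET) whose Write() always throws; the using block still calls Dispose() before the exception reaches the catch handler.

```csharp
using System;

// Hypothetical stand-in for a scarce resource such as a FileStream.
public sealed class TrackedResource : IDisposable
{
    public bool IsDisposed { get; private set; }

    public void Write()
    {
        throw new InvalidOperationException("simulated write failure");
    }

    public void Dispose()
    {
        IsDisposed = true;
    }
}

public static class UsingDemo
{
    // Returns true if Dispose() ran even though Write() threw.
    public static bool DisposeRunsOnException()
    {
        TrackedResource resource = new TrackedResource();
        try
        {
            using (resource)
            {
                resource.Write();
            }
        }
        catch (InvalidOperationException)
        {
            // By the time control arrives here, the using block has
            // already called Dispose().
        }
        return resource.IsDisposed;
    }
}
```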

The cast in the code fragment ((IDisposable) fs).Dispose(); is necessary because it is possible in C# to implement interface methods which are only exposed via that particular interface and not by the implementing class (see here). In other words, the following code won’t compile:

class A : IDisposable
{
    void IDisposable.Dispose() { ... }
}

A a = new A();
a.Dispose();

… whereas if you replace a.Dispose() with ((IDisposable) a).Dispose(); it will. This was likely added to allow a class to implement two separate interfaces which have a method with an identical name and signature.
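
Here is a sketch of that two-interface case. ILeft, IRight, and Both are made-up names for illustration only.

```csharp
public interface ILeft  { string Name(); }
public interface IRight { string Name(); }

// Each Name() is implemented explicitly, so each is reachable only
// through its own interface.
public class Both : ILeft, IRight
{
    string ILeft.Name()  { return "left"; }
    string IRight.Name() { return "right"; }
}

public static class ExplicitDemo
{
    public static string Describe()
    {
        Both b = new Both();
        // b.Name() would not compile: the class itself exposes no Name().
        return ((ILeft) b).Name() + "/" + ((IRight) b).Name();
    }
}
```

ExplicitDemo.Describe() returns "left/right": the cast selects which of the two implementations gets called.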

People familiar with C++ may note, as Herb Sutter did, that using and IDisposable are little but a more verbose (and perhaps uglier) form of a C++ destructor. Furthermore, since a C++ destructor is automatically executed (whether upon block exit for stack-based objects or upon delete for heap-based objects), whereas Dispose() must be explicitly invoked, one is much less likely to forget to call a C++ destructor (i.e. essentially never unless one leaks memory). This is important because it is usually bad to forget to call Dispose() for any objects which implement IDisposable once you are done with them. (By the way, Anders Hejlsberg, I wouldn’t mind a construct in C# which provides for automatically calling Dispose() at block-end; it would help eliminate using’s verbosity.)

In my upcoming posts, I will discuss some guidelines for writing classes which implement IDisposable and then describe and demonstrate some useful classes which I have written that implement IDisposable.

[1] If you are interested as to how the .NET garbage collector works, read the article Garbage Collector Basics and Performance Hints on MSDN.
[2] Savvy readers may be aware that many garbage collected languages provide a way for the programmer to force (more like strongly suggest) that a garbage collection happen at this instant, such as .NET’s GC.Collect() method.[3]
[3] Extremely savvy readers may be aware that in general calling the GC.Collect() method is a bad idea.
[4] File locking isn’t the only reason to worry about nondeterministic finalization of FileStream objects. Another concern is the fact that FileStream performs buffering, and the data won’t be flushed unless Flush(), Close(), or Dispose() is called. Therefore, if you were to open up a file for writing with the permissive FileShare.Read flag (which probably isn’t a good idea in most cases), there’s a high probability that readers will see incomplete data until the aforementioned functions are called (either explicitly or through a form of deterministic finalization).
[5] I used the example of file locking because it is close to heart. At a previous job I had to deal with the problem of a coworker inadvertently holding onto locks in perpetuity in a daemon process quite a few times. I presume the problem related to not closing the file when exceptions were thrown (otherwise it would have happened more often). Unfortunately the code was apparently poorly designed or not understood and the program was not fixed; instead the solution was to reboot the machine. Yow.