This is part 1/5 of my Deterministic Finalization and IDisposable post series.
This topic has been covered many times by many others (such as here and here), so if you are familiar with C#’s using statement and IDisposable interface, feel free to skip this post. I’m writing this introduction to provide the necessary background for a series of subsequent posts.
Garbage collection, found in languages such as C# and Java (among many others), is a very useful feature: it largely alleviates the need for a programmer to manually handle resource management. The most commonly cited benefit is that garbage collection eliminates the need for the programmer to explicitly call heap memory management functions such as malloc and free; instead, the garbage collector automatically keeps track of whether objects are still in use and frees them when they are no longer needed.[1] However, in addition to handling memory management, garbage collection may also release other scarce resources upon cleanup, such as file locks or network connections.
An important point to note about most (all?) garbage collectors is that they are nondeterministic. This means that, in general, a programmer does not and should not know when the actual garbage collection phase happens.[2] In other words, a program could stop using an object but its underlying memory may not be freed for seconds, minutes, hours, days, or possibly ever. Usually this is a good thing; deferring cleanup can often be a large performance boost.
However, as I mentioned above, garbage collection manages more than just memory. Consider what happens when you call .NET’s File.Open() method, which returns a FileStream object with which you can read bytes from and write bytes to the file. Unless explicitly specified otherwise, the FileStream will create an exclusive lock on the underlying file; no other process (or thread) will be able to open the file for reading or writing while the FileStream is open. Usually this isn’t much of a problem, as once the process has ended the file will be closed and most processes are short-lived.
Consider, if you will, the case where the process isn’t short-lived. Perhaps the process opened up the file and wrote to it without explicitly closing it, expecting the garbage collector to eventually notice that the process was done with the file and to close it, releasing the lock. However, as the garbage collector is nondeterministic, we simply don’t know when (if ever) the garbage collector will close the file, and the process will keep a lock on the file for potentially a very long time.[4]
Another way to illustrate the above problem is to consider C# code which first writes to a file and then, without closing the first stream, immediately reopens the file to read from it; such code is virtually guaranteed to fail.
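A minimal sketch of such code follows. The file name test.txt, the class and method names, and the bytes written are assumptions for illustration. The second File.Open() is expected to throw an IOException while the first stream is still open; the sketch catches it so that it can report what happened.

```csharp
using System;
using System.IO;

static class ReopenWithoutClose
{
    // Returns true if reopening the file fails because the writer still holds it open.
    public static bool ReopenFails(string path)
    {
        byte[] data = { 1, 2, 3 };

        // File.Open(path, mode) shares nothing by default, so this stream
        // holds an exclusive lock on the file.
        FileStream writeStream = File.Open(path, FileMode.Create);
        writeStream.Write(data, 0, data.Length);
        // Note: no Close() here. The stream, and its lock, stay alive.

        try
        {
            FileStream readStream = File.Open(path, FileMode.Open);
            readStream.Close();
            return false;
        }
        catch (IOException)
        {
            return true; // the file is still locked by writeStream
        }
        finally
        {
            writeStream.Close(); // only so that the demo cleans up after itself
        }
    }

    static void Main()
    {
        Console.WriteLine(ReopenFails("test.txt"));
        File.Delete("test.txt");
    }
}
```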
Now, many developers will say “That’s easy to solve. Just call the FileStream.Close() method when you are done with the FileStream.” (A few may say to call GC.Collect(), but that’s a bad idea.[3]) OK, fine, let’s add a call to Close() right after the write, before the file is reopened.
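A sketch of the version with Close() added, under the same assumptions as before (hypothetical file name, class name, and data):

```csharp
using System;
using System.IO;

static class ReopenAfterClose
{
    // Writes three bytes, closes the stream, then reopens the file and
    // returns the first byte read back.
    public static int WriteThenRead(string path)
    {
        byte[] data = { 1, 2, 3 };

        FileStream writeStream = File.Open(path, FileMode.Create);
        writeStream.Write(data, 0, data.Length);
        writeStream.Close(); // flushes the data and releases the lock

        FileStream readStream = File.Open(path, FileMode.Open);
        int firstByte = readStream.ReadByte();
        readStream.Close();
        return firstByte;
    }

    static void Main()
    {
        Console.WriteLine(WriteThenRead("test.txt")); // prints 1
        File.Delete("test.txt");
    }
}
```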
With Close() added, what happens if writeStream.Write() throws an exception which is caught and handled at a higher level? That’s right: Close() is never called, and once again you are dependent on the whims of the garbage collector to clean up the file.[5]
One common solution to the above problem is to wrap the code in a try {} finally {} block, with the call to Close() placed in the finally clause so that it runs whether or not an exception is thrown.
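A sketch of the try {} finally {} approach. The WriteSafely helper and its parameters are hypothetical names chosen for this example; the point is only that Close() sits in the finally clause.

```csharp
using System;
using System.IO;

static class CloseInFinally
{
    // Writes data to the file and closes the stream even if the write throws.
    public static void WriteSafely(string path, byte[] data)
    {
        FileStream writeStream = File.Open(path, FileMode.Create);
        try
        {
            writeStream.Write(data, 0, data.Length);
        }
        finally
        {
            writeStream.Close(); // runs on both the normal and the exceptional path
        }
    }

    static void Main()
    {
        WriteSafely("test.txt", new byte[] { 1, 2, 3 });

        // The lock has been released, so the file can be reopened immediately.
        FileStream readStream = File.Open("test.txt", FileMode.Open);
        Console.WriteLine(readStream.Length); // prints 3
        readStream.Close();
        File.Delete("test.txt");
    }
}
```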
The C# developers, being pretty bright people, recognized that the above situation is actually fairly common: in addition to garbage collection’s nondeterministic finalization, programs also often need a form of deterministic finalization to free scarce resources as soon as possible. To this end, they invented two concepts: the IDisposable interface and the using statement.
The IDisposable interface contains exactly one method: Dispose(). It is nothing but a cleanup method which uses a slightly more generic name than Close(). Many diverse classes implement IDisposable, from AsymmetricAlgorithm to Image to SqlConnection. A list of direct implementers of IDisposable in the .NET Class Library is here, but please note that it doesn’t include classes which indirectly implement IDisposable by having a parent (or grandparent, or great-grandparent…) class which is a direct implementer.
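To make the interface concrete, here is a small sketch of a class that implements it. TempFile is a hypothetical class invented for this example (it is not part of .NET): it owns a temporary file and deletes it in Dispose().

```csharp
using System;
using System.IO;

// Hypothetical example class: owns a temporary file and deletes it on Dispose().
sealed class TempFile : IDisposable
{
    public string FilePath { get; private set; }

    public TempFile()
    {
        // GetTempFileName() creates an empty file and returns its path.
        FilePath = Path.GetTempFileName();
    }

    public void Dispose()
    {
        // Safe to call more than once: deleting a missing file is a no-op.
        File.Delete(FilePath);
    }
}

static class TempFileDemo
{
    static void Main()
    {
        TempFile temp = new TempFile();
        try
        {
            File.WriteAllText(temp.FilePath, "scratch data");
            Console.WriteLine(File.Exists(temp.FilePath)); // True
        }
        finally
        {
            temp.Dispose();
        }
        Console.WriteLine(File.Exists(temp.FilePath)); // False
    }
}
```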
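The using statement is basically nothing but syntactic sugar: a using block which declares a FileStream named fs is more-or-less short for a try {} finally {} block whose finally clause calls ((IDisposable) fs).Dispose();.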
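The two forms can be sketched side by side as follows. The class and method names are hypothetical, and the second method is only an approximation of what the compiler generates, not its exact output.

```csharp
using System;
using System.IO;

static class UsingExpansion
{
    // The sugared form: fs is disposed when control leaves the block.
    public static void WithUsing(string path)
    {
        using (FileStream fs = File.Open(path, FileMode.Create))
        {
            fs.WriteByte(42);
        }
    }

    // Roughly what the using statement above stands for.
    public static void WithTryFinally(string path)
    {
        FileStream fs = File.Open(path, FileMode.Create);
        try
        {
            fs.WriteByte(42);
        }
        finally
        {
            if (fs != null)
            {
                ((IDisposable) fs).Dispose();
            }
        }
    }

    static void Main()
    {
        WithUsing("test.txt");
        WithTryFinally("test.txt"); // succeeds: the first stream was already disposed
        Console.WriteLine(new FileInfo("test.txt").Length); // prints 1
        File.Delete("test.txt");
    }
}
```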
The cast in the code fragment ((IDisposable) fs).Dispose(); is necessary because it is possible in C# to implement interface methods which are only exposed via that particular interface and not by the implementing class (see here). In other words, if a class A implements Dispose() in this explicit fashion, calling a.Dispose() on an instance a of A won’t compile, whereas ((IDisposable) a).Dispose(); will. This was likely added to allow a class to implement two separate interfaces which have a method with an identical name and signature.
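A sketch of such an explicit interface implementation. The Disposed property is an assumption added only so that the effect of the call is observable, and the call that would not compile is left commented out so that the rest of the example builds.

```csharp
using System;

class A : IDisposable
{
    public bool Disposed { get; private set; }

    // Explicit interface implementation: it has no access modifier and is
    // only reachable through a reference of type IDisposable.
    void IDisposable.Dispose()
    {
        Disposed = true;
    }
}

static class ExplicitDisposeDemo
{
    static void Main()
    {
        A a = new A();
        // a.Dispose();                // would not compile: A exposes no Dispose() member
        ((IDisposable) a).Dispose();   // compiles and runs
        Console.WriteLine(a.Disposed); // True
    }
}
```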
People familiar with C++ may note, as Herb Sutter did, that using and IDisposable are little more than a more verbose (and perhaps uglier) form of a C++ destructor. Furthermore, since a C++ destructor is automatically executed (whether upon block exit for stack-based objects or upon delete for heap-based objects), whereas Dispose() must be explicitly invoked, one is much less likely to forget to call a C++ destructor (i.e. essentially never unless one leaks memory). This is important because it is usually bad to forget to call Dispose() on any object which implements IDisposable once you are done with it. (By the way, Anders Hejlsberg, I wouldn’t mind a construct in C# which provides for automatically calling Dispose() at block-end; it would help eliminate using’s verbosity.)
In my upcoming posts, I will discuss some guidelines for writing classes which implement IDisposable and then describe and demonstrate some useful classes which I have written that implement IDisposable.
Footnotes
1. If you are interested in how the .NET garbage collector works, read the article Garbage Collector Basics and Performance Hints on MSDN.
2. Savvy readers may be aware that many garbage-collected languages provide a way for the programmer to force (more like strongly suggest) that a garbage collection happen at this instant, such as .NET’s GC.Collect() method.[3]
3. Extremely savvy readers may be aware that, in general, calling the GC.Collect() method is a bad idea.
4. File locking isn’t the only reason to worry about nondeterministic finalization of FileStream objects. Another concern is the fact that FileStream performs buffering, and the data won’t be flushed unless Flush(), Close(), or Dispose() is called. Therefore, if you were to open up a file for writing with the permissive FileShare.Read flag (which probably isn’t a good idea in most cases), there’s a high probability that readers will see incomplete data until one of the aforementioned methods is called (either explicitly or through a form of deterministic finalization).
5. I used the example of file locking because it is close to my heart. At a previous job I had to deal, quite a few times, with a coworker’s daemon process inadvertently holding onto locks in perpetuity. I presume the problem related to not closing the file when exceptions were thrown (otherwise it would have happened more often). Unfortunately the code was apparently poorly designed or not understood and the program was never fixed; instead, the solution was to reboot the machine. Yow.
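The buffering concern in footnote 4 can be sketched as follows. The class and method names are hypothetical, and the sketch assumes FileStream’s default buffer is larger than the three bytes written, so a reader would typically see an empty file before Flush() and all three bytes afterwards.

```csharp
using System;
using System.IO;

static class BufferedWriteDemo
{
    // Length of the file as seen through a separate reader stream.
    static long LengthSeenByReader(string path)
    {
        // The reader has to tolerate the writer's existing write access,
        // hence FileShare.ReadWrite.
        using (FileStream reader = new FileStream(
            path, FileMode.Open, FileAccess.Read, FileShare.ReadWrite))
        {
            return reader.Length;
        }
    }

    // Returns the file length a reader sees before and after the writer flushes.
    public static long[] LengthsBeforeAndAfterFlush(string path)
    {
        byte[] data = { 1, 2, 3 };
        using (FileStream writer = new FileStream(
            path, FileMode.Create, FileAccess.Write, FileShare.Read))
        {
            writer.Write(data, 0, data.Length); // small write: held in the stream's buffer
            long before = LengthSeenByReader(path);
            writer.Flush();                     // pushes the buffered bytes to the file
            long after = LengthSeenByReader(path);
            return new[] { before, after };
        }
    }

    static void Main()
    {
        long[] lengths = LengthsBeforeAndAfterFlush("test.txt");
        Console.WriteLine("before flush: " + lengths[0] + ", after flush: " + lengths[1]);
        File.Delete("test.txt");
    }
}
```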