In-Memory Decompression of gzip Using zlib

At work I was recently tasked with supporting gzip-based HTTP response compression in a C++ MFC application. For a while I was convinced that zlib could not decompress gzip-compressed data in memory, so I wrote the gzipped data to a temporary file and then read it back with zlib's gzopen()/gzread() family of functions (as described in the zlib manual).

I spent a lot of time reading the zlib code to see why the in-memory streaming decompression function inflate() wasn’t working for gzip-compressed data, and determined that zlib wasn’t recognizing the gzip headers. The inflate() function had code that purported to understand the headers, but it wasn’t being enabled, for reasons I couldn’t work out. After some more sleuthing, I noticed that the gzopen()/gzread() functions implemented their own gzip header detection and passed only the raw data to inflate(). At that point I threw my hands up in disgust and said it couldn’t be done.

However, it turns out I was wrong. The header file zlib.h includes extensive comments for its functions, including much information that doesn’t exist in the zlib manual. Looking carefully through the header file, I found the following comment for the function inflateInit2:

ZEXTERN int ZEXPORT inflateInit2 OF((z_streamp strm,
                                     int windowBits));

   ...

     windowBits can also be greater than 15 for
   optional gzip decoding. Add 32 to windowBits
   to enable zlib and gzip decoding with automatic
   header detection, or add 16 to decode only
   the gzip format (the zlib format will return a
   Z_DATA_ERROR).

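Passing 47 (15 + 32) for the windowBits parameter, where 15 is the maximum window size and the added 32 turns on automatic header detection, worked and allowed me to use zlib's streaming inflate() function on gzip data entirely in memory.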

For as widely used a library as zlib is, its code and API suck.
