Using the Excel Object Model and Performance

Recently I’ve had to write a bit of code which communicates with Microsoft Excel using its object model. Here are a few things I have learned from this experience.

  • Interaction with the Excel object model seems to use some kind of inter-process communication with an Excel process that is started behind the scenes. If things are not shut down properly, this Excel process will continue to run indefinitely in the background. Be sure to periodically check the Task Manager for any runaway Excel processes — these typically indicate a bug in your code or incomplete shutdown (perhaps because you chose “Stop Debugging” from the debugger).
  • If you are using the Excel object model using MFC, follow the example in Microsoft KB Article 178781: HOWTO: Automate Excel Using MFC and Worksheet Functions. Be sure to call COleDispatchDriver::ReleaseDispatch() or use the COleDispatchDriver::m_bAutoRelease member on all relevant objects or the Excel process may never stop.
  • The Excel object model documentation (which Microsoft KB Article 222101: How To Find and Use Office Object Model Documentation helps you find) is quite horrible, at least as of Office 2000. It is also written exclusively with the VB developer in mind.
  • To give control of the running, hidden Excel process with which you are interacting to the user, use the following code:
    Excel::_Application app;
    // Create and work with app...
  • Even if Excel is not visible to the user, Application::Quit() may pop up a hidden dialog asking if the user wants to save the changes that were made through the dialog box. Since the dialog is not visible, Excel will never shut down. To prevent this dialog, either set Application.DisplayAlerts to false or set Workbook.Saved to true for all modified workbooks. The former is preferred.
  • Each call using the Excel object model is very, very slow, probably as a result of the use of IPC. This means that the typical way one would think of interacting with cell values in Excel — iterating cell-by-cell within a set of nested for loops — is often too slow to be practical. Instead, I work in selections of nRows rows by nCols columns and use a two-dimensional SAFEARRAY. For example:
    COleSafeArray rawData;
    DWORD rawDataDimensions[2];
    rawDataDimensions[0] = nRows;
    rawDataDimensions[1] = nCols;
    rawData.Create(VT_VARIANT, 2, rawDataDimensions);
    // Populate the values of rawData...
    // Select a range of size nRows x nCols
    Excel::Range range = wksheet.GetRange(varUpperLeftCell,
    // Set the cells' values in one call to .SetValue()
    // instead of setting individual cell values