Thursday, January 12, 2012

The Hidden Corners of Excel Hold Many Secrets

Some projects grow and grow, especially if they are effective and well-used applications.  One such goliath is a reporting tool I've been developing for about a year now.  Lately I've been building a new charting module for it.

The program has a sophisticated report engine which allows the user to design his own reports with filters, custom fields and formulas... very powerful.  The new chart builder is an extension to this.  It's coming along, but I've been reminded how tricky Excel chart object programming can be.  Coding in the chart zone reveals some of the dark hallways in Excel's VBA implementation that seemingly never got debugged by the engineers at Microsoft.

My application is 13,000 lines of code and growing, so I've run into a fair share of head-slappers on this project.  Coding VBA can sometimes make one feel like a salmon, swimming up stream; occasional WaTerFalls (WTF?) hinder progress, and perfectly reasonable code yields beguiling results and cryptic error messages such as "User-defined type not defined", or "Error xxx-xxx" (literally, with x's! I should have screen-grabbed that one.)  But despite these obstacles, I just keep trying different ideas and eventually find an alternate way.  That's what makes Excel development so exciting.

Recently my project has started acting strangely, like a creaky building, emitting unpleasant sounds and smells.  One contributing factor could be the sheer size of the workbook.  There are so many custom dialogs, report templates, code modules and worksheets that the file is now over 12MB.  Not obesely large, by most measures, but still a bit to big to send in the mail.  It takes longer to save too, (in conflict with the programmer's mantra: "save often"), and it feels like Excel is struggling (drunk on code?)  The VBA editor has started exhibiting weirdness such as random pauses, strange screen artifacts, and the odd workbook crash. There's even a mysterious worksheet object in the project explorer now, called "Sheet3", that doesn't appear to exist anywhere in the file.  I'm not the only person to experience this particular enigma, so my trust in Excel has wavered and it's a bit scary.  

Once, back in Excel 95, I created a system that, at a certain point, hit a wall and just imploded.  I don't know exactly what happened, but I was coding away one night and some buffer overflowed and I got the blue screen of death, and the workbook thereafter was totally corrupt and un-readable.  I restored a backup file, repeated the day's work I had lost, and then Bang! The same thing happened again.  That's when I informed the client it was time to upgrade the platform to a compiled application.

That was a long time ago.  For efficiency and agility I still prefer developing in Excel.  The ease of deployment is still unmatched, and Excel is much bigger and stronger now... but then again, so am I. My apps are larger and memory-hungryer, and growing in scale; reaching out to other programs and stretching the limits of the Excel platform in new ways.  I dread that all my hard work is built on top of some kind of fault, and one day Excel is going to split apart and swallow the whole thing up.

What can I do to avoid this?  An option I don't relish is to break my program up into smaller pieces: probably not practical for this particular application, but before I go that route I need to investigate other ways of correcting the workbook bloat. A careful code review is in order, to ensure all object variables are being deallocated correctly and the logic is as memory-efficient as possible.  Also, large workbooks tend to accumulate left-over data and toxic clutter such as conditional formatting.  This wastes power and grows the file.  A real-estate audit is the next step.

As another resort, I can migrate the entire program to a brand new, innocent, lean, clean workbook file.  There are tools out there to automate this task, but major applications will require a careful effort. It involves physically copying the contents of each worksheet to a new file.  Then the code from each VBA module and the controls from each user dialog are painstakingly copy-pasted across.  Also, all the named ranges and splits need to be replicated - don't forget the code in the worksheet and workbook objects, and all those object names and properties!

The trick is to NOT copy the Excel objects - just their contents.  This is the only way to leave behind all the invisible, ghostly protoplasm that is haunting your model.  In a workbook like mine, this is a big job that will involve a lot of testing and bug mitigation.  I think, for now, I'll keep that option my back pocket, for if things get hectic.

My client is running this on Windows 7 / Excel 2010, and it appears to be more stable than my 2007 / Windows XP / Mac / Parallels development environment..  Perhaps the coders at Microsoft managed to clear out some of those dusty hallways after all. But as a programmer I also need to remain vigilant against ill-handled errors, wasted screen resources, and memory leaks in my growing code opus.

It's what makes Excel development so exciting.

No comments:

Post a Comment