Tuesday, July 12, 2011

Throw More Hardware At It

A very large data set can be taxing on Excel.  On older systems, Excel is limited to 1 GB of ram.  Office 2007 enables up to 2GB.  But opening a workbook that is hundreds of megabytes in size can be an exercise in patience.


A recent project is tracking statistics for 5,000 companies going back for several years.  Some numbers, like Earnings and Dividends are reported quarterly.  Stock Price, which ties into a host of downstream reporting and industry averaging logic, is tracked daily.


Rolling all these numbers together to figure out potential investment strategies requires juggling several million data values in memory.  The data is being housed in an Access database for now, with plans to bump up to SQL server.  But the data interchange is not the bottleneck, it is Excel's ram limit.


My client is running Windows XP and Excel 2003.  I have optimized and re-factored the object model to squeeze every last drop of performance from the platform, but we have hit the proverbial wall.  


As more and more data was captured and the record count approached 4.8 million, some of the long reporting macros began to get bogged down by Windows memory swapping - this is when the hard drive is used as the spillover for physical memory, and it is 100 times slower than normal Ram operations.  To make the application usable, we needed to either reduce the data universe (and thus, the value of the analysis), or implement a hardware and software upgrade.  We're taking the upgrade path.


The new machine is a Dell Latitude E5520, billed as a portable workstation - more pricy than a back-to-school netbook, but with the horsepower to churn through these advanced computations.  (This is, after all, the stuff computers were invented for.)  The laptop upgrade will run Windows 7 64 bit, and Office 2010 Pro, leaving nothing on the table.  Let's see what's in the package:
  • Intel Core i7-2620M, 2.70GHz - Four duo-cores, that's 8 processors!
  • 8.0GB, DDR3-1333MHz SDRAM - that's fast memory, and a lot of it.
  • 128GB Dell Mobility Solid State Drive - THIS is the money.
On the old hardware and platform, one rather large report that computes historical moving averages across industry sectors, can take up to 4 minutes to complete.  If Windows decides it's time to start swapping ram to the hard drive, macro time jumps to 15 minutes.  


Using a solid state hard drive, this won't be a concern as swaps to a ram disc are just about as fast as normal ram.  (and this kind of hard drive allows the computer to boot up in seconds, so I'm told.)  Exploiting the newest 64 bit Windows os and Office suite will allow Excel to access up to 1TB (that's terabyte) of memory.  


The Intel Core i7 is a linked array of 64 bit processors that makes this equivalent to a 90's era SuperComputer costing millions of dollars. Lets see if the performance gains pan out...

The hardware is on-order.  The client is waiting expectantly, as am I.  Next post, I will outline my steps in upgrading the Excel 03 vba code to run in Excel 2010 (if any) and benchmark the new system.  This is a $3,000 blind faith endeavour.  But as the client says, if he makes just one good decision because of the software, it's all paid for!