Wednesday, December 19, 2012

More New Beginnings

A while back I spoke disdainfully about certain job-style choices that many among us have or haven't made, and in re-reading it, my lament seems harsh.  Especially now that I find myself back in "cubeland", wardrobe and all.  That's right, I have accepted a contract to work at a major downtown corporation - after a decade in the side-streets of business and working for myself.

"Sell-out!" they exclaimed, from their sofas, in their robes, with their cats.  Perhaps...  But I must explain some important motivating factors before allowing this judgement to stand.

Firstly, when I embarked, years ago, on my solo career mission, I underestimated the importance of working with peers.  As a software artisan, one must never stop learning, as there is such an incredibly vast galaxy of things to know.  Learning from the interwebs gets you only so far.  You need live minds around you, working together, to get the best solution out of a problem.  This is why code builders often assemble into teams.

I've written and shipped a lot of code as a one-person company.  I'm proud of all the work I did and the people I helped.  (There's nothing like the smile you get when you show someone how the days of tedious data shuffling she had to do each month will now take 5 minutes.)  Yet, as the projects piled up, I started to notice a pattern in my solutions - all Excel apps, of course - and began to wonder if some problems might actually be solved better by, say, a web app? Or some technique I just didn't know existed.

Having worked on software teams before, I sometimes longed for a session where I could white-board my ideas with some other "dudes", and stride out of the room absolutely confident that the design was right.  As I crafted my clever report engines and whatnot, I saw the internet evolving and getting more powerful as a software platform.  I began to hear about things like SharePoint, jQuery and MVC.Net.  I glimpsed the world outside my home-office window beginning to slide past.

Then one day, at a luncheon with an old friend and former work associate, I mentioned my feelings of unease, and also that my billable hours were down significantly.  My friend indicated that his place of work, where he is a manager, was in need of a .Net programmer.  Even though that platform is not my forte, I used to work in it, and my broader experience and problem-solving skills would, my friend said, "more than make up for" any lack of specific syntactical knowledge, which I could learn.

This friend had tried to tempt me before, but this time I wavered.  Thinking deeply, I had to admit that the lifestyle I once described as optimal - working from home, answering to no boss, etc. - had actually started to lose its lustre.  Then I calculated the potential wage increase that regular hours would represent, and that lustre faded down to a dull, carpet-like finish.

Then, for balance, I considered the down-sides of working for "the man" again: a somewhat scary deep dive into new languages and paradigms (and I'm no toddler).  Then there's all the office crap, which I think I summed up pretty well in my March 1, 2010 post.  But it has been 10 years since I worked at a corporation - maybe things aren't so bad now.

Well, I'm two months in, and so far not suffering.  In fact, I've been able to make some really solid contributions, especially in applying my Excel skills and experience.  The first project I got into, a data import & cleanup tool, I ran with.  Using my artistic approach and applying my mantra of letting the user design it as they use it, I produced a solution far beyond anything else they had ever seen.  (Feel the power of my Golden Hammer!)

The client (a working group among thousands of groups world-wide in this massive company) is really, really pleased, and we are now scoping out the next iteration, in which I make Excel read data from PDF questionnaires.  There's a massive need, and this little Excel tool has the potential to save thousands of person-hours of data re-keying for users across the country.  Rumour is that the CEO is even taking an interest in the project!

But that's just 20% of my time.  I'm also working on a "major" project using the new (to me) MVC.Net platform.  My C# and JavaScript chops are slowly coming along, thanks to my patient and helpful associates (and PluralSight).  But I'm still able to pull my weight, since I'm well-versed in the concepts of relational data storage and retrieval, and SQL programming is a big chunk of the project logic.

Another way I am making this transition more palatable is by riding my bicycle to work.  It's getting cold now, here in Toronto, and the 8km journey is fraught with traffic hassles, but I'm sure the exercise is doing me good.  My building has a secure indoor bike room, and a change room with lockers and shower, so I can still look sharp in my new slacks and smooth shirts when I arrive at the elevator.

If you must, call me a sell-out... if by selling out you mean tripling my income, making strides to improve my knowledge and skills, and getting back to work in the engine room of the economy.  But I feel no shame in my reversal.  Now I am armed with some solid Excel skills that I can bring to a place that is sorely in need.  I will gain experience using the latest development tools and techniques, and certainly discover new and better ways to help more people.

Monday, October 8, 2012

Shape Charge

I was recently asked by a portfolio manager to create a new kind of chart, not available in Excel's standard set of chart styles:  a visual stock portfolio using proportionally sized boxes to represent the different stocks and bonds, coloured to show their respective investment categories, like this:

The initial prototype used a thousand tiny rows in a spreadsheet, with merged cells, background colours and cell borders to create the different boxes. This proved to be very difficult to render via code.  It also looked poor when printed, and was slow to scroll.

Looking for a better way (because there's always a better way), I tried drawing objects.  With VBA you can easily draw a box at the position you specify, in the colour you want, and with whatever text you need inside.  It's very fast, and as I discovered, very efficient.

For a load test, I created a workbook with 100,000 rectangles in it, and the file size was only 48 KB!  And a bonus for the performance-aware: drawing boxes using VBA code is incredibly fast.

There are several ways you can set the background colour of a shape: hexadecimal values, VB constants...  But giving a nod to some basic design principles, I recommend using the workbook's colour palette via the SchemeColor property.  Excel offers a palette of 80 colours which can be used in cell backgrounds, pie-chart slices, text, or shapes.  From a coding perspective, it's nice when your code refers to colour number 4 instead of &H2300FF45&.

You can customize these colours, and copy palettes from one workbook to another.  But there is one small trap to keep in mind when using SchemeColor in your VBA code: the 80 palette slots could show up as totally different shades when your code is run on another workbook.

You see, Excel files inherit the palette from the default template on the computer of the person who originally created it.  In my experience, many workbooks are actually copies of copies of some other model, and come with vastly different colour palettes.  Colour 17 could be a nice olive drab on the test workbook I created on my laptop, but bright pink on the client's version of the report workbook.
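
One way to defuse that trap is to copy a single master palette into every workbook your code touches.  The Workbook.Colors property makes it a one-liner - a minimal sketch, assuming a (hypothetical) master file is open:

'copy the colour palette from the master workbook into the active one,
'so each SchemeColor number resolves to the same shade in both
 ActiveWorkbook.Colors = Workbooks("PaletteMaster.xls").Colors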

To verify which SchemeColor values my Portfolio report should use, I created a handy macro to drop all 80 choices on a worksheet.  This test matrix is a quick reference when you need to know the colour assigned to each scheme number in the current workbook's palette.

Below is the function I wrote to draw a sized rectangle in position on my report, as well as the macro to generate the test matrix.  I now generate this colour matrix in any model where I will need to programmatically set the cell backgrounds, chart series colours and the like.  Tip: select a blank worksheet before generating the matrix.

Got the chops?  Your challenge: extend the DrawBox function to handle light backgrounds where the default white text is tough to read.  Passing a text colour parameter would be one way, but figuring out how to invert it automatically would be plenty cooler.  (One possible approach is sketched after the listing.)


Public Sub DrawBox(ByRef sht As Worksheet, _
                          ByVal lft As Single, _
                          ByVal tp As Single, _
                          ByVal wdth As Single, _
                          ByVal hite As Single, _
                          ByVal bg As Variant, _
                          ByVal strCaption As String)

'render a box using a floating rectangle shape object

 Dim shap As Shape

'drop the rectangle on the worksheet at the requested position and size
 Set shap = sht.Shapes.AddShape(msoShapeRectangle, lft, tp, wdth, hite)

'set fill and border attributes directly on the shape
'(no Select/Selection round-trip, which keeps the screen steady)
 With shap
   .Shadow.Visible = msoFalse
   .Fill.ForeColor.SchemeColor = CInt(bg)   'colour from the workbook palette
   .Fill.Solid
   .Fill.Transparency = 0#
   .Line.Weight = 1
   .Line.DashStyle = msoLineSolid
   .Line.Style = msoLineSingle
   .Line.Transparency = 0#
   .Line.Visible = msoTrue
 End With

'caption the box and centre the text both ways
 With shap.TextFrame
   .Characters.Text = strCaption
   .Characters.Font.Name = "Arial"
   .Characters.Font.FontStyle = "Regular"
   .Characters.Font.Size = 8
   .HorizontalAlignment = xlHAlignCenter
   .VerticalAlignment = xlVAlignCenter
 End With

'always release object memory when done
 Set shap = Nothing

End Sub

Public Sub BuildColorMatrix()

 'Generate a palette test matrix on the active sheet

  Dim idx As Integer
  Dim idy As Integer
  Dim intColor As Integer

 'build up to a 12 x 12 matrix of shapes, stopping after colour 80
  For idy = 0 To 11
     For idx = 0 To 11
        intColor = intColor + 1
        If intColor > 80 Then Exit Sub
        DrawBox ActiveSheet, idx * 80, idy * 80, 75, 75, intColor, CStr(intColor)
     Next
  Next

End Sub
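
If you want a head start on that challenge, here is a minimal sketch of the automatic approach, assuming the box was just drawn by DrawBox: read back the RGB value the palette resolved to, estimate its brightness with the classic luma weights, and flip the caption colour accordingly.

Public Sub SetContrastText(ByRef shap As Shape)

'a sketch of the auto-invert idea: in VBA's RGB longs the red byte is
'lowest, and the 299/587/114 weights are the standard luma formula

 Dim lngRGB As Long
 Dim lngLuma As Long

 lngRGB = shap.Fill.ForeColor.RGB
 lngLuma = ((lngRGB Mod 256) * 299 _
          + ((lngRGB \ 256) Mod 256) * 587 _
          + ((lngRGB \ 65536) Mod 256) * 114) / 1000

 If lngLuma > 128 Then
    shap.TextFrame.Characters.Font.Color = RGB(0, 0, 0)        'dark text on a light box
 Else
    shap.TextFrame.Characters.Font.Color = RGB(255, 255, 255)  'light text on a dark box
 End If

End Sub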


Thursday, April 12, 2012

Spreadsheets and Risk

Most Excel models start as flat spreadsheets, using formulas and functions to produce results.  The useful ones grow, and take on new responsibilities, and sometimes things get out of hand.

As an example: CompanyZ sells stuff, and needs to figure out when to buy its raw materials.  Factors such as shipping rates, lead time, and sales forecasts must be accounted for.

The purchasing guy draws up a model to calculate the buying numbers for the next three months, cleverly using variables and named ranges to allow for adjustments in the factors.  His boss likes it, and asks him to blow it out to 1 year.

Ok... not too difficult, since the formulas were set up right.  A bunch of new months are tacked on to the right end of the sheet.  Then the marketing guys ask him to break it down by region.  That's not so easy, because this is a brand new DIMENSION on the data.  Here's where a typical business model starts to become a "house of cards", as significant additions are hung onto a basic functional foundation.

As our models take on more and more of a mission-critical role, they expand into dozens of worksheets, with invisible formulas linking everything together.  Some key person understands the unique mechanics of it all, and every Excel jockey out there (you know who you are) has a personal spreadsheet-building style, usually self-taught and utterly unique.

Then that person quits or dies, and the model becomes a "black-box" for a while... still in use, still giving answers, but we're not 100% sure how.  Then the business environment changes, the model can't be understood and adapted, and it dies.  The company scrambles to replace it, and a small ripple of chaos rolls through the departments that relied on all those reports and forecasts that are suddenly unavailable.

This situation is very common.  Thousands of companies use millions of complex Excel models for critical business decision-making tasks.  Based on my 20 years of experience in the field, and on the horror stories I've read on-line, I'd wager that most of the spreadsheets out there contain errors, and some of those errors can lead to catastrophic results.

To mitigate this risk, companies need to take active steps to manage their Excel assets.  The creation of spreadsheet-based tools needs to be approached the way other business software projects are: with planning, a solid design methodology, controlled testing and documentation.

This will entail extra cost, to be sure.  The required engineering resources may not be available.  But this is a budget question, which boils down to a matter of priority.  When the value at risk is actually computed, (what is the cost of losing this customer? Losing all customers?) the numbers might make sense.

A flat spreadsheet model is fine for one-off reports, or succinctly defined micro-problems that are purely mathematical.  At a certain point, as spreadsheets expand to fulfill more advanced forecasting and reporting duties, the flat spreadsheet needs to be treated as a working prototype for an actual Application, which can be audited, understood, fixed and validated.

Excel is still a viable platform, but complex models need to be assembled methodically, by someone who understands basic software engineering.  Object Classes should be used to abstract the different business entities.  A Relational Database should be used to store the data and intelligently retrieve it.  A Report Engine should be used to produce the output.
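
To make the first of those points concrete, here is a hedged sketch (all names hypothetical) of a class module - call it clsProduct - abstracting one of CompanyZ's business entities:

'--- class module: clsProduct ---
'one business entity, with its logic in one auditable place
'instead of scattered across worksheet formulas

Private m_strSKU As String
Private m_dblLeadTimeDays As Double

Public Property Get SKU() As String
   SKU = m_strSKU
End Property
Public Property Let SKU(ByVal strValue As String)
   m_strSKU = strValue
End Property

Public Property Get LeadTimeDays() As Double
   LeadTimeDays = m_dblLeadTimeDays
End Property
Public Property Let LeadTimeDays(ByVal dblValue As Double)
   m_dblLeadTimeDays = dblValue
End Property

'the purchasing rule becomes testable code
Public Function OrderByDate(ByVal dtNeeded As Date) As Date
   OrderByDate = dtNeeded - m_dblLeadTimeDays
End Function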

Such a system, with its logic and structure openly visible in VBA code, will not break down and be cast aside, but will grow, prosper, and add far more value to the company than its original cost.  Any programmer can step in, see what's happening, and make the required fixes or additions, confident that the whole thing won't come crumbling down.


Thursday, January 12, 2012

The Hidden Corners of Excel Hold Many Secrets

Some projects grow and grow, especially if they are effective and well-used applications.  One such goliath is a reporting tool I've been developing for about a year now.  Lately I've been building a new charting module for it.

The program has a sophisticated report engine which allows the user to design his own reports with filters, custom fields and formulas... very powerful.  The new chart builder is an extension to this.  It's coming along, but I've been reminded how tricky Excel chart object programming can be.  Coding in the chart zone reveals some of the dark hallways in Excel's VBA implementation that seemingly never got debugged by the engineers at Microsoft.

My application is 13,000 lines of code and growing, so I've run into a fair share of head-slappers on this project.  Coding VBA can sometimes make one feel like a salmon swimming upstream; occasional WaTerFalls (WTF?) hinder progress, and perfectly reasonable code yields beguiling results and cryptic error messages such as "User-defined type not defined", or "Error xxx-xxx" (literally, with x's! I should have screen-grabbed that one.)  But despite these obstacles, I just keep trying different ideas and eventually find an alternate way.  That's what makes Excel development so exciting.

Recently my project has started acting strangely, like a creaky building, emitting unpleasant sounds and smells.  One contributing factor could be the sheer size of the workbook.  There are so many custom dialogs, report templates, code modules and worksheets that the file is now over 12MB.  Not obesely large, by most measures, but still a bit too big to send in the mail.  It takes longer to save too (in conflict with the programmer's mantra: "save often"), and it feels like Excel is struggling (drunk on code?)  The VBA editor has started exhibiting weirdness such as random pauses, strange screen artifacts, and the odd workbook crash.  There's even a mysterious worksheet object in the project explorer now, called "Sheet3", that doesn't appear to exist anywhere in the file.  I'm not the only person to experience this particular enigma, so my trust in Excel has wavered and it's a bit scary.

Once, back in Excel 95, I created a system that, at a certain point, hit a wall and just imploded.  I don't know exactly what happened, but I was coding away one night and some buffer overflowed and I got the blue screen of death, and the workbook thereafter was totally corrupt and un-readable.  I restored a backup file, repeated the day's work I had lost, and then Bang! The same thing happened again.  That's when I informed the client it was time to upgrade the platform to a compiled application.

That was a long time ago.  For efficiency and agility I still prefer developing in Excel.  The ease of deployment is still unmatched, and Excel is much bigger and stronger now... but then again, so am I.  My apps are larger and memory-hungrier, and growing in scale; reaching out to other programs and stretching the limits of the Excel platform in new ways.  I dread that all my hard work is built on top of some kind of fault, and one day Excel is going to split apart and swallow the whole thing up.

What can I do to avoid this?  An option I don't relish is to break my program up into smaller pieces: probably not practical for this particular application, but before I go that route I need to investigate other ways of correcting the workbook bloat.  A careful code review is in order, to ensure all object variables are being deallocated correctly and the logic is as memory-efficient as possible.  Also, large workbooks tend to accumulate left-over data and toxic clutter such as stale conditional formatting.  This wastes memory and grows the file.  A real-estate audit is the next step.
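
That audit can start with something as simple as this sketch, which lists each sheet's used range in the Immediate window so the bloated ones stand out:

'report each sheet's used range - a suspiciously huge range usually means
'left-over data or formatting that can be cleared to shrink the file
Public Sub AuditUsedRanges()

 Dim sht As Worksheet

 For Each sht In ThisWorkbook.Worksheets
    Debug.Print sht.Name, sht.UsedRange.Address, _
                sht.UsedRange.Cells.Count & " cells"
 Next sht

End Sub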

As a last resort, I can migrate the entire program to a brand new, innocent, lean, clean workbook file.  There are tools out there to automate this task, but major applications will require a careful effort.  It involves physically copying the contents of each worksheet to a new file.  Then the code from each VBA module and the controls from each user dialog are painstakingly copy-pasted across.  Also, all the named ranges and splits need to be replicated - don't forget the code in the worksheet and workbook objects, and all those object names and properties!

The trick is to NOT copy the Excel objects - just their contents.  This is the only way to leave behind all the invisible, ghostly protoplasm that is haunting your model.  In a workbook like mine, this is a big job that will involve a lot of testing and bug mitigation.  I think, for now, I'll keep that option in my back pocket, in case things get hectic.
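
For the code side of that migration, the VBE object model can at least automate the exporting.  A hedged sketch - it assumes a C:\modules folder already exists, and it requires "Trust access to the VBA project object model" to be switched on in Excel's security settings:

'export every code module to a text file, ready to re-import into a
'fresh workbook; sheet and workbook modules still need hand-copying
Public Sub ExportAllModules()

 Dim comp As Object              'VBIDE.VBComponent, late-bound
 Dim strPath As String

 strPath = "C:\modules\"         'hypothetical folder - create it first

 For Each comp In ThisWorkbook.VBProject.VBComponents
    Select Case comp.Type
       Case 1: comp.Export strPath & comp.Name & ".bas"   'standard module
       Case 2: comp.Export strPath & comp.Name & ".cls"   'class module
       Case 3: comp.Export strPath & comp.Name & ".frm"   'user form
    End Select
 Next comp

End Sub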

My client is running this on Windows 7 / Excel 2010, and it appears to be more stable than my Excel 2007 / Windows XP / Mac / Parallels development environment.  Perhaps the coders at Microsoft managed to clear out some of those dusty hallways after all.  But as a programmer I also need to remain vigilant against ill-handled errors, wasted screen resources, and memory leaks in my growing code opus.

It's what makes Excel development so exciting.

Tuesday, November 8, 2011

Maintenance-Free Solutions

As a software creator, I like to spend as much of my working time as possible inventing cool stuff and solving tricky problems using my creativity and skill.  But in order to get paid to do this as an independent consultant, one must also handle such tasks as Sales and Marketing (doing that now), Bookkeeping, Documentation, and the dreaded System Maintenance.

By maintenance, I'm not really talking about bug fixes.  Most of the time it's about changing business logic, connecting to other systems, and new user requirements.  

Some developers, and in my experience, most consulting "Firms", put a high value on the downstream Maintenance hours that come with any major software rollout.  To me, this Vendor-centric attitude sounds a bit like planned obsolescence, or selling cheap printers so you can lock 'em in to a lifetime of overpriced ink.  I believe in craftsmanship, which means having a more client-centric approach.  And like I said, I'd rather be inventing new stuff than fixing old stuff.

I learned long ago that a bit of planning up front can minimize or even eliminate the maintenance burden.  Now, whenever I design a system for a client, I make sure to include some Maintenance-mitigation strategies in my design.

1. Keep it simple - this is obvious, but basically, the less code you have, the easier it will be to fix, change, extend, etc.  Do you really need a set of routines to populate, show and validate a dialog box, or could the user update the spreadsheet directly?  Bells and whistles are fun and exciting to build, but aren't always necessary.  A terse design will go farther between oil changes.

2. Build an engine - Reports, File Imports, Calculations... if you can build a data-driven structure to handle each of these, it becomes much easier to modify and extend your application.  Here's an example: my app needs to import data from a 3rd party file.  What if the data provider changes the file layout without warning?  If I hard-coded the import logic, it breaks.  But if I build an import engine that tracks the position of each import column, along with a simple screen where this configuration info can be updated, then the user (or I) can just update the config and re-run - job done.
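
A minimal sketch of that import engine idea (sheet and field names hypothetical): the column positions live on a config sheet, so a layout change becomes a data edit instead of a code edit.

'copy source columns into the target layout, using positions read from an
'"ImportConfig" sheet: column A = field name, column B = source column number
Public Sub RunImport(ByRef shtSource As Worksheet, ByRef shtTarget As Worksheet)

 Dim shtCfg As Worksheet
 Dim lngRow As Long
 Dim intField As Integer

 Set shtCfg = ThisWorkbook.Worksheets("ImportConfig")

 lngRow = 2                      'assume row 1 of the source holds headings
 Do While shtSource.Cells(lngRow, 1).Value <> ""
    intField = 2
    Do While shtCfg.Cells(intField, 1).Value <> ""
       'if the provider moves a column, the user edits the config and re-runs
       shtTarget.Cells(lngRow, intField - 1).Value = _
          shtSource.Cells(lngRow, shtCfg.Cells(intField, 2).Value).Value
       intField = intField + 1
    Loop
    lngRow = lngRow + 1
 Loop

End Sub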

3. Let the User Design it - This is the cornerstone of my software design strategy: I never want to hard-code the look, feel, or even workflow of a program.  If the app warrants it, I set up all the views as blank canvases which the users can modify to suit their needs.  This requires some planning:

  • You need to model all of your data entities as object classes
  • All lists and reports must use a data-driven layout - nothing hard-coded on worksheets (see the sketch after this list)
  • In a multi-user environment, Login accounts let individual users configure the app to suit their personal needs
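
Here is a sketch of that data-driven layout point (names hypothetical): the report's headings and column order come from a "Layout" sheet the users edit themselves.

'write report headings in whatever order the user chose on the Layout
'sheet: column A = heading text, column B = display position
Public Sub WriteReportHeaders(ByRef shtReport As Worksheet)

 Dim shtLayout As Worksheet
 Dim lngRow As Long

 Set shtLayout = ThisWorkbook.Worksheets("Layout")

 lngRow = 2                      'layout list starts under a heading row
 Do While shtLayout.Cells(lngRow, 1).Value <> ""
    shtReport.Cells(1, shtLayout.Cells(lngRow, 2).Value).Value = _
       shtLayout.Cells(lngRow, 1).Value
    lngRow = lngRow + 1
 Loop

End Sub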

This is a lot more work up front, but the goal is to avoid those calls asking you to move a field or rename a heading - now the users can do that themselves.  As you start to build a few solutions this way, you amass a library of ideas and code that you can re-use and improve from project to project.

The biggest challenge with this mission can be selling it.  Small budget projects might not warrant this level of future-proofing, and it will take time to educate the customer/boss on the potential benefits. You don't need to mention that the biggest benefit is not being interrupted by minor change requests when you're deep in the creative zone on the next cool project.

Don't give them a fish, make them anglers.

Thursday, September 8, 2011

Hard Lessons

In the last post I described the new hardware setup I recommended for a client, to improve a slow and memory-hungry Excel app.  The hardware is in place and the client is happy, but the transition didn't go smoothly.

I discovered that the hardware upgrade in itself was not enough to cure our ills.  After coming up against some disappointing results, despite the bump to Windows 7 and Excel 2010, I was obligated to re-write a big chunk of the data handling logic; something I was hoping to avoid.

My initial research led me to believe that 64 bit Excel has access to as much ram as you can slot into your machine, but this is not my experience.  The spreadsheet part of Excel might be bigger, faster, stronger and fresher smelling, but the VBA object space is not.

Upgrading my VBA code to work in Excel 2010 was actually a non-event.  I only found one line (related, ironically, to checking the amount of ram available) that choked.  Otherwise the macros and formulas all behaved exactly as designed.  Excel 2010 saves files in the Office 2007 file format, so that was my first clue that this isn't really that much of an Excel upgrade.
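
I won't swear this was the exact line, but the classic culprit in a 32-to-64-bit move is a Windows API Declare - VBA7 requires the PtrSafe keyword.  A hedged sketch of the pattern, using the memory-status API that matches the description (the Currency fields stand in for 64-bit integers):

'a ram-checking API declaration that compiles in both old and new VBA
Private Type MEMORYSTATUSEX
   dwLength As Long
   dwMemoryLoad As Long
   ullTotalPhys As Currency             'divide by 10000 to read these
   ullAvailPhys As Currency
   ullTotalPageFile As Currency
   ullAvailPageFile As Currency
   ullTotalVirtual As Currency
   ullAvailVirtual As Currency
   ullAvailExtendedVirtual As Currency
End Type

#If VBA7 Then
   Private Declare PtrSafe Function GlobalMemoryStatusEx Lib "kernel32" _
      (ByRef lpBuffer As MEMORYSTATUSEX) As Long
#Else
   Private Declare Function GlobalMemoryStatusEx Lib "kernel32" _
      (ByRef lpBuffer As MEMORYSTATUSEX) As Long
#End If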

Then, after seeing Excel freeze, just like before, when my code tried to pre-load six million+ data points into the application's memory structures, I was beginning to think the whole ordeal and expense may have been for nothing.  Further reading on-line turned up comments from a Microsoft engineer who wrote that they achieved significant performance gains in Excel 2010's built-in functions, but basically didn't do a thing to optimize or beef up the VBA compiler... that will be in the next release, supposedly.

Perhaps the recent push towards Cloud-based applications is drawing some of the programming talent away from the Excel group at Microsoft.  I'm a developer, so I can't really hold a grudge.  Excel is still a miracle of engineering in my opinion, and the best software ever to run on any computer anywhere. (Yes, I like it.)

Anyway, I had no choice but to re-work the data logic so the application fetches what it needs at report time, instead of pre-loading on start-up.  I made this change, and the app became usable again - even improved, from the client's perspective, because it "starts up a lot faster."  Reports now require additional processing, but the fast new chip and fast ram more than make up for it.  Also, thanks to the solid state drive (and possibly Windows 7's superior design), the memory swapping issue has totally gone away.
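
The new data flow looks roughly like this sketch (path, table and field names hypothetical): instead of pre-loading millions of rows at start-up, each report asks the Access back-end for just its slice.

'fetch one ticker's price history at report time, via late-bound ADO
Public Function FetchPrices(ByVal strTicker As String, _
                            ByVal dtFrom As Date, _
                            ByVal dtTo As Date) As Variant

 Dim cn As Object
 Dim rs As Object

 Set cn = CreateObject("ADODB.Connection")
 cn.Open "Provider=Microsoft.ACE.OLEDB.12.0;" & _
         "Data Source=C:\data\stocks.accdb"

 Set rs = cn.Execute("SELECT TradeDate, ClosePrice FROM DailyPrice " & _
         "WHERE Ticker = '" & strTicker & "' AND TradeDate BETWEEN #" & _
         Format$(dtFrom, "yyyy-mm-dd") & "# AND #" & _
         Format$(dtTo, "yyyy-mm-dd") & "#")

 FetchPrices = rs.GetRows()      'assumes at least one row comes back

 rs.Close
 cn.Close

End Function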

We didn't get the 800% performance boost that Moore's Law predicted, but our app's data ceiling is now a clear blue sky, and one busy brokerage is crunching numbers with it on a daily basis.  The client is armed with an unbelievably powerful stock analysis tool, and now I know what to expect (and what expectations to instil) the next time a client wants to level up.

Tuesday, July 12, 2011

Throw More Hardware At It

A very large data set can be taxing on Excel.  On older systems, Excel is limited to 1 GB of ram.  Office 2007 enables up to 2GB.  But opening a workbook that is hundreds of megabytes in size can be an exercise in patience.


A recent project is tracking statistics for 5,000 companies going back several years.  Some numbers, like Earnings and Dividends, are reported quarterly.  Stock Price, which ties into a host of downstream reporting and industry averaging logic, is tracked daily.


Rolling all these numbers together to figure out potential investment strategies requires juggling several million data values in memory.  The data is being housed in an Access database for now, with plans to bump up to SQL Server.  But the data interchange is not the bottleneck; Excel's ram limit is.


My client is running Windows XP and Excel 2003.  I have optimized and re-factored the object model to squeeze every last drop of performance from the platform, but we have hit the proverbial wall.  


As more and more data was captured and the record count approached 4.8 million, some of the long reporting macros began to get bogged down by Windows memory swapping - this is when the hard drive is used as the spillover for physical memory, and it is 100 times slower than normal ram operations.  To make the application usable, we needed to either reduce the data universe (and thus, the value of the analysis), or implement a hardware and software upgrade.  We're taking the upgrade path.


The new machine is a Dell Latitude E5520, billed as a portable workstation - pricier than a back-to-school netbook, but with the horsepower to churn through these advanced computations.  (This is, after all, the stuff computers were invented for.)  The laptop upgrade will run Windows 7 64 bit, and Office 2010 Pro, leaving nothing on the table.  Let's see what's in the package:
  • Intel Core i7-2620M, 2.70GHz - two cores with Hyper-Threading, that's four logical processors!
  • 8.0GB, DDR3-1333MHz SDRAM - that's fast memory, and a lot of it.
  • 128GB Dell Mobility Solid State Drive - THIS is the money.
On the old hardware and platform, one rather large report that computes historical moving averages across industry sectors can take up to 4 minutes to complete.  If Windows decides it's time to start swapping ram to the hard drive, macro time jumps to 15 minutes.


Using a solid state drive, this should be far less of a concern, as swapping to flash memory is dramatically faster than swapping to a spinning disk (though still slower than real ram).  (And this kind of hard drive allows the computer to boot up in seconds, so I'm told.)  Exploiting the newest 64 bit Windows OS and Office suite will let Excel address memory far beyond the old 2GB ceiling.


The Intel Core i7 is an array of 64 bit processing cores that makes this laptop roughly equivalent to a '90s-era supercomputer costing millions of dollars.  Let's see if the performance gains pan out...

The hardware is on order.  The client is waiting expectantly, as am I.  Next post, I will outline my steps in upgrading the Excel 2003 VBA code to run in Excel 2010 (if any are needed) and benchmark the new system.  This is a $3,000 blind-faith endeavour.  But as the client says, if he makes just one good decision because of the software, it's all paid for!