Are You Ready for a Major Meltdown?

Several weeks ago, the New York Stock Exchange went down for about 4 hours. The small and medium businesses whose executives we work with are nowhere near the NYSE in size, sophistication, uptime requirements, or cost of downtime, but there are still lessons here for organizations of any size:

1.  When it comes to investment in technology infrastructure, you have to figure that money is about as close to ‘growing on trees’ as it gets for the systems that run the NYSE. So the lesson is that you can’t guarantee better systems just by spending more. The dollars you invest are only one piece of the equation. That isn’t to say that you shouldn’t invest in technology. It simply means that you need to define a strategy and a risk profile, and make sure your investments align with both. If you haven’t had a discussion about the technology-related risks to your business, and haven’t established your ‘profile’ or ‘attitude’ towards technology, then your investments are basically just a shot in the dark.

2.  You can’t prevent every issue or outage with technology. In the case of the NYSE, the outage was traced to an installed update that interrupted communications between the NYSE and its customers. What you CAN have in place is a plan for how you’re going to deal with it. As with the first lesson above, if you’ve gone through a proper risk analysis exercise, you would have identified that installing updates is a risk, and you would have decided in advance what your mitigation strategy was going to be. With that in place, you don’t need to scramble to figure out how to react as things start to blow up: you just pull your plan off the shelf, turn to the appropriate page, and begin following the process.

By the way, if we hold the NYSE to the gold standard of system uptime, “5 Nines” or 99.999%, which allows only about 5 minutes of downtime per year, they used up roughly 45 years of allowable downtime in this single incident.
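
If you want to check the “5 Nines” arithmetic, or run the same numbers for your own uptime target, here is a minimal sketch in Python. The function names are made up for illustration, the 4-hour outage is the approximate figure quoted above, and the year is assumed to average 365.25 days.

```python
# Rough downtime-budget arithmetic for an availability target.
# Assumption: an average year of 365.25 days (leap years included).
MINUTES_PER_YEAR = 365.25 * 24 * 60


def allowed_downtime_minutes_per_year(availability_percent: float) -> float:
    """Minutes of downtime per year permitted by an availability target."""
    return MINUTES_PER_YEAR * (1 - availability_percent / 100)


def years_of_budget_consumed(outage_hours: float, availability_percent: float) -> float:
    """How many years of downtime allowance a single outage uses up."""
    outage_minutes = outage_hours * 60
    return outage_minutes / allowed_downtime_minutes_per_year(availability_percent)


# "5 Nines" target and the approximate 4-hour outage from the article.
budget = allowed_downtime_minutes_per_year(99.999)
years = years_of_budget_consumed(4, 99.999)

print(f"Allowed downtime at 99.999%: {budget:.2f} minutes per year")
print(f"A 4-hour outage uses about {years:.1f} years of that allowance")
```

Under these assumptions the allowance works out to a little over 5 minutes per year, so a 4-hour outage consumes roughly 45 years of it. Swap in 99.9 or 99.99 to see how quickly the allowance grows at targets a smaller organization is more likely to set.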