A couple of weeks back Delta Air Lines cancelled thousands of flights.
At first it was attributed to a combination of a “software glitch” and a power cut. The power company then responded that there had been no disruption to services in the area.
In a statement to the press, Delta’s CEO finally shed some light on the source of the mass cancellations: an electrical component had failed and some servers weren’t connected to backup power.
The result? Tens of thousands of customers enduring a torrid few days and a reported $120 million loss.
The event highlights how easily a complex IT environment can fail, especially when hybrid estates involve a mix of owner-operated data centers, co-location and cloud technologies.
It also demonstrates how dependent on data centers we have become. Almost every industry relies on a data center to operate, and the cost of a failure is massive: an estimated $8,851 for every minute of data center downtime.
Southwest Airlines’ outage, by comparison, is estimated to have cost $82 million. The cost of an outage is more than just lost time and revenue. There are also refunds, compensation vouchers, staff overtime and damage to the brand’s image.
The mechanical and electronic components of a plane used to be all a passenger had to worry about failing. Now you might not even make it off the ground because of a crashed check-in system or an outage in the mobile app that hosts your electronic boarding pass.
Delta is investing over $150 million in improving its IT. However, without real-time visibility into where systems are, no amount of money can prevent a situation such as this one from happening again.
The reasons behind the outage are still unfolding, but one fact is clear: if you operate an airline, now is the time to assess the resilience of your data center environments.