When Salesforce.com suffered an expensive twelve-hour outage earlier this year, the CEO was forced to apologize to users over social media. This is the kind of scenario that all business leaders dread.
The outage was attributed to a power distribution failure in Salesforce’s primary data center. Although the number of affected users was not confirmed, customers across many parts of the US lost service.
A similar incident in the UK last month left users of the SSP insurance technology platform unable to renew their clients’ insurance policies after two outages in quick succession. The outages were caused by power problems that affected the storage system at SSP’s data center, and although the initial damage was repaired, further hardware complications meant service had to be restored from an entirely new data center.
Many businesses have started offering, and using, SaaS models in recent years, but few commit 100% to the cloud because of fears that an outage will leave them in a vulnerable position. Last year there was a spate of downtime incidents, ranging from a power failure at Fujitsu’s data center in Sunnyvale, California, which affected the company’s public cloud, to several outages at Google data centers that affected its Compute Engine cloud service.
To mitigate the risks, businesses are choosing to use a mix of providers and cloud services, so that if one data center goes down, only a proportion of their customer base is affected. Unfortunately, large outages can and do happen, often due to human error or extreme weather.
The key is to assess facility resilience continuously and to install monitoring solutions that deliver real-time information about the data center environment and the likelihood of problems arising. Disaster prevention is a better strategy than disaster recovery.
No facility can be entirely risk-free, but visibility into how systems are working and how well they can withstand power outages or extreme weather, together with careful monitoring of air conditioning, heating and water, will help to reduce that risk. Monitoring assets is also important, as missing or failing assets can lead to service outages and unnecessary hardware costs.
Cloud-based operations will continue to grow, and with them the number of applications considered mission-critical or business-critical. It is up to data centers to ensure they can meet that demand and provide an uninterrupted service.