Creating Cloud DR? Know What's in Your SLA

Many organizations are turning to cloud for specific services, applications, and new kinds of business economics. We’re seeing more workloads deployed into the cloud and a lot more maturity around specific kinds of cloud services.

Consider this: according to Cisco, global cloud traffic crossed the zettabyte threshold in 2014, and by 2019 cloud traffic will represent 83 percent of total data center traffic, or more than four-fifths. Significant promoters of cloud traffic growth include the rapid adoption of and migration to cloud architectures and the ability of cloud data centers to handle significantly higher traffic loads. Cloud data centers support increased virtualization, standardization, and automation. These factors lead to better performance as well as higher capacity and throughput.

One really great use case is using cloud for disaster recovery (DR), backup, and resiliency purposes. With this topic in mind, one of the most important things to develop when deploying a DR environment with a third-party host is the service-level agreement (SLA). This is where an organization can define very specific terms for hardware replacement, managed services, response time, and more. Remote, cloud-based data centers, just like localized ones, need to be monitored and managed. When working with a third-party provider, host, or colo, make sure specific boundaries are set and clearly understood as far as who is managing what.

  • Leverage provider flexibility. Hosting providers can be very flexible. They can set up a contract stating that they will only manage the hardware components of a rented rack. Everything from the hypervisor up, in that case, becomes the responsibility of the customer. Even in these cases, it’s important to know if an outage has occurred or if there are failed components. Basically, the goal is to maintain constant communication with the remote environment. Administrators must know what is happening on the underlying hardware even if they are not directly responsible for it. Any impact on physical DR resources can have major repercussions for any workload running on top of that hardware.

Similarly, there are new cloud services which can take over the entire DR and business continuity (DRBC) function and even have failover sites ready as needed. Remember, critical workloads and higher uptime requirements will need special SLA provisions and cost considerations.

  • Define business recovery requirements. When developing an SLA for a cloud or hosting data center, it’s important to clearly define the recovery time objective: that is, how long can components be down? Some organizations require 99.9 percent uptime for many of their critical components. In these situations, it’s very important to ensure proper redundancies are in place to allow for failed components. This can all be built into an SLA and monitored on the backend with good tools which have visibility into the DR environment. Let me give you a specific example. If you’re leveraging Microsoft’s Hot and Cool storage tiers, there are some uptime considerations. Microsoft highlighted that you can choose between the Hot and Cool access tiers to store object data based on its access pattern. However, the Cool tier offers 99 percent availability, while the Hot tier offers 99.9 percent.

So, you absolutely need to design around your own DR and continuity requirements. If an organization has a recovery objective of 0-4 hours, some downtime is acceptable, but not much. With this type of DR setup, an SLA will still be set up with clear responsibilities divided between the provider and the customer. Having an open line of communication and clear environmental visibility will save a lot of time and effort should an emergency occur.
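To make those percentages concrete, here is a minimal Python sketch that converts an availability figure into the downtime it permits and compares that against a recovery objective. The function names, the 365-day year, and the 30-day month are illustrative assumptions, not anything taken from a provider's contract; actual SLAs define their own measurement periods and exclusions.

```python
# Illustrative sketch: translate SLA availability percentages into downtime budgets.
# All names and period lengths below are assumptions made for this example.

HOURS_PER_YEAR = 365 * 24   # 8,760 hours, ignoring leap years
HOURS_PER_MONTH = 30 * 24   # 720 hours, assuming a 30-day billing month


def allowed_downtime_hours(availability_percent, period_hours=HOURS_PER_YEAR):
    """Return the maximum downtime, in hours, that an availability level permits."""
    if not 0 <= availability_percent <= 100:
        raise ValueError("availability_percent must be between 0 and 100")
    return period_hours * (1 - availability_percent / 100)


def single_outage_can_exceed_rto(availability_percent, rto_hours,
                                 period_hours=HOURS_PER_YEAR):
    """Return True if the whole downtime budget, spent in one outage, would exceed the RTO."""
    return allowed_downtime_hours(availability_percent, period_hours) > rto_hours


if __name__ == "__main__":
    rto_hours = 4  # upper end of the 0-4 hour recovery objective discussed above
    for tier, availability in (("Hot", 99.9), ("Cool", 99.0)):
        yearly = allowed_downtime_hours(availability)
        monthly = allowed_downtime_hours(availability, HOURS_PER_MONTH)
        risky = single_outage_can_exceed_rto(availability, rto_hours)
        print(f"{tier}: {yearly:.2f} h/year, {monthly:.2f} h/month, "
              f"one outage could exceed a {rto_hours} h RTO: {risky}")
```

Under these assumptions, 99.9 percent availability allows roughly 8.76 hours of downtime per year and 99 percent allows roughly 87.6 hours. Measured over a full year, even the 99.9 percent figure leaves room for a single outage longer than four hours, which is why an availability number alone doesn't stand in for an explicit recovery time commitment in the SLA.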

  • Plan, train, and prepare for the future. In a DR moment, everyone needs to know what they are supposed to do in order to bring their environment back up quickly. This must be clearly defined in your runbook, especially if you’re leveraging DR and business continuity services from a host or cloud provider. Most of all, when creating SLAs, make sure you plan for bursts and for what your environment will require in the near future. Restructuring SLAs and hosting contracts can be pricey, especially for critical DR systems. This means planning will be absolutely critical.

Cloud computing and the various services it provides will continue to impact organizations of all sizes. Organizations are reducing their data center footprints while still leveraging powerful services which positively impact users and the business. Using cloud for DR and business continuity is a great idea when it’s designed properly. Today, cloud services are no longer just for major organizations. Mid-market companies and SMBs are absolutely leveraging the power of the resilient cloud. Moving forward, cloud will continue to impact organizations as they transition into a more digital world. And having a good partnership (and SLA) with your cloud provider helps support both a growing business and an evolving user base.

Source: TheWHIR