Welcome!

Machine Learning Authors: Pat Romanski, Liz McMillan, Elizabeth White, Kevin Benedict, Larry Alton

Related Topics: @CloudExpo, Microservices Expo, Containers Expo Blog, Cognitive Computing , Agile Computing, Cloud Security

@CloudExpo: Blog Feed Post

Using Cloud for Disaster Recovery

Business Case - Best Practices and Lessons Learned

Use of cloud for DR solutions is becoming more common, even the organizations which are not using cloud for mission critical production applications are moving towards using cloud for application DR.

Business Case for Using Cloud for the DR

  1. Faster Recovery Time Objective (RTO): Typically DR requires lengthy manual processes to fully restore the business applications at the DR site.  Having backup data and servers at the DR site is easy, however, restoring the entire application or service takes time.  E.g. full application restoration requires starting services in specified order, performing dns and other configuration updates etc.  In Cloud, the IaaS APIs provide ability to use automation solutions like Kaavo IMOD to fully restore the business applications automatically without manual intervention.  As a result organizations get predictable recovery and reduced RTO.  Automating the service or application recovery can reduce RTO to minutes from hours or days.

  2. Shorter Recovery Point Objective (RPO): Instead of relying on offsite tape backups, organizations can reduce their RPO to minutes by maintaining near real-time data backups in the Cloud.  For faster transfer of large data dedicated lines can be established between the customer datacenters and the cloud.  The cost of the dedicated line depends on the distance of the customer datacenter from the cloud providers' peering point.  For most use cases VPN lines over internet are sufficient for transferring data between customer datacenter and the cloud.

  3. Lower Costs: Typically organizations pay high price for standby infrastructure, especially servers at the DR site.  Using cloud there is no need to pay for the servers when they are not in use at the DR site.  Pay as you use infrastructure model significantly reduces DR costs without compromising the service levels.

Following are some of the best practices and lessons learned from the Cloud DR solutions we have implemented so far:

Cloud DR is Different than Traditional DR
Unlike traditional DR solutions which relies on having a backup infrastructure for the entire datacenter requiring large and costly implementation, Cloud DR can be implemented incrementally application by application.  For example it is common for organizations to have a large shared database with multiple schemas supporting various applications.  In majority of cases this sharing is driven by server consolidation to increase the utilization of internal infrastructure.  Not all applications using a shared database have same service level requirements.  Some applications are more critical than others, so as long as schemas and application data is different, it is better to remove the dependency on shared database by having the right size database for each application in the cloud.  This allows optimal prioritization and incremental delivery of the DR project based on the service levels of the individual applications.

Migration of Applications Using Single Sign-on with LDAP
When planning DR for individual applications it is important to identify the dependent services and making sure that the dependent services would be available as a part of the DR solution.  Enterprise customers typically use Single Sign-on with LDAP for managing authentication.  So best practice is to treat the Single Sign-on Service as the critical application and implement the DR solution for bringing up the Single Sign-on Service first during the DR process.  An automation solution like Kaavo IMOD enables customers to restore applications and services in the specified order automatically during DR without any manual intervention. During a real DR scenario there are many things going and it is easy to make mistakes under pressure if the application restoration process is not fully automated.  To prevent surprises during actual DR, it is important to have a fully automated solution for restoring applications and services.

Restoring Back to Normal Operations after DR
This is one area which is often overlooked or under planned in DR projects.  For companies using their own datacenters for production applications and using cloud for DR, processes and automation must be implemented to fully restore the applications in the customer production datacenter using the latest data from the cloud DR once the primary datacenter is back online.  This step is not required for applications which are using cloud as their primary site.  E.g. if an application is running in one cloud zone and after DR it is running in a different cloud zone there is no need to restore it back to the first cloud zone as long as service levels for both cloud zones are same.  If you are deploying new applications it best to design for failure.  E.g. a distributed application running across various regions and cloud providers eliminate the need for traditional DR planning for the application as handling of failure of individual components is built in the design and deployment model of the application.

Handling Compliance in Cloud, e.g., HIPAA, PCI, SOX, SAS-70 etc.
Using available security technologies and processes several companies have implemented applications in the cloud compliant to various compliance standards, e.g. HIPAA, PCI, SOX, SAS-70 etc.  Each compliance standard has its own nuances; basically with proper planning you can address all compliance related issues.  This is a big topic on its own so please contact us if you have specific questions about this.  Cloud providers have published various case studies and best practices, e.g. white paper by Amazon on HIPAA compliance.

Handling Public and Private DNS
A common use case for enterprise applications is to have a public DNS for public access and a private DNS over internal network for accessing the backend services and databases etc.  In these situations it is best to use virtual private cloud like AWS VPC or to overlay a private network with the same IP address range as internal datacenter on any public cloud using Open Source solutions (refer to this blog - Building a Private Cloud within a Public Cloud for details on how to implement a secure private network on any public cloud).  For updating the public DNS entries for the restored application in the cloud we use DNS automation services like AWS Route 53 or EasyDNS.  Leveraging these services, Kaavo IMOD automatically updates the Public DNS for the applications as a part of the restoration during DR.

Keeping Application Database Up-To-Date
It is common for applications to have large databases.  Moving the data to the cloud and keeping it current requires first loading the entire database in cloud and then sending and merging incremental data to the database in the cloud.  To address this use case instead of maintaining a hot backup we use Kaavo IMOD to automatically bring up the database servers in cloud whenever the new incremental backup is available and merge the incremental backup then save the merged database and shutdown the servers in the cloud.  This way in case of DR we always have the latest merged database available for restoring the application. This approach provides reasonable RTO without incurring the additional costs of maintaining a hot database backup.

Applying and Maintaining Patches
A typical application requires following two types of updates during its lifecycle:

  1. Updating Application Code: This is quite easy as using Kaavo IMOD we setup automation to pick up the latest code and configuration for the application from the production deployment.  This automation ensures that the application code and configuration changes for the new release of the application or service are available in the cloud for the DR.

  2. OS Patches and Third-Party Software Updates: Sometimes custom patches or updates to third party software or OS are required.  For these types of changes it is best to include them as a part of change control process requiring sign-off from the team owning the DR process.  The DR team can review the change and if required make and test the needed changes to DR automation for the application.

Read the original blog entry...

More Stories By Jamal Mazhar

Jamal Mazhar is Founder & CEO of Kaavo. He possesses more than 15 years of experience in technology, engineering and consulting with a range of Fortune 500 companies including GE and ING. He established ING’s “Center of Excellence for B2B” which streamlined $2 billion per month in electronic money transfer operations. As Lead Architect at GE Capital e-Business team, Jamal directed analysis and implementation efforts and improved the performance of the website generating more than $1 billion in annual lease revenues. At Trilogy he provided technical and managerial expertise for several large scale e-business implementation projects for companies such as Boeing, NCR, Gartner, British Airways, Quantas Airways and Alltel. Jamal has BS in Electrical and Computer Engineering from the University of Texas at Austin and MBA from NYU Stern School of Business.

@CloudExpo Stories
"With Digital Experience Monitoring what used to be a simple visit to a web page has exploded into app on phones, data from social media feeds, competitive benchmarking - these are all components that are only available because of some type of digital asset," explained Leo Vasiliou, Director of Web Performance Engineering at Catchpoint Systems, in this SYS-CON.tv interview at DevOps Summit at 20th Cloud Expo, held June 6-8, 2017, at the Javits Center in New York City, NY.
FinTechs use the cloud to operate at the speed and scale of digital financial activity, but are often hindered by the complexity of managing security and compliance in the cloud. In his session at 20th Cloud Expo, Sesh Murthy, co-founder and CTO of Cloud Raxak, showed how proactive and automated cloud security enables FinTechs to leverage the cloud to achieve their business goals. Through business-driven cloud security, FinTechs can speed time-to-market, diminish risk and costs, maintain continu...
"DivvyCloud as a company set out to help customers automate solutions to the most common cloud problems," noted Jeremy Snyder, VP of Business Development at DivvyCloud, in this SYS-CON.tv interview at 20th Cloud Expo, held June 6-8, 2017, at the Javits Center in New York City, NY.
What sort of WebRTC based applications can we expect to see over the next year and beyond? One way to predict development trends is to see what sorts of applications startups are building. In his session at @ThingsExpo, Arin Sime, founder of WebRTC.ventures, discussed the current and likely future trends in WebRTC application development based on real requests for custom applications from real customers, as well as other public sources of information.
As businesses adopt functionalities in cloud computing, it’s imperative that IT operations consistently ensure cloud systems work correctly – all of the time, and to their best capabilities. In his session at @BigDataExpo, Bernd Harzog, CEO and founder of OpsDataStore, presented an industry answer to the common question, “Are you running IT operations as efficiently and as cost effectively as you need to?” He then expounded on the industry issues he frequently came up against as an analyst, and ...
Kubernetes is an open source system for automating deployment, scaling, and management of containerized applications. Kubernetes was originally built by Google, leveraging years of experience with managing container workloads, and is now a Cloud Native Compute Foundation (CNCF) project. Kubernetes has been widely adopted by the community, supported on all major public and private cloud providers, and is gaining rapid adoption in enterprises. However, Kubernetes may seem intimidating and complex ...
While the focus and objectives of IoT initiatives are many and diverse, they all share a few common attributes, and one of those is the network. Commonly, that network includes the Internet, over which there isn't any real control for performance and availability. Or is there? The current state of the art for Big Data analytics, as applied to network telemetry, offers new opportunities for improving and assuring operational integrity. In his session at @ThingsExpo, Jim Frey, Vice President of S...
"DX encompasses the continuing technology revolution, and is addressing society's most important issues throughout the entire $78 trillion 21st-century global economy," said Roger Strukhoff, Conference Chair. "DX World Expo has organized these issues along 10 tracks with more than 150 of the world's top speakers coming to Istanbul to help change the world."
"We focus on SAP workloads because they are among the most powerful but somewhat challenging workloads out there to take into public cloud," explained Swen Conrad, CEO of Ocean9, Inc., in this SYS-CON.tv interview at 20th Cloud Expo, held June 6-8, 2017, at the Javits Center in New York City, NY.
"As we've gone out into the public cloud we've seen that over time we may have lost a few things - we've lost control, we've given up cost to a certain extent, and then security, flexibility," explained Steve Conner, VP of Sales at Cloudistics,in this SYS-CON.tv interview at 20th Cloud Expo, held June 6-8, 2017, at the Javits Center in New York City, NY.
"We are focused on SAP running in the clouds, to make this super easy because we believe in the tremendous value of those powerful worlds - SAP and the cloud," explained Frank Stienhans, CTO of Ocean9, Inc., in this SYS-CON.tv interview at 20th Cloud Expo, held June 6-8, 2017, at the Javits Center in New York City, NY.
Your homes and cars can be automated and self-serviced. Why can't your storage? From simply asking questions to analyze and troubleshoot your infrastructure, to provisioning storage with snapshots, recovery and replication, your wildest sci-fi dream has come true. In his session at @DevOpsSummit at 20th Cloud Expo, Dan Florea, Director of Product Management at Tintri, provided a ChatOps demo where you can talk to your storage and manage it from anywhere, through Slack and similar services with...
"I think DevOps is now a rambunctious teenager – it’s starting to get a mind of its own, wanting to get its own things but it still needs some adult supervision," explained Thomas Hooker, VP of marketing at CollabNet, in this SYS-CON.tv interview at DevOps Summit at 20th Cloud Expo, held June 6-8, 2017, at the Javits Center in New York City, NY.
"We are still a relatively small software house and we are focusing on certain industries like FinTech, med tech, energy and utilities. We help our customers with their digital transformation," noted Piotr Stawinski, Founder and CEO of EARP Integration, in this SYS-CON.tv interview at 20th Cloud Expo, held June 6-8, 2017, at the Javits Center in New York City, NY.
"Peak 10 is a hybrid infrastructure provider across the nation. We are in the thick of things when it comes to hybrid IT," explained , Chief Technology Officer at Peak 10, in this SYS-CON.tv interview at 20th Cloud Expo, held June 6-8, 2017, at the Javits Center in New York City, NY.
"We've been engaging with a lot of customers including Panasonic, we've been involved with Cisco and now we're working with the U.S. government - the Department of Homeland Security," explained Peter Jung, Chief Product Officer at Pulzze Systems, in this SYS-CON.tv interview at @ThingsExpo, held June 6-8, 2017, at the Javits Center in New York City, NY.
"We are an IT services solution provider and we sell software to support those solutions. Our focus and key areas are around security, enterprise monitoring, and continuous delivery optimization," noted John Balsavage, President of A&I Solutions, in this SYS-CON.tv interview at 20th Cloud Expo, held June 6-8, 2017, at the Javits Center in New York City, NY.
"We're here to tell the world about our cloud-scale infrastructure that we have at Juniper combined with the world-class security that we put into the cloud," explained Lisa Guess, VP of Systems Engineering at Juniper Networks, in this SYS-CON.tv interview at 20th Cloud Expo, held June 6-8, 2017, at the Javits Center in New York City, NY.
"I will be talking about ChatOps and ChatOps as a way to solve some problems in the DevOps space," explained Himanshu Chhetri, CTO of Addteq, in this SYS-CON.tv interview at @DevOpsSummit at 20th Cloud Expo, held June 6-8, 2017, at the Javits Center in New York City, NY.
The financial services market is one of the most data-driven industries in the world, yet it’s bogged down by legacy CPU technologies that simply can’t keep up with the task of querying and visualizing billions of records. In his session at 20th Cloud Expo, Karthik Lalithraj, a Principal Solutions Architect at Kinetica, discussed how the advent of advanced in-database analytics on the GPU makes it possible to run sophisticated data science workloads on the same database that is housing the rich...