Sitecore Experience Manager

Introducing HADR

Abstract

Learn about the different Sitecore high availability disaster recovery (HADR) offerings.

The Sitecore high availability disaster recovery (HADR) service covers:

  • High availability (HA) – A characteristic of a system that aims to ensure an agreed level of operational performance, usually uptime, for a longer than normal period. In an HA setup, the system can fail over by itself.

  • Disaster recovery (DR) – An area of business continuity planning that aims to protect an organization from the effects of significant negative events. Disaster recovery allows an organization to maintain or quickly resume mission-critical functions following a disaster that requires manual intervention.

Sitecore Managed Cloud uses the Sitecore Disaster Recovery service to offer three options to maintain HADR: basic recovery, hot-warm recovery, and hot-hot recovery.

Use the following table as a reference to decide which recovery option works best for your requirements. Consider:

  • How quickly your site would need to be back online in the event of an outage.

  • The recovery point objective (RPO).

  • The recovery time objective (RTO).

| Specifications | Basic recovery | Hot-warm recovery | Hot-hot recovery |
| --- | --- | --- | --- |
| Backup technology | SQL Azure Geo-Restore; Azure APIs | SQL Azure Geo-Replication; Azure APIs | SQL Azure Geo-Replication; Azure APIs |
| Secondary environment | Created on-demand. | Fully deployed, but shut down. | Created when set up. |
| Recovery process | Deploy > Restore > Customer Validation > Go live. | Wake up > Customer Validation > Go live. | Discover an error > Use custom logic to fail over. |
| Relative cost | The least expensive. | More expensive. | The most expensive. |
| RPO | * | * | * |
| RTO | * | * | * |

* This varies depending on the following factors: the size of your solution, network latency, volume of data, and installation time. For accurate figures, RPO and RTO must be tested on a case-by-case basis.
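Because RPO and RTO have to be measured for each solution, a failover test can record three timestamps and derive both values from them. The following is a minimal sketch under stated assumptions, not a Sitecore tool: the function name `measure_drill`, its parameters, and the timestamps are hypothetical and only illustrate the arithmetic.

```python
from datetime import datetime, timedelta
from typing import Tuple


def measure_drill(
    last_replicated: datetime,
    outage_start: datetime,
    back_online: datetime,
) -> Tuple[timedelta, timedelta]:
    """Return (data_loss_window, downtime) observed in one failover test.

    data_loss_window is compared against the RPO: how much data, measured
    in time, was not yet copied to the secondary data center.
    downtime is compared against the RTO: how long the site was offline.
    """
    data_loss_window = outage_start - last_replicated
    downtime = back_online - outage_start
    return data_loss_window, downtime


# Illustrative timestamps only; these are not Sitecore figures.
loss, downtime = measure_drill(
    last_replicated=datetime(2024, 1, 1, 11, 55),
    outage_start=datetime(2024, 1, 1, 12, 0),
    back_online=datetime(2024, 1, 1, 13, 30),
)
print(f"Observed data loss window: {loss}")  # 0:05:00
print(f"Observed downtime: {downtime}")      # 1:30:00
```

Repeating the test for each recovery option gives comparable figures for the RPO and RTO rows of the table.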

The way to determine if there is a need for failover is the same, regardless of which HADR option you choose. Sitecore Managed Cloud continually checks the health of your primary data center Content Delivery role by pinging it from five different data centers around the globe. If three of the five data centers report an issue, then an alert is sent to the Managed Cloud Operations team.
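The three-of-five rule amounts to a simple quorum check. The following is a minimal sketch of that rule, not Sitecore's monitoring code: the function name `should_alert`, the `threshold` parameter, and the probe location names are hypothetical.

```python
from typing import Mapping

# From the description above: an alert is sent when three of the
# five probing data centers report an issue.
ALERT_THRESHOLD = 3


def should_alert(
    probe_results: Mapping[str, bool],
    threshold: int = ALERT_THRESHOLD,
) -> bool:
    """Return True when enough probe locations report an issue.

    probe_results maps a probe location to True (healthy) or False (issue).
    """
    failures = sum(1 for healthy in probe_results.values() if not healthy)
    return failures >= threshold


# Hypothetical probe locations; the real ones are not listed in this topic.
results = {
    "probe-1": True,
    "probe-2": False,
    "probe-3": False,
    "probe-4": True,
    "probe-5": False,
}

if should_alert(results):
    print("Alert the Managed Cloud Operations team")
```

Requiring agreement from several locations means a single probe with a local network problem does not raise an alert on its own.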

The team will investigate the Sitecore environment in the primary data center to confirm that the alert reflects a real issue and not a false positive. They will then carry out the following validation checks on the primary data center:

  • Check for alerts raised by the Azure resources used by the Sitecore site.

  • Check to see if the traffic manager is reporting a degraded endpoint.

  • Check the Azure status site for known database issues.

If the Operations team deems that there is an unrecoverable issue in part, or all, of the underlying infrastructure of the primary data center, they will contact you and begin the failover process.

A typical Sitecore environment comprises five Azure resource types: App Services, Azure SQL, Application Insights, Azure Search, and Redis Cache. Sitecore ensures that the sizes and instance counts of all the resources are replicated to a secondary data center. However, only App Services and Azure SQL have their files/data backed up and restored. The other services do not have their data replicated because it is either transient or not required for successful restoration. Specifically:

  • App Services – All files/data are backed up and restored.

  • Azure SQL – All files/data are backed up and restored.

  • Azure Search – Data is not replicated because Azure Search does not provide a way to back up or geo-replicate indexes. Sitecore indexes are rebuilt in the secondary data center instead.

  • Redis Cache – Data is not replicated because Redis Cache contains user session data that typically expires before a Sitecore site can be restored. It is therefore not included in the disaster recovery strategy.

  • Application Insights – Data is not replicated because Application Insights only contains health monitoring data, which the Sitecore site does not need at runtime.
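The per-resource treatment listed above can be captured as a small lookup structure, for example in a runbook or checklist script. This is an illustrative sketch only: the `ResourcePlan` class, the `RECOVERY_PLAN` list, and the `resources_to_restore` function are hypothetical names, and the notes paraphrase the list above.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class ResourcePlan:
    name: str
    data_restored: bool  # True if files/data are backed up and restored
    note: str


# Sizes and instance counts are replicated for every resource type;
# this list only records what happens to each resource's data.
RECOVERY_PLAN = [
    ResourcePlan("App Services", True, "files/data backed up and restored"),
    ResourcePlan("Azure SQL", True, "files/data backed up and restored"),
    ResourcePlan("Azure Search", False, "indexes rebuilt in the secondary data center"),
    ResourcePlan("Redis Cache", False, "session data is transient"),
    ResourcePlan("Application Insights", False, "monitoring data not needed at runtime"),
]


def resources_to_restore(plan: List[ResourcePlan]) -> List[str]:
    """Return the names of the resources whose data is restored."""
    return [resource.name for resource in plan if resource.data_restored]


print(resources_to_restore(RECOVERY_PLAN))  # ['App Services', 'Azure SQL']
```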