Disaster Recovery 2.0 roles and responsibilities
This topic describes the roles and responsibilities for each stage of the disaster recovery process for PaaS 2.0. Roles and responsibilities are described using the RACI model, where R = Responsible, A = Accountable, C = Consulted, and I = Informed.
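As a reading aid, the following minimal sketch expands the role codes used in the tables into their full names. The `Raci` enum and `parse_roles` helper are hypothetical names introduced for illustration only, and the expansion uses the standard RACI wording.

```python
from enum import Enum


class Raci(Enum):
    """Standard RACI codes and their full names."""
    R = "Responsible"
    A = "Accountable"
    C = "Consulted"
    I = "Informed"


def parse_roles(cell: str) -> list[str]:
    """Expand a table cell such as "R/A" into full role names."""
    return [Raci[code.strip()].value for code in cell.split("/")]


# One row from the business continuity plan table, keyed by party.
row = {"Sitecore": "I/C", "Customer/Partner": "R/A"}
expanded = {party: parse_roles(cell) for party, cell in row.items()}
print(expanded)
```

A cell of `R/A` therefore means that the party both performs the task and answers for its outcome, while `I/C` means the party is kept informed and consulted.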
Customer-owned Business continuity plan
The Business continuity plan is the responsibility of the customer or partner, with Sitecore providing expertise.
| Task ID | Task definition/description | Sitecore roles | Customer/Partner roles |
|---|---|---|---|
| 1 | Selection of Sitecore DR services | I/C | R/A |
| 2 | Creation of an end-to-end business continuity plan | I/C | R/A |
| 3 | End-to-end testing of business continuity plan | I/C | R/A |
| 4 | Execution of business continuity plan | I/C | R/A |
Disaster recovery architecture and testing process
Sitecore creates, maintains, supports, and tests the DR 2.0 pattern as part of the Disaster Recovery 2.0 service.
| Task ID | Task definition/description | Sitecore roles | Customer/Partner roles |
|---|---|---|---|
| 1 | Sitecore DR 2.0 pattern creation | R/A | I/C |
| 2 | Sitecore DR 2.0 pattern maintenance | R/A | I/C |
| 3 | Sitecore DR 2.0 pattern support | R/A | I/C |
| 4 | Sitecore DR 2.0 initial pattern testing and validation | R/A | I/C |
| 5 | Sitecore DR 2.0 initial pattern testing and validation with Sitecore XM v10.3.1 or later, and XP v10.3.1 or later | R/A | I/C |
Disaster Recovery end-to-end process
The activities described in this section are intended to guide the Disaster Recovery implementation and failover scenarios.
DR Deployment
Sitecore Managed Cloud performs the following steps as part of the DR 2.0 offering.
| Task ID | Task definition/description | Sitecore roles | Customer/Partner roles |
|---|---|---|---|
| 1 | Deploy a new Sitecore environment in the secondary region and scale down the Web Apps and Redis | R/A | I/C |
| 2 | Sync the data between the primary and secondary Azure SQL using Failover Groups | R/A | I/C |
| 3 | Sync the sizes/tiers of all Azure resources | R/A | I/C |
| 4 | Sync the file contents of all Web Apps | R/A | I/C |
| 5 | Set up Front Door to switch between primary CD and outage page | R/A | I/C |
| 6 | Set up email alerts to notify the Sitecore Managed Cloud operations team when the availability tests fail | R/A | I/C |
DR invocation
The DR failover initiation process starts when you notify Sitecore of your intent to perform a complete DR failover to the secondary Azure region.
| Task ID | Task definition/description | Sitecore roles | Customer/Partner roles |
|---|---|---|---|
| 1 | Switch Front Door traffic to the maintenance page | R/A | C/I |
| 2 | Disable environment synchronization | R/A | C/I |
| 3 | Perform SQL failover | R/A | C/I |
| 4 | Restore App Service backups | R/A | C/I |
| 5 | Scale up Disaster Recovery environment | R/A | C/I |
| 6 | Verify Disaster Recovery health | R/A | C/I |
| 7 | Switch Front Door traffic to the Disaster Recovery site | R/A | C/I |
| 8 | Perform post-DR recovery updates to the Sitecore application | C/I | R/A |
| 9 | Perform post-DR recovery updates to non-standard Azure services or components | C/I | R/A |
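The Sitecore-owned invocation steps (tasks 1 to 7) can be modeled as an ordered runbook. This is a minimal sketch, not Sitecore tooling. It assumes the steps run strictly in the listed order and that a failed step halts the sequence, which the table implies but does not state. `run_runbook` and `fake_action` are hypothetical names, and the actions are placeholders that make no Azure calls.

```python
from typing import Callable

# Step names follow the DR invocation table (tasks 1 to 7).
INVOCATION_STEPS = [
    "Switch Front Door traffic to the maintenance page",
    "Disable environment synchronization",
    "Perform SQL failover",
    "Restore App Service backups",
    "Scale up Disaster Recovery environment",
    "Verify Disaster Recovery health",
    "Switch Front Door traffic to the Disaster Recovery site",
]


def run_runbook(steps: list[str], action: Callable[[str], bool]) -> list[str]:
    """Run steps in order and stop at the first one whose action returns False.

    Returns the names of the steps that completed successfully.
    """
    completed = []
    for step in steps:
        if not action(step):
            break  # Assumption: a failure halts the remaining steps.
        completed.append(step)
    return completed


def fake_action(step: str) -> bool:
    # Hypothetical stand-in: pretend the health check fails.
    return step != "Verify Disaster Recovery health"


done = run_runbook(INVOCATION_STEPS, fake_action)
print(len(done), "of", len(INVOCATION_STEPS), "steps completed")
```

Under this assumption, traffic stays on the maintenance page until the DR environment passes its health check, because the Front Door switch to the DR site is the last step in the sequence.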
DR failback
The failback steps for DR 2.0 are similar to the DR invocation steps described previously. However, the failback process assumes that no changes were made to the App Service files or configuration during the failover state, that is, there were no code updates or deployments. Therefore, there are no steps to disable environment synchronization, restore App Service backups, or scale up the DR environment.
| Task ID | Task definition/description | Sitecore roles | Customer/Partner roles |
|---|---|---|---|
| 1 | Switch Front Door traffic to the maintenance page | R/A | C/I |
| 2 | Perform SQL failover | R/A | C/I |
| 3 | Verify Primary health | R/A | C/I |
| 4 | Switch Front Door traffic to the Production page | R/A | C/I |
| 5 | Perform post-DR recovery updates to the Sitecore application | C/I | R/A |
| 6 | Perform post-DR recovery updates to non-standard Azure services or components | C/I | R/A |
If any changes have been applied to the App services in the disaster recovery region (while failed over), these changes must be reapplied to the App services when the failback process has been completed.
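The relationship between the invocation and failback step lists can be made concrete with a short sketch. It derives the Sitecore-owned failback steps from the invocation steps by dropping the three steps that the failback description says are not needed and by retargeting the health check and the traffic switch to the primary environment. The names `SKIPPED_ON_FAILBACK` and `RETARGET` are hypothetical, introduced only for this illustration.

```python
# Sitecore-owned steps from the DR invocation table (tasks 1 to 7).
INVOCATION = [
    "Switch Front Door traffic to the maintenance page",
    "Disable environment synchronization",
    "Perform SQL failover",
    "Restore App Service backups",
    "Scale up Disaster Recovery environment",
    "Verify Disaster Recovery health",
    "Switch Front Door traffic to the Disaster Recovery site",
]

# Steps omitted on failback, because no App Service changes are assumed
# to have happened while failed over.
SKIPPED_ON_FAILBACK = {
    "Disable environment synchronization",
    "Restore App Service backups",
    "Scale up Disaster Recovery environment",
}

# On failback, the target is the primary environment instead of the DR site.
RETARGET = {
    "Verify Disaster Recovery health": "Verify Primary health",
    "Switch Front Door traffic to the Disaster Recovery site":
        "Switch Front Door traffic to the Production page",
}

failback = [
    RETARGET.get(step, step)
    for step in INVOCATION
    if step not in SKIPPED_ON_FAILBACK
]

for number, step in enumerate(failback, start=1):
    print(number, step)
```

The result matches tasks 1 to 4 of the failback table. If the assumption of no changes does not hold, the changes made in the disaster recovery region still have to be reapplied manually after failback, as noted above.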