Platform Administration and Architecture

Split or merge xDB Collection database shards

Abstract

How to split or merge the xDB Collection database shards.

In Sitecore 9.3 and later, you can use Microsoft's split/merge service to re-distribute data between shards.

Limitations

In Sitecore 9.3, the split/merge service has the following limitations:

  • It can only be used with SQL Azure.

  • It cannot be used in combination with Always Encrypted. You must decrypt your data before executing the split or merge operations.

  • The ShardKey field in the xDB Collection database must have a value. You must use the xDB Collection update tool to update data collected before Sitecore 9.3.

Execute the split operation

The split operation moves records with a specified shard key range from shard A to shard B. The split operation:

  • Can only split one shard at a time.

  • Can only split one shard map at a time. There are 3 shard maps:

    • DeviceProfileIdShardMap

    • ContactIdShardMap

    • ContactIdentifiersIndexShardMap

    Note

    You do not need to split all shard maps. You can choose to split ContactIdShardMap only.

To split three shards, including all shard maps, you must perform 12 operations: add 3 shards and split each shard 3 times - once for each shard map.

To execute the split operation:

  1. Deploy the Azure split/merge service. The split/merge service includes a web app and a database (unrelated to Sitecore roles).

  2. Disable xConnect to avoid data loss or corruption. To disable xConnect:

    • Put Content Delivery and Content Management roles into CMS-Only mode.

    • Stop all web apps that write to xConnect.

    • Stop the xConnect Collection and xConnect Collection Search service web apps.

  3. Access your xConnect web application's Kudu service URL. For example: https://mysite.scm.azurewebsites.net/

  4. Browse to the folder that contains the SQL Sharding Deployment Tool: <xconnect-root>\App_Data\collectiondeployment

  5. Use the SQL Sharding Deployment Tool to add an empty shard to the shard cluster. For example:

    .\Sitecore.Xdb.Collection.Database.SqlShardingDeploymentTool.exe /operation addShard /connectionstring "user id=sa;password=12345;data source=(server);" /shardMapManagerDatabaseName "Sitecore.Xdb.Collection.Database.Sql.ShardMapManagerDb" /dbedition "Basic" /shardconnectionstring "user id=sa;password=12345;data source=.\SQL2017;" /shardnameprefix "Sitecore.Xdb.Collection.Database.Sql" /shardnamesuffix "_" /dacpac ".\Sitecore.Xdb.Collection.Database.Sql.dacpac"
  6. Optionally, create indexes on the [ShardKey] column of the following tables of each shard to improve the performance of the split/merge operation:

    • Contacts

    • ContactIdentifiers

    • ContactFacets

    • Interactions

    • InteractionFacets

    • DeviceProfiles

    • DeviceProfileFacets

    • ContactIdentifierIndex

  7. Choose a shard to split and where split it (for example, you can choose to do a 50/50 split or a 75/25 split in terms of shard key range). To do this, use the SQL Sharding Deployment tool printMapping operation to get the shard key range of all shard maps:

    .\Sitecore.Xdb.Collection.Database.SqlShardingDeploymentTool.exe /operation printMapping /connectionstring "user id=sa;password=12345;data source=(server);" /shardMapManagerDatabaseName "Sitecore.Xdb.Collection.Database.Sql.ShardMapManagerDb"	

    The printMapping command generates a list of shard maps and their key ranges:

    Current Shard Map ContactIdShardMap state:
        Demo_Xdb.Collection.Shard0 contains key range [0x:0x80)
        Demo_Xdb.Collection.Shard1 contains key range [0x80:+inf)
    Current Shard Map DeviceProfileIdShardMap state:
        Demo_Xdb.Collection.Shard0 contains key range [0x:0x80)
        Demo_Xdb.Collection.Shard1 contains key range [0x80:+inf)
    Current Shard Map ContactIdentifiersIndexShardMap state:
        Demo_Xdb.Collection.Shard0 contains key range [0x:0x80)
        Demo_Xdb.Collection.Shard1 contains key range [0x80:+inf)                            

    Note

    +inf means 0xFF (highest key).

    In this example, the shard key range of the ContactIdShardMap shard map in the Demo_Xdb.Collection.Shard0 shard is 0x:0x80. To split the ContactIdShardMapshard 50/50, split the shard at the 0x40 key. To split the shard 25/75,split the shard at the 0x20 or 0x60 range.

    Note

    A 50/50 split means a 50/50 split in terms of shard key range. The distribution of data across shards might not be entirely uniform.

  8. Browse to the split/merge service web page and execute the split operation. The following screenshot shows the UI for the Split operation:

    The web UI of the split/merge service.

    When filling in the form:

    • The Split at Key is the key that you chose in step 7. For example, if the key range is 0x:0x80, specify 0x20 to split the shard 25/75.

      Note

      You choose a shard by key range, not name. The tool works out which shard to target based on key value you choose to split at.

    • The Split Behavior option determines which key range moves to the new shard. For example, if you split at 0x20 and choose Move range from low key (incl.) to split key (excl.), the range of the new shard will be 0x:0x19.

  9. Click Submit to run the tool.

  10. When the split is complete, drop indexes for [ShardKey] columns if you created them in step 7.

Execute the merge operation

To execute the merge operation:

  1. Deploy the Azure split/merge service. The split/merge service includes a web app and a database (unrelated to Sitecore roles).

  2. Disable xConnect to avoid data loss or corruption. To disable xConnect:

    • Put Content Delivery and Content Management roles into CMS-Only mode.

    • Stop all web apps that write to xConnect.

    • Stop the Connect Collection and xConnect Collection Search service web apps.

  3. Access your xConnect web application's Kudu service URL. For example: https://mysite.scm.azurewebsites.net/

  4. Browse to the folder that contains the SQL Sharding Deployment Tool: <xconnect-root>\App_Data\collectiondeployment

  5. Optionally, create indexes on the [ShardKey] column of the following tables of each shard to improve the performance of the merge operation:

    • Contacts

    • ContactIdentifiers

    • ContactFacets

    • Interactions

    • InteractionFacets

    • DeviceProfiles

    • DeviceProfileFacets

    • ContactIdentifierIndex

  6. Choose shards to merge. To do this, use the SQL Sharding Deployment tool printMapping operation to get the shard key range of all shard maps. For example:

    .\Sitecore.Xdb.Collection.Database.SqlShardingDeploymentTool.exe /operation printMapping /connectionstring "user id=sa;password=12345;data source=(server);" /shardMapManagerDatabaseName "Sitecore.Xdb.Collection.Database.Sql.ShardMapManagerDb"	

    The following example shows the output of the printMapping operation.

            Current Shard Map ContactIdShardMap state:
                    Demo_Xdb.Collection.Shard0 contains key range [0x:0x40)
                    Demo_Xdb.Collection.Shard1 contains key range [0x40:0x80)
                    Demo_Xdb.Collection.Shard2 contains key range [0x80:+inf)
            

    Note

    +inf means 0xFF (highest key).

    In this example, we want to merge shard key range 0x:0x40 (Demo_Xdb.Collection.Shard0) into shard key range 0x40:0x80 (Demo_Xdb.Collection.Shard1).

  7. Browse to the split/merge service web page and execute the split operation. The following screenshot shows the UI for the Merge operation:

    microsoft-split-merge-ui-web.png

    If you want to merge shard range 0x:0x40 into shard range 0x40:0x80:

    • The Source Range Low Key is 0x.

    • The Source Range High Key 0x40 or tick High is max key.

    • The Target Range Low Key 0x40.

    • The Target Range High Key is 0x80 or tick High is max key.

    When the merge is complete, Demo_Xdb.Collection.Shard1 will contain key range 0x:0x80.

  8. Use the SQL Sharding Deployment Tool to delete the empty source shard. For example:

    .\Sitecore.Xdb.Collection.Database.SqlShardingDeploymentTool.exe /operation deleteShard /connectionstring "user id=sa;password=12345;data source=.\SQL2017;" /shardMapManagerDatabaseName "Sitecore.Xdb.Collection.Database.Sql.ShardMapManagerDb" /shardConnectionString "user id=sa;password=12345;data source=.\SQL2017;" /shardDatabaseName "Sitecore.Xdb.Collection.Database.Sql2_"
  9. Drop indexes for [ShardKey] columns if you created them in step 7.