Split or merge xDB Collection database shards

Version: 10.1

In Sitecore 9.3 and later, you can use Microsoft's split/merge service to redistribute data between shards.

Limitations

In Sitecore 9.3, the split/merge service has the following limitations:

  • It can only be used with SQL Azure.

  • It cannot be used in combination with Always Encrypted. You must decrypt your data before running the split or merge operations.

  • The ShardKey field in the xDB Collection database must have a value. You must use the xDB Collection update tool to update data collected before Sitecore 9.3.

Split a shard

The split operation moves records with a specified shard key range from shard A to shard B. The split operation:

  • Can only split one shard at a time.

  • Can only split one shard map at a time. There are 3 shard maps:

    • DeviceProfileIdShardMap

    • ContactIdShardMap

    • ContactIdentifiersIndexShardMap

    Note

    You do not need to split all shard maps. You can choose to split ContactIdShardMap only.

To split three shards, including all shard maps, you must perform 12 operations: add 3 shards and split each shard 3 times - once for each shard map.

To run the split operation:

  1. Deploy the Azure split/merge service. The split/merge service includes two web apps, a storage account, and a database (unrelated to Sitecore roles).

    For information about self-signed certificate creation when splitting or merging shards, refer to the knowledge base.

  2. Disable xConnect to avoid data loss or corruption. To disable xConnect:

    • Put Content Delivery and Content Management roles into CMS-Only mode.

    • Stop all web apps that write to xConnect.

    • Stop the xConnect Collection and xConnect Collection Search service web apps.

  3. Access your xConnect web application's Kudu service URL. For example: https://mysite.scm.azurewebsites.net/

  4. Browse to the folder that contains the SQL Sharding Deployment Tool: <xconnect-root>\App_Data\collectiondeployment

    Note

    If any of the following SQL Sharding Deployment Tool scripts do not work with Kudu, download the full folder locally, and run the SQL Sharding Deployment Tool scripts from local Powershell with admin permissions. Ensure that the Azure Firewall is set correctly to allow access to the Azure SQL instance from the client that the folder has been downloaded and from where the scripts are going to be run.

  5. Set the SQL user to be equivalent to sa user with full Administrator permissions for all SQL Sharding Deployment Tool scripts.

  6. Use the The SQL Sharding Deployment Tool to add an empty shard to the shard cluster. For example:

    RequestResponse
    .\Sitecore.Xdb.Collection.Database.SqlShardingDeploymentTool.exe /operation “addShard” /connectionstring "user id=sa;password=12345;data source=(server);" /shardMapManagerDatabaseName "Sitecore.Xdb.Collection.Database.Sql.ShardMapManagerDb" /shardConnectionString "user id=sa;password=12345;data source=.\SQL2017;" /shardnameprefix "Sitecore.Xdb.Collection.Database.Sql" /shardnamesuffix "-db" /dacpac ".\Sitecore.Xdb.Collection.Database.Sql.dacpac" 
    
  7. Add the xcsmmuser  user and its permissions using the following SQL script in the created shard.

    RequestResponse
    DROP USER IF EXISTS xcsmmuser
    GO
    CREATE USER xcsmmuser WITH PASSWORD = 'PasswordHere';
    GO
    EXEC sp_addrolemember 'db_datareader',xcsmmuser;
    EXEC sp_addrolemember 'db_datawriter',xcsmmuser;
    GO
    GRANT EXECUTE TO xcsmmuser;
    GO 

    You can find the password for the xcsmmuser in the connection strings (path:<xconnect-root>\App_Config\ConnectionStrings.config).

  8. Optionally, to improve the performance of the split/merge operation, create indexes on the [ShardKey] column of the following tables of each shard:

    • Contacts

    • ContactIdentifiers

    • ContactFacets

    • Interactions

    • InteractionFacets

    • DeviceProfiles

    • DeviceProfileFacets

    • ContactIdentifierIndex

  9. Choose a shard to split and decide where split it (for example, you can choose to do a 50/50 split or a 75/25 split in terms of shard key range). To do this, use the SQL Sharding Deployment tool printMapping operation to get the shard key range of all shard maps:

    RequestResponse
    .\Sitecore.Xdb.Collection.Database.SqlShardingDeploymentTool.exe /operation printMapping /connectionstring "user id=sa;password=12345;data source=(server);" /shardMapManagerDatabaseName "Sitecore.Xdb.Collection.Database.Sql.ShardMapManagerDb"	

    The printMapping command generates a list of shard maps and their key ranges:

    RequestResponse
    Current Shard Map ContactIdShardMap state:
        Demo_Xdb.Collection.Shard0 contains key range [0x:0x80)
        Demo_Xdb.Collection.Shard1 contains key range [0x80:+inf)
    Current Shard Map DeviceProfileIdShardMap state:
        Demo_Xdb.Collection.Shard0 contains key range [0x:0x80)
        Demo_Xdb.Collection.Shard1 contains key range [0x80:+inf)
    Current Shard Map ContactIdentifiersIndexShardMap state:
        Demo_Xdb.Collection.Shard0 contains key range [0x:0x80)
        Demo_Xdb.Collection.Shard1 contains key range [0x80:+inf)                            
    Note

    +inf means 0xFF (highest key).

    In this example, the shard key range of the ContactIdShardMap shard map in the Demo_Xdb.Collection.Shard0 shard is 0x:0x80. To split the ContactIdShardMapshard 50/50, split the shard at the 0x40 key. To split the shard 25/75,split the shard at the 0x20 or 0x60 range.

    Note

    A 50/50 split means a 50/50 split in terms of shard key range. The distribution of data across shards might not be entirely uniform.

  10. Browse to the split/merge service web page and run the split/merge operation. Provide the correct values in the fields and set the Shard Key Type as a binary as shown in the following image.

    The split/merge web page settings for the Split operation

    When filling in the form:

    • The Split at Key is the key that you chose in step 7. For example, if the key range is 0x:0x80, specify 0x20 to split the shard 25/75.

      Note

      You choose a shard by key range, not name. The tool works out which shard to target based on key value you choose to split at.

    • The Split Behavior option determines which key range moves to the new shard. For example, if you split at 0x20 and choose Move range from low key (incl.) to split key (excl.), the range of the new shard will be 0x:0x19.

  11. Click Submit to run the tool.

  12. When the split is complete, drop indexes for [ShardKey] columns if you created them in step 6.

Important

Rebuild the xDB index after performing the split operation.

Merge shards

To run the merge operation:

  1. Deploy the Azure split/merge service. The split/merge service includes two web apps, a storage account, and a database (unrelated to Sitecore roles).

  2. Disable xConnect to avoid data loss or corruption. To disable xConnect:

    • Put Content Delivery and Content Management roles into CMS-Only mode.

    • Stop all web apps that write to xConnect.

    • Stop the Connect Collection and xConnect Collection Search service web apps.

  3. Access your xConnect web application's Kudu service URL. For example: https://mysite.scm.azurewebsites.net/

  4. Browse to the folder that contains the SQL Sharding Deployment Tool: <xconnect-root>\App_Data\collectiondeployment

    Note

    If any of the following SQL Sharding Deployment Tool scripts do not work with Kudu, download the full folder locally, and run the SQL Sharding Deployment Tool scripts from local Powershell with admin permissions. Ensure that the Azure Firewall is set correctly to allow access to the Azure SQL instance from the client that the folder has been downloaded and from where the scripts are going to be run.

  5. Set the SQL user to be equivalent to sa user with full Administrator permissions for all SQL Sharding Deployment Tool scripts.

  6. Optionally, to improve the performance of the merge operation, create indexes on the [ShardKey] column of the following tables of each shard:

    • Contacts

    • ContactIdentifiers

    • ContactFacets

    • Interactions

    • InteractionFacets

    • DeviceProfiles

    • DeviceProfileFacets

    • ContactIdentifierIndex

  7. Choose shards to merge. To do this, use the SQL Sharding Deployment tool printMapping operation to get the shard key range of all shard maps. For example:

    RequestResponse
    .\Sitecore.Xdb.Collection.Database.SqlShardingDeploymentTool.exe /operation printMapping /connectionstring "user id=sa;password=12345;data source=(server);" /shardMapManagerDatabaseName "Sitecore.Xdb.Collection.Database.Sql.ShardMapManagerDb"	

    The following example shows the output of the printMapping operation.

    RequestResponse
            Current Shard Map ContactIdShardMap state:
                    Demo_Xdb.Collection.Shard0 contains key range [0x:0x40)
                    Demo_Xdb.Collection.Shard1 contains key range [0x40:0x80)
                    Demo_Xdb.Collection.Shard2 contains key range [0x80:+inf)
            
    Note

    +inf means 0xFF (highest key).

    In this example, we want to merge shard key range 0x:0x40 (Demo_Xdb.Collection.Shard0) into shard key range 0x40:0x80 (Demo_Xdb.Collection.Shard1).

  8. Browse to the split/merge service web page and run the split/merge operation. Provide the correct values in the fields and set the Shard Key Type as a binary as shown in the following image:

    Split/merge service web page showing the settings for the Merge operaton

    If you want to merge shard range 0x:0x40 into shard range 0x40:0x80:

    • The Source Range Low Key is 0x.

    • The Source Range High Key 0x40 or select High is max key. High max key is considered as inf.

    • The Target Range Low Key 0x40.

    • The Target Range High Key is 0x80 or tick High is max key. High max key is considered as inf.

    When the merge is complete, Demo_Xdb.Collection.Shard1 will contain key range 0x:0x80.

  9. Use the SQL Sharding Deployment Tool to delete the empty source shard. For example:

    RequestResponse
    .\Sitecore.Xdb.Collection.Database.SqlShardingDeploymentTool.exe /operation deleteShard /connectionstring "user id=sa;password=12345;data source=.\SQL2017;" /shardMapManagerDatabaseName "Sitecore.Xdb.Collection.Database.Sql.ShardMapManagerDb" /shardConnectionString "user id=sa;password=12345;data source=.\SQL2017;" /shardDatabaseName "Sitecore.Xdb.Collection.Database.Sql2_"
  10. Drop indexes for [ShardKey] columns if you created them in step 6.

Important

Rebuild the xDB index after performing the merge operation.

Do you have some feedback for us?

If you have suggestions for improving this article,