Split or merge xDB Collection database shards
In Sitecore 9.3 and later, you can use Microsoft's split/merge service to redistribute data between shards.
Limitations
In Sitecore 9.3, the split/merge service has the following limitations:
-
It can only be used with SQL Azure.
-
It cannot be used in combination with Always Encrypted. You must decrypt your data before running the split or merge operations.
-
The ShardKey field in the xDB Collection database must have a value. You must use the xDB Collection update tool to update data collected before Sitecore 9.3.
Split a shard
The split operation moves records with a specified shard key range from shard A to shard B. The split operation:
-
Can only split one shard at a time.
-
Can only split one shard map at a time. There are 3 shard maps:
-
DeviceProfileIdShardMap
-
ContactIdShardMap
-
ContactIdentifiersIndexShardMap
NoteYou do not need to split all shard maps. You can choose to split
ContactIdShardMap
only. -
To split three shards, including all shard maps, you must perform 12 operations: add 3 shards and split each shard 3 times - once for each shard map.
To run the split operation:
-
Deploy the Azure split/merge service. The split/merge service includes two web apps, a storage account, and a database (unrelated to Sitecore roles).
For information about self-signed certificate creation when splitting or merging shards, refer to the knowledge base.
-
Disable xConnect to avoid data loss or corruption. To disable xConnect:
-
Put Content Delivery and Content Management roles into CMS-Only mode.
-
Stop all web apps that write to xConnect.
-
Stop the xConnect Collection and xConnect Collection Search service web apps.
-
-
Access your xConnect web application's Kudu service URL. For example:
https://mysite.scm.azurewebsites.net/
-
Browse to the folder that contains the SQL Sharding Deployment Tool:
<xconnect-root>\App_Data\collectiondeployment
NoteIf any of the following SQL Sharding Deployment Tool scripts do not work with Kudu, download the full folder locally, and run the SQL Sharding Deployment Tool scripts from local Powershell with admin permissions. Ensure that the Azure Firewall is set correctly to allow access to the Azure SQL instance from the client that the folder has been downloaded and from where the scripts are going to be run.
-
Set the SQL user to be equivalent to sa user with full Administrator permissions for all SQL Sharding Deployment Tool scripts.
-
Use the The SQL Sharding Deployment Tool to add an empty shard to the shard cluster. For example:
RequestResponse.\Sitecore.Xdb.Collection.Database.SqlShardingDeploymentTool.exe /operation “addShard” /connectionstring "user id=sa;password=12345;data source=(server);" /shardMapManagerDatabaseName "Sitecore.Xdb.Collection.Database.Sql.ShardMapManagerDb" /shardConnectionString "user id=sa;password=12345;data source=.\SQL2017;" /shardnameprefix "Sitecore.Xdb.Collection.Database.Sql" /shardnamesuffix "-db" /dacpac ".\Sitecore.Xdb.Collection.Database.Sql.dacpac"
-
Add the xcsmmuser user and its permissions using the following SQL script in the created shard.
RequestResponseDROP USER IF EXISTS xcsmmuser GO CREATE USER xcsmmuser WITH PASSWORD = 'PasswordHere'; GO EXEC sp_addrolemember 'db_datareader',xcsmmuser; EXEC sp_addrolemember 'db_datawriter',xcsmmuser; GO GRANT EXECUTE TO xcsmmuser; GO
You can find the password for the xcsmmuser in the connection strings (path:
<xconnect-root>\App_Config\ConnectionStrings.config)
. -
Optionally, to improve the performance of the split/merge operation, create indexes on the
[ShardKey]
column of the following tables of each shard:-
Contacts
-
ContactIdentifiers
-
ContactFacets
-
Interactions
-
InteractionFacets
-
DeviceProfiles
-
DeviceProfileFacets
-
ContactIdentifierIndex
-
-
Choose a shard to split and decide where split it (for example, you can choose to do a 50/50 split or a 75/25 split in terms of shard key range). To do this, use the SQL Sharding Deployment tool
printMapping
operation to get the shard key range of all shard maps:RequestResponse.\Sitecore.Xdb.Collection.Database.SqlShardingDeploymentTool.exe /operation printMapping /connectionstring "user id=sa;password=12345;data source=(server);" /shardMapManagerDatabaseName "Sitecore.Xdb.Collection.Database.Sql.ShardMapManagerDb"
The
printMapping
command generates a list of shard maps and their key ranges:RequestResponseCurrent Shard Map ContactIdShardMap state: Demo_Xdb.Collection.Shard0 contains key range [0x:0x80) Demo_Xdb.Collection.Shard1 contains key range [0x80:+inf) Current Shard Map DeviceProfileIdShardMap state: Demo_Xdb.Collection.Shard0 contains key range [0x:0x80) Demo_Xdb.Collection.Shard1 contains key range [0x80:+inf) Current Shard Map ContactIdentifiersIndexShardMap state: Demo_Xdb.Collection.Shard0 contains key range [0x:0x80) Demo_Xdb.Collection.Shard1 contains key range [0x80:+inf)
Note+inf
means0xFF
(highest key).In this example, the shard key range of the
ContactIdShardMap
shard map in theDemo_Xdb.Collection.Shard0
shard is 0x:0x80. To split theContactIdShardMapshard
50/50, split the shard at the 0x40 key. To split the shard 25/75,split the shard at the 0x20 or 0x60 range.NoteA 50/50 split means a 50/50 split in terms of shard key range. The distribution of data across shards might not be entirely uniform.
-
Browse to the split/merge service web page and run the split/merge operation. Provide the correct values in the fields and set the Shard Key Type as a binary as shown in the following image.
When filling in the form:
-
The Split at Key is the key that you chose in step 7. For example, if the key range is 0x:0x80, specify 0x20 to split the shard 25/75.
NoteYou choose a shard by key range, not name. The tool works out which shard to target based on key value you choose to split at.
-
The Split Behavior option determines which key range moves to the new shard. For example, if you split at 0x20 and choose Move range from low key (incl.) to split key (excl.), the range of the new shard will be 0x:0x19.
-
-
Click Submit to run the tool.
-
When the split is complete, drop indexes for
[ShardKey]
columns if you created them in step 6.
Rebuild the xDB index after performing the split operation.
Merge shards
To run the merge operation:
-
Deploy the Azure split/merge service. The split/merge service includes two web apps, a storage account, and a database (unrelated to Sitecore roles).
-
Disable xConnect to avoid data loss or corruption. To disable xConnect:
-
Put Content Delivery and Content Management roles into CMS-Only mode.
-
Stop all web apps that write to xConnect.
-
Stop the Connect Collection and xConnect Collection Search service web apps.
-
-
Access your xConnect web application's Kudu service URL. For example:
https://mysite.scm.azurewebsites.net/
-
Browse to the folder that contains the SQL Sharding Deployment Tool:
<xconnect-root>\App_Data\collectiondeployment
NoteIf any of the following SQL Sharding Deployment Tool scripts do not work with Kudu, download the full folder locally, and run the SQL Sharding Deployment Tool scripts from local Powershell with admin permissions. Ensure that the Azure Firewall is set correctly to allow access to the Azure SQL instance from the client that the folder has been downloaded and from where the scripts are going to be run.
-
Set the SQL user to be equivalent to sa user with full Administrator permissions for all SQL Sharding Deployment Tool scripts.
-
Optionally, to improve the performance of the merge operation, create indexes on the
[ShardKey]
column of the following tables of each shard:-
Contacts
-
ContactIdentifiers
-
ContactFacets
-
Interactions
-
InteractionFacets
-
DeviceProfiles
-
DeviceProfileFacets
-
ContactIdentifierIndex
-
-
Choose shards to merge. To do this, use the SQL Sharding Deployment tool
printMapping
operation to get the shard key range of all shard maps. For example:RequestResponse.\Sitecore.Xdb.Collection.Database.SqlShardingDeploymentTool.exe /operation printMapping /connectionstring "user id=sa;password=12345;data source=(server);" /shardMapManagerDatabaseName "Sitecore.Xdb.Collection.Database.Sql.ShardMapManagerDb"
The following example shows the output of the
printMapping
operation.RequestResponseCurrent Shard Map ContactIdShardMap state: Demo_Xdb.Collection.Shard0 contains key range [0x:0x40) Demo_Xdb.Collection.Shard1 contains key range [0x40:0x80) Demo_Xdb.Collection.Shard2 contains key range [0x80:+inf)
Note+inf
means0xFF
(highest key).In this example, we want to merge shard key range 0x:0x40 (
Demo_Xdb.Collection.Shard0
) into shard key range 0x40:0x80 (Demo_Xdb.Collection.Shard1
). -
Browse to the split/merge service web page and run the split/merge operation. Provide the correct values in the fields and set the Shard Key Type as a binary as shown in the following image:
If you want to merge shard range 0x:0x40 into shard range 0x40:0x80:
-
The Source Range Low Key is 0x.
-
The Source Range High Key 0x40 or select High is max key. High max key is considered as inf.
-
The Target Range Low Key 0x40.
-
The Target Range High Key is 0x80 or tick High is max key. High max key is considered as inf.
When the merge is complete,
Demo_Xdb.Collection.Shard1
will contain key range 0x:0x80. -
-
Use the SQL Sharding Deployment Tool to delete the empty source shard. For example:
RequestResponse.\Sitecore.Xdb.Collection.Database.SqlShardingDeploymentTool.exe /operation deleteShard /connectionstring "user id=sa;password=12345;data source=.\SQL2017;" /shardMapManagerDatabaseName "Sitecore.Xdb.Collection.Database.Sql.ShardMapManagerDb" /shardConnectionString "user id=sa;password=12345;data source=.\SQL2017;" /shardDatabaseName "Sitecore.Xdb.Collection.Database.Sql2_"
-
Drop indexes for
[ShardKey]
columns if you created them in step 6.
Rebuild the xDB index after performing the merge operation.