Data extraction
How to use the data extraction function of the xConnect client API, which allows you to export all contacts and interactions, or a subset, for use in a third-party application.
Data extraction is a function of the xConnect Client API that exports all contacts and interactions, or a subset of them, for use in a third-party application. Rebuilding the reporting database (also known as history aggregation) relies on the data extraction feature.
Data extraction in a multi-shard environment
Data extraction uses a round-robin strategy to call shards.
Shards are requested one by one, and a single read operation returns records from one shard only. If a shard has 500 records and the requested batch size is 1000, only 500 records are returned.
If a shard does not have any relevant data, it is skipped.
In Sitecore 9.2 and later, it is possible to delete contacts and their interactions. If the result of a read operation is 0 records and there are
The following example demonstrates the process of data extraction in an environment with three shards, and a requested batch size of 1000.
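Before data extraction begins, data is distributed across the shards as follows (totals implied by the read operations below): Shard 1 holds 1000 records, Shard 2 holds 500 records, and Shard 3 holds 1200 records.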
First read cursor operation (start at Shard 1): Shard 1 (1000) = 1000 records returned.
Second read cursor operation (start at Shard 2): Shard 2 (500) = 500 records returned.
Note
Shard 3 is not requested to fill the rest of the batch, so only 500 records are returned.
Third read cursor operation (start at Shard 3): Shard 3 (1000) = 1000 records returned.
Fourth read cursor operation (start at Shard 3, because Shards 1 and 2 have no remaining data and are skipped): Shard 3 (200) = 200 records returned.
At this point, data extraction is complete.
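The read sequence in this example can be reproduced with a small simulation. The following is a minimal sketch in Python, not xConnect code. The function name `simulate_extraction` is hypothetical, and the shard totals (1000, 500, and 1200) are inferred from the reads listed above. It also assumes that the cursor moves on to the next shard after every read and passes over shards with no remaining records. The text above does not state this rule for a shard that still holds data after a read, although the assumption does reproduce the sequence shown.

```python
from typing import List, Tuple


def simulate_extraction(shard_sizes: List[int], batch_size: int) -> List[Tuple[int, int]]:
    """Simulate round-robin reads; return (shard_number, records_returned) per read.

    Illustrative model only: this does not call the xConnect Client API.
    """
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")

    remaining = list(shard_sizes)
    reads: List[Tuple[int, int]] = []
    current = 0  # zero-based index of the shard to request next

    while any(count > 0 for count in remaining):
        if remaining[current] > 0:
            # A read never combines shards, so it returns at most what
            # the current shard still holds.
            returned = min(batch_size, remaining[current])
            remaining[current] -= returned
            reads.append((current + 1, returned))
        # Assumption: advance round-robin after every read; shards with
        # no remaining data are skipped on the next pass.
        current = (current + 1) % len(remaining)

    return reads


reads = simulate_extraction([1000, 500, 1200], batch_size=1000)
for number, (shard, returned) in enumerate(reads, start=1):
    print(f"Read {number}: Shard {shard} -> {returned} records")
```

With these inputs the simulation performs four reads (Shard 1, Shard 2, and then Shard 3 twice), which matches the sequence described above. The second read returns only 500 records because Shard 3 is not used to top up the batch.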