Batch file formatting requirements
The Batch API supports uploading gzipped archive files (.gz). The maximum size of the gzipped file is 50MB.
If the size of the gzipped file exceeds the 50MB limit, recompress the original JSON files into two or more gzipped files that do not exceed the 50MB size limit. Then, upload the gzipped files one at a time.
The gzipped file must contain exactly one JSON file. For example, if the data you want to upload is in import.json, you must gzip the JSON file and upload import.json.gz.
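The compression and size check described above can be sketched in Python using only the standard library. This is a minimal sketch, not an official tool: the function names are hypothetical, and it assumes that 50MB means 50,000,000 bytes, which the text does not specify.

```python
import gzip
import os
from typing import List

# Assumption: "50MB" is read as 50,000,000 bytes; the exact definition is not stated.
MAX_GZ_BYTES = 50 * 1000 * 1000


def gzip_json_file(src_path: str) -> str:
    """Compress src_path into src_path + '.gz' (one JSON file per archive)."""
    dst_path = src_path + ".gz"
    with open(src_path, "rb") as src, gzip.open(dst_path, "wb") as dst:
        # Copy in 1 MiB chunks so large files are not loaded into memory at once.
        for chunk in iter(lambda: src.read(1024 * 1024), b""):
            dst.write(chunk)
    return dst_path


def fits_upload_limit(gz_path: str, limit: int = MAX_GZ_BYTES) -> bool:
    """Return True if the gzipped file is within the size limit."""
    return os.path.getsize(gz_path) <= limit


def split_by_lines(src_path: str, parts: int) -> List[str]:
    """Split a JSON file into up to `parts` files, cutting only between records."""
    with open(src_path, "rb") as src:
        lines = src.readlines()
    size = -(-len(lines) // parts)  # ceiling division
    base, ext = os.path.splitext(src_path)
    out_paths = []
    for i in range(parts):
        chunk = lines[i * size:(i + 1) * size]
        if not chunk:
            break
        out_path = f"{base}.part{i + 1}{ext}"
        with open(out_path, "wb") as out:
            out.writelines(chunk)
        out_paths.append(out_path)
    return out_paths
```

If fits_upload_limit returns False for import.json.gz, one option is to call split_by_lines on import.json, gzip each part separately, and upload the resulting archives one at a time.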
JSON records in the JSON file
The JSON file must have each JSON record on a separate row. For example, if you are uploading a JSON file containing guest records, each guest you import must have its own row within the JSON file. If you are uploading a JSON file containing order records, each order that you import must also be in a distinct row.
Each JSON record must be:
- Valid JSON.
- Minified, contained on a single line.
- Terminated with a carriage return.
- Encoded according to RFC 4627.
This sample JSON file (import.json) contains two JSON records for two guests: [email protected] and [email protected]. Email addresses in each record should adhere to the email validation rules. Each record is in a separate row. Each row is valid JSON, although the JSON file as a whole is not.
{"ref":"ace73c06-b361-11ed-afa1-0242ac120002","schema":"guest","mode":"upsert","value":{"guestType":"customer","identifiers":[{"provider":"email","id":"[email protected]"}],"extensions":[{"name":"ext","key":"default","loyaltyNumber":"1234"}]}}
{"ref":"7e1d404e-9fd5-4df5-acfe-3a64848ec594","schema":"guest","mode":"upsert","value":{"guestType":"customer","identifiers":[{"provider":"email","id":"[email protected]"}],"extensions":[{"name":"ext","key":"default","loyaltyNumber":"5678"}]}}
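A file in this one-record-per-row layout can be produced with Python's json module. The following is a minimal sketch with hypothetical function names. It assumes that a line feed ("\n") is an acceptable record terminator; if the service requires a different line ending, change the terminator accordingly.

```python
import json
from typing import Iterable


def to_batch_line(record: dict) -> str:
    """Serialize one record as minified JSON on a single line."""
    # separators=(",", ":") drops the spaces json.dumps adds by default.
    # json.dumps escapes newlines inside strings, so the result never spans rows.
    return json.dumps(record, separators=(",", ":"))


def write_batch_file(records: Iterable[dict], path: str) -> int:
    """Write one record per row and return the number of rows written."""
    count = 0
    # newline="\n" prevents platform-specific line-ending translation.
    with open(path, "w", encoding="utf-8", newline="\n") as out:
        for record in records:
            out.write(to_batch_line(record) + "\n")  # assumed terminator
            count += 1
    return count
```

Reading the file back line by line with json.loads is a quick way to confirm that every row parses on its own.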
To prevent the creation of duplicate records, follow these guidelines:
- Guest records: ensure that you include a unique guest record only once in the JSON file. Including duplicate guest records in the JSON file will result in the creation of duplicate guests in Sitecore CDP.
- Order records: similarly, ensure that you include a unique order record only once in the JSON file to prevent creating duplicate orders in Sitecore CDP.
- Multiple orders for the same guest without a profile in CDP: when uploading multiple orders for a guest without an existing profile in Sitecore CDP, avoid parallel uploads using different threads, as this may create duplicate guests.
For example, if a JSON file includes the following guest records, where the first and last records are duplicates, the identical records will not be merged and will result in the creation of two guests.
{"ref":"ace73c06-b361-11ed-afa1-0242ac120002","schema":"guest","mode":"upsert","value":{"guestType":"customer","identifiers":[{"provider":"email","id":"[email protected]"}],"extensions":[{"name":"ext","key":"default","loyaltyNumber":"1234"}]}}
{"ref":"7e1d404e-9fd5-4df5-acfe-3a64848ec594","schema":"guest","mode":"upsert","value":{"guestType":"customer","identifiers":[{"provider":"email","id":"[email protected]"}],"extensions":[{"name":"ext","key":"default","loyaltyNumber":"5678"}]}}
{"ref":"ace73c06-b361-11ed-afa1-0242ac120002","schema":"guest","mode":"upsert","value":{"guestType":"customer","identifiers":[{"provider":"email","id":"[email protected]"}],"extensions":[{"name":"ext","key":"default","loyaltyNumber":"1234"}]}}
Similarly, if a JSON file includes two order records with the same order reference, two order records are created in the system.
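Because identical records are not merged, it can help to scan a file for repeats before uploading it. The sketch below is one possible pre-upload check, not part of the Batch API: the function name is hypothetical, and it treats two rows as duplicates when their schema and value fields are identical.

```python
import json
from typing import Dict, Iterable, List


def find_duplicate_records(lines: Iterable[str]) -> Dict[int, List[int]]:
    """Map the 1-based row of each first occurrence to the rows that repeat it."""
    first_seen = {}
    duplicates: Dict[int, List[int]] = {}
    for row, line in enumerate(lines, start=1):
        if not line.strip():
            continue  # ignore blank rows
        record = json.loads(line)
        # sort_keys gives a canonical form, so key order does not hide a duplicate.
        key = (record.get("schema"), json.dumps(record.get("value"), sort_keys=True))
        if key in first_seen:
            duplicates.setdefault(first_seen[key], []).append(row)
        else:
            first_seen[key] = row
    return duplicates
```

An empty result means no two rows carry the same schema and value. Records that describe the same guest with different field values are not detected by this check.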
Each JSON record must include the following required fields:
Field name | Type | Description |
---|---|---|
ref | string (Version 4 UUID) | A unique ID that identifies the JSON record during the upload process. This ID is not used inside Sitecore CDP to identify the guest or the record; it is for the upload process only. |
schema | string. Must be one of the supported record types, such as guest. | The type of record contained in the value field. |
mode | string. Must be one of the supported import modes, such as upsert. | Indicates how to import the record into the system. |
value | JSON object | Contains the Sitecore CDP entity. |
Not all fields for each object are required to be present during the upload. The exception is for guest data extensions, where depending on the type of guest data extension, all fields might be required.
When using upsert mode for a guest, you only need to include enough data to identify the guest within Sitecore CDP, as well as the fields to be updated.
When using guest mode for a guest, all fields required to deem the entity valid must be included because the update must result in a valid entity.
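The required top-level fields can be checked row by row before upload. The following is a minimal sketch with a hypothetical function name. The field names ref, schema, mode, and value are taken from the sample records above, and the allowed values for schema and mode are not checked because they are not listed here. The Version 4 check is optional (require_v4) as a design choice of this sketch.

```python
import json
import uuid
from typing import List

REQUIRED_FIELDS = ("ref", "schema", "mode", "value")  # from the sample records


def check_record(line: str, require_v4: bool = False) -> List[str]:
    """Return a list of problems found in one row; an empty list means none."""
    try:
        record = json.loads(line)
    except json.JSONDecodeError as exc:
        return [f"not valid JSON: {exc.msg}"]
    if not isinstance(record, dict):
        return ["record is not a JSON object"]

    problems = [f"missing field: {name}" for name in REQUIRED_FIELDS if name not in record]

    if "ref" in record:
        ref = record["ref"]
        if not isinstance(ref, str):
            problems.append("ref must be a string")
        else:
            try:
                parsed = uuid.UUID(ref)
                if require_v4 and parsed.version != 4:
                    problems.append("ref is not a version 4 UUID")
            except ValueError:
                problems.append("ref is not a UUID")

    for name in ("schema", "mode"):
        if name in record and not isinstance(record[name], str):
            problems.append(f"{name} must be a string")

    if "value" in record and not isinstance(record["value"], dict):
        problems.append("value must be a JSON object")
    return problems
```

Running check_record over every row and collecting the non-empty results gives a simple report of rows to fix before compressing the file.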
Email validation rules
Each email address in the JSON record of a batch import file must meet the following validation criteria:
- The name and domain parts of the email address are validated independently using separate regular expressions (regex).
- The email address must contain an @ symbol between the name and the domain. For example: [email protected]
- Neither the name nor the domain can begin or end with a period (.). For example, [email protected] and [email protected] are invalid, but [email protected] is valid.
- The name and domain can include letters (a-z), numbers (0-9), and the following special characters: !, #, $, %, &, ', *, +, /, =, ?, ^, _, `, {, |, }, ~, and -.
- The domain must be either a valid domain name or an IP address. For example, [email protected], [email protected].
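The rules above can be approximated locally to catch obvious problems before upload. The sketch below is an approximation only: the regular expressions that Sitecore CDP actually uses are not given here, so the patterns, the function name, and the definition of a valid domain name (two or more dot-separated labels of letters, digits, and inner hyphens) are all assumptions. The addresses in the usage note are made-up examples.

```python
import ipaddress
import re

# Assumed pattern for the name part: the listed characters plus inner periods.
NAME_RE = re.compile(r"[A-Za-z0-9!#$%&'*+/=?^_`{|}~.-]+")
# Assumed pattern for one domain label: letters and digits, hyphens only inside.
LABEL_RE = re.compile(r"[A-Za-z0-9](?:[A-Za-z0-9-]*[A-Za-z0-9])?")


def _is_ip_address(text: str) -> bool:
    try:
        ipaddress.ip_address(text)
        return True
    except ValueError:
        return False


def looks_like_valid_email(address: str) -> bool:
    """Approximate the documented rules; not the service's own validation."""
    if address.count("@") != 1:
        return False
    name, domain = address.split("@")
    if not name or not domain:
        return False
    # Neither part may begin or end with a period.
    if name.startswith(".") or name.endswith("."):
        return False
    if domain.startswith(".") or domain.endswith("."):
        return False
    if not NAME_RE.fullmatch(name):
        return False
    if _is_ip_address(domain):
        return True
    labels = domain.split(".")
    return len(labels) >= 2 and all(LABEL_RE.fullmatch(label) for label in labels)
```

For example, looks_like_valid_email("jane.doe@example.com") returns True, while looks_like_valid_email(".jane@example.com") returns False. An address that passes this check can still be rejected by the service.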