Data feed file format and structure
This topic describes how to format and structure your data in a way that Sitecore Discover can process. Use the information in this topic to create valid data feed files.
File naming convention
Sitecore Discover data feed files must follow a specific naming convention. The naming convention for each data feed type is as follows:
Data feed type | File format |
---|---|
Product data feed | |
Cross-category feed | |
Category data feed | |
Fitment data feed | |
Sales data | |
Encoding attribute values
Discover data feed files require UTF-8 text encoding.
Some common characters that must be encoded include quotes, most accents, and most characters that are not supported in ASCII. Refer to the UTF-8 standard for details.
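As a minimal sketch of the encoding requirement, the following Python helpers write rows as UTF-8 and check that an existing file decodes as UTF-8. The function names and the file name used in the usage note are hypothetical and are not part of Discover.

```python
def write_feed_utf8(path: str, lines: list[str]) -> None:
    """Write feed rows as UTF-8 text, one row per line."""
    # newline="" stops Python from translating "\n" into the platform default.
    with open(path, "w", encoding="utf-8", newline="") as f:
        for line in lines:
            f.write(line + "\n")


def is_valid_utf8(path: str) -> bool:
    """Return True if the whole file decodes as UTF-8."""
    try:
        with open(path, "rb") as f:
            f.read().decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False
```

For example, calling `is_valid_utf8("feed.txt")` on a file saved as Latin-1 that contains accented characters returns `False`, which indicates that the file needs to be re-encoded before upload.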
Using escape characters
Use a backslash (\) character in escape character sequences.
All reserved characters included in attribute values, such as new line, tab, column separator, list separator, quotes, and characters such as #, <, >, ;, ', |, /, :, and so on, must be escaped.
Description values often contain tab and new line characters, and missing escape sequences in descriptions are a common cause of errors during data feed processing. Make sure to escape characters where required to prevent these errors.
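The following Python sketch shows one way to escape an attribute value before writing it to a feed file. This topic names the reserved characters but not the exact escape sequences, so the mappings here are assumptions: tab, new line, and carriage return become `\t`, `\n`, and `\r`, and every other reserved character is prefixed with a backslash. The `escape_value` function is hypothetical.

```python
# Assumed mappings for control characters (not confirmed by this topic).
CONTROL_ESCAPES = {"\n": "\\n", "\t": "\\t", "\r": "\\r"}

# Reserved characters listed in this topic, plus the comma (list separator)
# and the double quote.
RESERVED = set("#<>;'|/:,\"")


def escape_value(value: str) -> str:
    """Return value with reserved characters escaped using a backslash."""
    out = []
    for ch in value:
        if ch == "\\":
            out.append("\\\\")  # escape the escape character itself
        elif ch in CONTROL_ESCAPES:
            out.append(CONTROL_ESCAPES[ch])
        elif ch in RESERVED:
            out.append("\\" + ch)
        else:
            out.append(ch)
    return "".join(out)
```

With these assumptions, a description such as `Size: 10` followed by a tab and `Blue` is written as `Size\: 10\tBlue`, so the value no longer contains a raw tab or an unescaped colon.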
HTML tags
HTML tags are not supported. Do not include HTML tags in a data feed file. Discover does not extract them.
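If your source descriptions contain markup, you can remove it before you generate the feed. The following is a minimal sketch that uses Python's standard `html.parser` module. The `strip_html` helper is hypothetical, and collapsing whitespace into single spaces is a design choice of this sketch, not a Discover requirement.

```python
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collect only the text content of an HTML fragment."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []

    def handle_data(self, data):
        self.parts.append(data)


def strip_html(value: str) -> str:
    """Remove tags, decode character references, and collapse whitespace."""
    parser = _TextExtractor()
    parser.feed(value)
    parser.close()  # flush any buffered text
    return " ".join("".join(parser.parts).split())
```

Because whitespace is collapsed, this step also removes raw tab and new line characters from the text it returns.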
Separators
The following table lists separators you use to separate specific elements in data feed files.
To separate | Use | Example |
---|---|---|
Columns | Tab | |
Items within list | Comma (,) | |
Rows | New line | |
Element values if value represents an object with multiple elements | Semicolon (;) | |
An element value that is a list within an object | Number symbol (#) | Where fitment = a list of |
Categories in a category hierarchy | Greater than symbol (>) | |
A list of store or reseller ID and value pairs (for store or reseller-specific values) | Colon (:) | For example, following is a list of store ID and price values: |
Hierarchies in a two-level hierarchy such as in a group of store or reseller IDs, followed by a value pair. | Slash (/) | For example, following are examples of lists with two-level hierarchies, one with group IDs, and the other with store IDs, each showing price override values: |
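To illustrate how the simpler separators combine, the following Python sketch assembles one feed row from a list value and a category hierarchy. The helper names and sample values are hypothetical, the values are assumed to be escaped already, and the sketch assumes no spaces around the separators.

```python
def join_list(items: list[str]) -> str:
    """Join the items of a list value with commas."""
    return ",".join(items)


def join_category_path(levels: list[str]) -> str:
    """Join the levels of a category hierarchy with the greater than symbol."""
    return ">".join(levels)


def build_row(columns: list[str]) -> str:
    """Join column values with tabs to form one row."""
    return "\t".join(columns)


def build_feed(header: list[str], rows: list[list[str]]) -> str:
    """Join the header row and the data rows with new lines."""
    lines = [build_row(header)] + [build_row(r) for r in rows]
    return "\n".join(lines) + "\n"


row = build_row([
    "Shirt",
    join_list(["red", "blue"]),
    join_category_path(["Men", "Tops"]),
])
```

Here `row` holds three tab-separated columns: the product name, the comma-separated list `red,blue`, and the category path `Men>Tops`.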
Guidelines for defining attributes as column names
In your data feed file, each product attribute is represented by a column name.
Separate columns using tab.
During data feed processing, Discover derives attribute names from column names. If column names in your data files do not meet the following rules, they might be transformed during processing.
The following rules apply when defining product attributes as column names in data feed files:
- Attribute names can only contain letters, numbers, and underscore characters.
- Attribute names must start with a lowercase letter.
  Note: We recommend using snake case in attribute names, that is, lowercase only, with words separated by underscores. For example: color_display_name.
- Attribute names must end with a letter or a number.
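The three naming rules can be checked before upload with a single regular expression. The following is a minimal Python sketch. It assumes that "letters" means ASCII letters, and the `is_valid_attribute_name` function is hypothetical.

```python
import re

# First character: a lowercase letter.
# Middle characters: letters, numbers, or underscores.
# Last character: a letter or a number (a one-character name is allowed).
ATTRIBUTE_NAME = re.compile(r"[a-z](?:[A-Za-z0-9_]*[A-Za-z0-9])?")


def is_valid_attribute_name(name: str) -> bool:
    """Return True if name satisfies all three column naming rules."""
    return ATTRIBUTE_NAME.fullmatch(name) is not None
```

For example, `color_display_name` passes, while `Color` (starts with an uppercase letter), `color_` (ends with an underscore), and `color-name` (contains a hyphen) do not.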
Reserved attribute prefixes and names
Sitecore reserves some attribute prefixes and names that Discover uses for internal processing. Do not include them in your data feeds.
Sitecore reserves the following attribute prefixes:
- rfk_
- r_
- stats_
Sitecore reserves the following attribute names:
- id
- skuid
- children
- swatch
- swatch2
- fitment
- is_disabled
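A feed generator can reject reserved names before they reach Discover. The following Python sketch checks a column name against the prefixes and names listed above. The `is_reserved` function is hypothetical, and the sketch assumes the comparison is exact and case-sensitive.

```python
RESERVED_PREFIXES = ("rfk_", "r_", "stats_")
RESERVED_NAMES = frozenset(
    {"id", "skuid", "children", "swatch", "swatch2", "fitment", "is_disabled"}
)


def is_reserved(name: str) -> bool:
    """Return True if name is a reserved name or starts with a reserved prefix."""
    # str.startswith accepts a tuple and matches any of its entries.
    return name in RESERVED_NAMES or name.startswith(RESERVED_PREFIXES)
```

Note that a name such as `rating` is not reserved, because it only shares the first letter with the `r_` prefix.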
Maximum attribute value length
When defining attribute values of type string, make sure that your strings are short and meaningful. Typical attribute values rarely exceed 128 bytes in size.
The maximum size of an attribute value in a data feed is 16 KB.
Discover computes availability, locale, and override attribute values after splitting them at the delimiters. For these attributes, the size limit applies to each segment separated by the delimiters.
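The size limit can be checked on the UTF-8 encoded bytes of a value. The following Python sketch assumes that 16 KB means 16 × 1024 bytes. Because this topic does not say which delimiters apply to a given attribute, the delimiters are passed in as a parameter. The function names are hypothetical, and escaped delimiters are not handled.

```python
import re

MAX_VALUE_BYTES = 16 * 1024  # assumption: 1 KB = 1024 bytes


def split_segments(value: str, delimiters: str = "") -> list[str]:
    """Split value at any of the given delimiter characters."""
    if not delimiters:
        return [value]
    pattern = "[" + re.escape(delimiters) + "]"
    return re.split(pattern, value)


def fits_limit(value: str, delimiters: str = "") -> bool:
    """Return True if every segment is within the size limit when UTF-8 encoded."""
    return all(
        len(segment.encode("utf-8")) <= MAX_VALUE_BYTES
        for segment in split_segments(value, delimiters)
    )
```

Measuring bytes rather than characters matters for non-ASCII text: a string of 8,193 `é` characters is under 16,384 characters but takes 16,386 bytes in UTF-8, so it exceeds the limit.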