Managing the size of your reporting data

Abstract

The Experience Analytics data reducer helps you manage the size of your reporting data. Learn about the configuration options for the reducer.

During data aggregation, Experience Analytics collects facts about each group of each dimension for all the interactions on your websites. These collected facts are stored in fact tables in the reporting database for tracking purposes. The volume of reporting data can become very large depending on the number of visitors to the website. To reduce data volume, the reducer combines the least significant groups that it collects each day into a single group called other. This makes data consumption and read performance more predictable.

Reducer configuration

The reducer chooses the least significant records and collects those records under the other key. It chooses records using the following configurable parameters, which you can find in the Sitecore.ExperienceAnalytics.Reduce.config file:

Name | Description | Default Value
--- | --- | ---
DefaultKeepCountThreshold | The number of records to keep (for example, if the default of 1000 is used and there are 3000 records, the reducer consolidates 2000 records and leaves 1000 intact). If the number of records for a single dimension is fewer than this threshold, the reducer does not reduce anything. | 1000
DefaultValueThreshold | The minimum value that the Value metric must meet for a record to be considered significant. The reducer consolidates records that do not meet the minimum value. | -1 (Disabled)
DefaultVisitThreshold | The minimum value that the Visits metric must have for a segment record to be considered significant. The reducer consolidates records that do not meet this minimum value. | 10
DefaultRetentionPeriod | The amount of time that data is kept before the reducer consolidates it. | 7.00:00:00
DefaultOrderBy | Only affects flexible dimensions. The order in which the reducer sorts records to determine relative significance. You can specify any metric that you want to use as an indicator of significance, for example, visits, value, timeOnSite, and so on. | Visits, abs(Value)
Timeout | The maximum amount of time the reducer works. After this time, the reducer stops. | 01:00:00
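The defaults for DefaultRetentionPeriod and Timeout appear to follow the .NET TimeSpan layout, [d.]hh:mm:ss, so 7.00:00:00 reads as seven days and 01:00:00 as one hour. The following Python sketch shows how such values break down. The parse_timespan helper is hypothetical and is not part of Experience Analytics; it only handles the simple form shown in the defaults.

```python
import re
from datetime import timedelta

# Matches the "[d.]hh:mm:ss" form used by the two time-based defaults.
# Assumption: the values follow the .NET TimeSpan layout.
_TIMESPAN = re.compile(r"^(?:(\d+)\.)?(\d{1,2}):(\d{2}):(\d{2})$")


def parse_timespan(text: str) -> timedelta:
    """Convert a "[d.]hh:mm:ss" string into a timedelta (hypothetical helper)."""
    match = _TIMESPAN.match(text.strip())
    if match is None:
        raise ValueError(f"not a [d.]hh:mm:ss value: {text!r}")
    # The day component is optional, so a missing group counts as 0.
    days, hours, minutes, seconds = (int(g) if g else 0 for g in match.groups())
    return timedelta(days=days, hours=hours, minutes=minutes, seconds=seconds)


retention = parse_timespan("7.00:00:00")  # DefaultRetentionPeriod
timeout = parse_timespan("01:00:00")      # Timeout
print(retention.days, timeout.total_seconds())
```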

For example, if 10,000 unique pages are visited in a single day, the pages table receives facts about 10,000 pages for that day. By default, the reducer keeps the facts about the 1,000 most significant pages (DefaultKeepCountThreshold) and consolidates the facts for the 9,000 less significant pages under the other key. The reducer processes that day's records only after the retention period has passed, which is 7 days by default.
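The arithmetic in this example can be sketched in Python. This is an illustration under stated assumptions, not the reducer's actual implementation: the function names are hypothetical, significance is modeled by visit count only, and a day's facts are assumed to become eligible once they are strictly older than the retention period.

```python
from datetime import date, timedelta

KEEP_COUNT = 1000              # DefaultKeepCountThreshold
RETENTION = timedelta(days=7)  # DefaultRetentionPeriod


def is_eligible(fact_date: date, today: date) -> bool:
    """Assumption: facts are reduced once they are older than the retention period."""
    return today - fact_date > RETENTION


def reduce_day(visits_by_page: dict[str, int], keep: int = KEEP_COUNT) -> dict[str, int]:
    """Keep the `keep` most visited pages and fold the rest into one "other" entry."""
    if len(visits_by_page) <= keep:
        return dict(visits_by_page)  # Below the threshold: nothing is reduced.
    ranked = sorted(visits_by_page.items(), key=lambda item: item[1], reverse=True)
    reduced = dict(ranked[:keep])
    reduced["other"] = sum(visits for _, visits in ranked[keep:])
    return reduced


# 10,000 unique pages visited in one day, with made-up visit counts.
pages = {f"/page-{i}": 10_000 - i for i in range(10_000)}
result = reduce_day(pages)
print(len(pages), "->", len(result))
```

In this sketch the total number of visits is unchanged by the reduction: the visits of the 9,000 consolidated pages are summed into the single other entry, so only the per-page detail is lost.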

The reducer and legacy dimensions

If you are using legacy dimensions, the reducer works as follows:

  • If the visit count for a page is less than the DefaultVisitThreshold, the page is considered insignificant.

  • If the value for a page is less than the DefaultValueThreshold, the page is considered insignificant.

  • If more than 1000 significant pages remain (the default DefaultKeepCountThreshold), the reducer orders them by Visits and then by Value, and then consolidates the records for all pages except the top 1000.
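The three rules for legacy dimensions can be sketched as follows. This is a minimal illustration, not the reducer's actual code: PageFact, is_significant, and split_legacy are hypothetical names, the sort is assumed to be descending, and a negative DefaultValueThreshold is assumed to switch the value check off, in line with the default of -1 (Disabled).

```python
from dataclasses import dataclass

VISIT_THRESHOLD = 10  # DefaultVisitThreshold
VALUE_THRESHOLD = -1  # DefaultValueThreshold; assumed disabled when negative
KEEP_COUNT = 1000     # DefaultKeepCountThreshold


@dataclass(frozen=True)
class PageFact:
    page: str
    visits: int
    value: int


def is_significant(fact: PageFact) -> bool:
    """A page is insignificant if it fails either threshold."""
    if fact.visits < VISIT_THRESHOLD:
        return False
    if VALUE_THRESHOLD >= 0 and fact.value < VALUE_THRESHOLD:
        return False
    return True


def split_legacy(facts: list[PageFact], keep: int = KEEP_COUNT):
    """Return (kept, consolidated): the top `keep` significant pages and everything else."""
    significant = [f for f in facts if is_significant(f)]
    # Order by Visits, then by Value, most significant first.
    significant.sort(key=lambda f: (f.visits, f.value), reverse=True)
    kept = significant[:keep]
    kept_pages = {f.page for f in kept}
    consolidated = [f for f in facts if f.page not in kept_pages]
    return kept, consolidated


facts = [
    PageFact("/a", 50, 5),
    PageFact("/b", 50, 9),
    PageFact("/c", 3, 100),  # Fewer than 10 visits: insignificant.
    PageFact("/d", 12, 0),
    PageFact("/e", 40, 1),
]
kept, consolidated = split_legacy(facts, keep=2)
print([f.page for f in kept], [f.page for f in consolidated])
```

With keep set to 2 for readability, /b and /a are kept because they tie on visits and /b has the higher value. The remaining pages, including the insignificant /c, are the ones that would be consolidated.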

The reducer and flexible dimensions

If you are using flexible dimensions, the reducer works as follows:

  • If more than 1000 significant pages remain (the default DefaultKeepCountThreshold), the reducer orders them by the metrics specified in DefaultOrderBy and then consolidates the records for all pages except the top 1000.
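To illustrate how an ordering such as the default Visits, abs(Value) ranks records, here is a small Python sketch. It is not the reducer's implementation: make_sort_key and keep_top are hypothetical helpers, only plain metric names and abs(metric) terms are recognized, records are modeled as dictionaries with lowercase metric names, and a descending sort is assumed.

```python
import re

# Recognizes either "abs(Metric)" or a bare "Metric" term.
_TERM = re.compile(r"^(abs)\((\w+)\)$|^(\w+)$")


def make_sort_key(order_by: str):
    """Build a sort key from a comma-separated list such as "Visits, abs(Value)"."""
    getters = []
    for term in order_by.split(","):
        match = _TERM.match(term.strip())
        if match is None:
            raise ValueError(f"unsupported order-by term: {term!r}")
        if match.group(1):  # abs(Metric)
            name = match.group(2).lower()
            getters.append(lambda record, n=name: abs(record[n]))
        else:               # Metric
            name = match.group(3).lower()
            getters.append(lambda record, n=name: record[n])
    return lambda record: tuple(get(record) for get in getters)


def keep_top(records: list[dict], order_by: str, keep: int) -> list[dict]:
    """Return the `keep` most significant records under the given ordering."""
    return sorted(records, key=make_sort_key(order_by), reverse=True)[:keep]


records = [
    {"page": "/a", "visits": 20, "value": -30},
    {"page": "/b", "visits": 20, "value": 10},
    {"page": "/c", "visits": 5, "value": 99},
]
top = keep_top(records, "Visits, abs(Value)", keep=2)
print([r["page"] for r in top])
```

Under the default ordering, /a ranks above /b because both have 20 visits and the absolute value of -30 is larger than 10. Ordering by abs(Value) alone would instead put /c first, which shows how the choice of metric changes which records count as significant.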