Configure batch aggregation

Current version: 9.1

Batch aggregation enables you to group interactions together into batches to improve the performance of your aggregation processing framework. In the Sitecore Experience Database (xDB), batch aggregation is part of the standard installation, but you can also change the default settings depending on your specific hardware and solution architecture. You can test the performance of your batch aggregation configuration using the performance counters in this topic.

You should change the default batch aggregation configuration if you want to:

  • Improve the performance of batch aggregation to suit your solution.

  • Enable batch aggregation to use your own custom aggregations.

To change the default batch aggregation configuration you need to edit the Sitecore.Analytics.Processing.Aggregation.Services.config file.

Note

Do not make changes directly to the configuration file, but instead create your own custom configuration patch file that performs the required changes during run time.

Batch aggregation configuration settings

The following settings enable you to adjust the configuration of batch aggregation to improve the performance of your Sitecore solution during aggregation processing.

Configuration node

Description

MaximumBatchSize

The MaximumBatchSize setting enables you to control the maximum number of interactions to include in a single batch.

Adjust this setting according to the performance of your individual Sitecore solution. Consider the following factors when you set the maximum batch size:

  • The number of SQL Server disks you have.

  • Processing power (type and number of processors).

  • Network latency.

  • The specific configuration of your aggregation/processing server.

The default setting is 64 interactions: <paramdesc="maximumBatchSize"type="Sitecore.Analytics.Core.ConfigurationHelper,Sitecore.Analytics.Core"factoryMethod="ToShort"arg0="64"/>

MultiplexingTimeout

Multiplexing collapses multiple aggregation threads into a single transaction just before commit to make processing faster and to reduce the load on SQL Server.

The multiplexing time out setting enables you to specify the maximum time the multiplexer can wait for parallel executing aggregation agents before storing the data set in the reporting database.

Reasons to change the multiplexing timeout include:

  • Network latency between the Processing server and SQL server.

  • Number of Processing agents.

  • Load on SQL Server

Setting the value to 1 second (default value) tells the system to save the combined results produced by aggregation agents each second.The default setting is 1 second: <MultiplexingTimeout>0.00:00:01</MultiplexingTimeout>

Aggregator

Replace the standard implementation of the batch aggregator with your own custom implementation.

RequestResponse
<Aggregator>
<MultiplexingTimeout>0.00:00:01</MultiplexingTimeout>
</Aggregator>

Table-valued parameters (TVP)

Sitecore uses table-valued parameters (TVP) to send aggregated information to SQL server.

For information on how to use table-valued parameters read the Microsoft article Use Table-Valued Parameters (Database Engine).

Using TVP provides a faster way of communicating with SQL Server. It enables you to pass a table as a parameter to a function or stored procedure after the data from the collection database has been grouped together. The use of TVP is optional and is implemented separately for each fact and dimension table.

The SQL table mappings in the Sitecore configuration file have the following properties to support TVP:

  • IsMultiRow – shows whether the stored procedure accepts TVP.

  • TableType – the name of the SQL Server user-defined type that is accepted by the stored procedure. This type must correspond to the structure of the table. It must have the same columns in the same order and use the same types. For more information on user-defined types, read the Microsoft article Working with User-Defined Types in SQL Server.

SQL table mappings are located at the following path: \sitecore\aggregation\routines\ExecRoutineStatementBuilder\mappings and defined in the Sitecore.Analytics.Processing.Aggregation.config file.

It is recommended that you create custom aggregations that support for TVP. See the example for a description of how to configure Sitecore to do this.

Configuration patch file examples

You can customize the following examples to change your default batch aggregation configuration.

Example 1: Changing the default configuration settings

This is an example of a configuration patch file that you can use to change the default batch aggregation configuration file settings. It shows how to change the MaximumBatchSize, MultiplexingTimeout, Interval and MaxThreads parameters for both the live and history aggregators.

RequestResponse
<configuration xmlns:patch="http://www.sitecore.net/xmlconfig/">
<sitecore>
    <aggregation>
    <!-- Configure the aggregator agent: -->
    <aggregator>
            <param desc="aggregator" type="Sitecore.Analytics.Aggregation.InteractionBatchAggregator, Sitecore.Analytics.Aggregation" singleInstance="true">
                <MultiplexingTimeout>0.00:00:01</MultiplexingTimeout>
            </param>
            <param desc="dateTimeStrategy" ref="aggregation/dateTimePrecisionStrategy" />
    </aggregator>
    <!-- Configure the historyWorker agent: -->
    <historyWorker>
            <param desc="aggregator" type="Sitecore.Analytics.Aggregation.InteractionBatchAggregator, Sitecore.Analytics.Aggregation" singleInstance="true">
                <MultiplexingTimeout>0.00:00:01</MultiplexingTimeout>
            </param>
            <param desc="maximumBatchSize" type="Sitecore.Analytics.Core.ConfigurationHelper, Sitecore.Analytics.Core" factoryMethod="ToShort" arg0="128" />
    </historyWorker>
    <!-- Aggregation Module: -->
        <module type="Sitecore.Analytics.Aggregation.AggregationModule" singleInstance="true">
            <BackgroundServices hint="list:Add">
                <aggregator type="Sitecore.Analytics.Core.AsyncBackgroundService">
                    <param desc="interval" type="Sitecore.Analytics.Core.ConfigurationHelper, Sitecore.Analytics.Core" factoryMethod="ToTimeSpan" arg0="0.00:00:15" />
                    <param desc="maxAgents" type="Sitecore.Analytics.Core.ConfigurationHelper, Sitecore.Analytics.Core" factoryMethod="ToInt" arg0="1" />
                </aggregator>
                <contactProcessing type="Sitecore.Analytics.Core.AsyncBackgroundService">
                    <param desc="interval" type="Sitecore.Analytics.Core.ConfigurationHelper, Sitecore.Analytics.Core" factoryMethod="ToTimeSpan" arg0="0.00:00:15" />
                    <param desc="maxAgents" type="Sitecore.Analytics.Core.ConfigurationHelper, Sitecore.Analytics.Core" factoryMethod="ToInt" arg0="1" />
                </contactProcessing>
                <rebuild type="Sitecore.Analytics.Core.AsyncBackgroundService">
                    <param desc="interval" type="Sitecore.Analytics.Core.ConfigurationHelper, Sitecore.Analytics.Core" factoryMethod="ToTimeSpan" arg0="0.00:00:15" />
                    <param desc="maxAgents" type="Sitecore.Analytics.Core.ConfigurationHelper, Sitecore.Analytics.Core" factoryMethod="ToInt" arg0="1" />
                </rebuild>
                <history type="Sitecore.Analytics.Core.AsyncBackgroundService">
                    <param desc="interval" type="Sitecore.Analytics.Core.ConfigurationHelper, Sitecore.Analytics.Core" factoryMethod="ToTimeSpan" arg0="0.00:00:15" />
                    <param desc="maxAgents" type="Sitecore.Analytics.Core.ConfigurationHelper, Sitecore.Analytics.Core" factoryMethod="ToInt" arg0="1" />
                </history>
                <historyCompletionCheck type="Sitecore.Analytics.Core.AsyncBackgroundService">
                    <param desc="interval" type="Sitecore.Analytics.Core.ConfigurationHelper, Sitecore.Analytics.Core" factoryMethod="ToTimeSpan" arg0="0.00:00:15" />
                    <param desc="maxAgents" type="Sitecore.Analytics.Core.ConfigurationHelper, Sitecore.Analytics.Core" factoryMethod="ToInt" arg0="1" />
                </historyCompletionCheck>
            </BackgroundServices>
        </module>
  </aggregation>
</sitecore>
</configuration>

Example 2: Enabling TVP support for a custom aggregation

This is a code sample that you can use to create your own configuration file to enable a custom aggregation to work with batch aggregation. It shows how to integrate a custom processor into the aggregation pipeline for interactions and how to configure mappings for a custom fact table with support for SQL Server TVP.

RequestResponse
<configuration xmlns:patch="http://www.sitecore.net/xmlconfig/">
<sitecore>
    <aggregation>
    <routines>

        <ExecRoutineStatementBuilder>
        <mappings>
            <!-- Reference to the fact table and associated stored procedure and table type -->
            <SqlMappingEntity uid="MyCustomFact"
            type="Sitecore.Analytics.Aggregation.SqlMappingEntity, Sitecore.Analytics.Sql">
            <Table>Fact_CustomFact</Table>
            <Routine>Add_CustomFact_Tvp</Routine>
            <IsMultiRow>True</IsMultiRow>
            <TableType>CustomFact_Type</TableType>
            </SqlMappingEntity>
        </mappings>
        </ExecRoutineStatementBuilder>
    </routines>
    </aggregation>
    <pipelines>
    <group groupName="analytics.aggregation">
        <pipelines>
        <interactions>
            <!-- Registration of the processor that will write to the new fact table -->
            <processor uid="MyCustomFact" type="Customization.CustomProcessor, Customization" />
        </interactions>
        </pipelines>
    </group>
    </pipelines>
  </aggregation>
</sitecore>
</configuration>

Batch aggregation performance counters

If you want to adjust the performance of your batch aggregation processing configuration, or you want to monitor and troubleshoot the aggregation process, you can use the following counters to measure performance.

Counter

Description

Average batch size – live

Displays the average number of interactions contained in each batch processed during live aggregation.

Average batch size – history

Displays the average number of interactions contained in each batch processed during history aggregation. Under normal operation , this counter is equal to MaxBatchSize for most of the time while the history aggregation is active.

Number of batches containing failing items – live

Displays the number of live aggregation batches (per second) containing interactions that have failed to be aggregated or saved to the reporting database. Under normal operation, this counter is always zero.

Number of batches containing failing items – history

Displays the number of history aggregation batches (per second) containing interactions that have failed to be aggregated or saved to the reporting database. Under normal operation, this counter is always zero.

Live Interactions Processed / sec

The number of interactions processed when updating the primary reporting database.

History Interactions Processed / sec

The number of aggregations processed when rebuilding the secondary reporting database.

Total Interactions Processed / sec

The total number of interactions processed by the aggregation subsystem.

Live Aggregation Errors / sec

The number of interactions which could not be aggregated due to an exception, when updating the primary reporting database.

History Aggregation Errors / sec

The number of interactions which could not be aggregated due to an exception, when updating the secondary reporting database.

Total Aggregation Errors / sec

The total number of interactions which could not be aggregated due to an exception.

Average Entity Batch Load Time - Live (ms)

The average time it takes to fetch a batch of entities from XConnect.

Entity Load Failures - Live / sec

The number of entities that failed to be retrieved from XConnect.

Average Aggregation Pipeline Time (ms)

Average time it takes to aggregate a single interaction (including loading all required information, but not including time to write to reporting databases.)

Average Write Time (ms)

Average time it takes to save aggregated data to the reporting database.

Average Write Time - Primary (ms)

Average time in milliseconds it takes to submit an aggregated interaction to the primary reporting database

Average Write Time - Secondary, Live (ms)

Average time in milliseconds it takes to submit an aggregated interaction to the secondary reporting database, when updating the reporting databases with incoming interactions.

Average Write Time - Secondary, History (ms)

Average time in milliseconds it takes to submit an aggregated interaction to the secondary reporting database, when writing historical interactions during database rebuild process.

Active Interaction Aggregators

The number concurrent threads running the aggregation pipeline.

Average Check Out Time - Live (ms)

Average time it takes to obtain a work item from a processing pool while processing live visits.

Average Check In Time - Live (ms)

Average time it takes to mark a work item as completed in a processing pool while processing live visits.

Average Check Out Time - History (ms)

Average time it takes to obtain a work item from a range scheduler while rebuilding the reporting database.

Average Check In Time - History (ms)

Average time it takes to mark a work item as completed in a range scheduler while rebuilding the reporting database.

Average Multiplexer Time (ms)

Average time it takes for multiplexer to save aggregated data to the reporting database.

Do you have some feedback for us?

If you have suggestions for improving this article,