Batch aggregation configuration

Current version: 8.2

Batch aggregation enables you to group interactions together into batches to improve the performance of your aggregation processing framework. In the Sitecore Experience Database (xDB), batch aggregation is part of the standard installation, but you can also change the default settings depending on your specific hardware and solution architecture. You can test the performance of your batch aggregation configuration using the performance counters in this topic.

You should change the default batch aggregation configuration if you want to:

  • Improve the performance of batch aggregation to suit your solution.

  • Enable batch aggregation to use your own custom aggregations.

To change the default batch aggregation configuration you need to edit the Sitecore.Analytics.Processing.Aggregation.Services.config file.

Note

Do not make changes directly to the configuration file, but instead create your own custom configuration patch file that performs the required changes during run time.

Batch aggregation configuration settings

The following settings enable you to adjust the configuration of batch aggregation to improve the performance of your Sitecore solution during aggregation processing.

Configuration node

Description and example

MaximumBatchSize

The MaximumBatchSize setting enables you to control the maximum number of interactions to include in a single batch.

Adjust this setting according to the performance of your individual Sitecore solution. Consider the following factors when you set the maximum batch size:

- The number of SQL Server disks you have.

- Processing power (type and number of processors).

- Network latency.

- The specific configuration of your aggregation/processing server.

The default setting is 64 interactions:

RequestResponse
<MaximumBatchSize>64</MaximumBatchSize>

MultiplexingTimeout

Multiplexing collapses multiple aggregation threads into a single transaction just before commit to make processing faster and to reduce the load on SQL Server.

The multiplexing time out setting enables you to specify the maximum time the multiplexer can wait for parallel executing aggregation agents before storing the data set in the reporting database.

Setting the value to 1 second (default value) tells the system to save the combined results produced by aggregation agents each second.

The default setting is 1 second:

RequestResponse
<MultiplexingTimeout>0.00:00:01</MultiplexingTimeout>

Aggregator

Replace the standard implementation of the batch aggregator with your own custom implementation.

RequestResponse
<Aggregator>
  <MultiplexingTimeout>0.00:00:01</MultiplexingTimeout>
</Aggregator>

Table-valued parameters (TVP)

Sitecore uses table-valued parameters (TVP) to send aggregated information to SQL server.

For information on how to use table-valued parameters read the Microsoft article Use Table-Valued Parameters (Database Engine).

Using TVP provides a faster way of communicating with SQL Server. It enables you to pass a table as a parameter to a function or stored procedure after the data from the collection database has been grouped together. The use of TVP is optional and is implemented separately for each fact and dimension table.

The SQL table mappings in the Sitecore configuration file have the following properties to support TVP:

  • IsMultiRow – shows whether the stored procedure accepts TVP.

  • TableType – the name of the SQL Server user-defined type that is accepted by the stored procedure. This type must correspond to the structure of the table. It must have the same columns in the same order and use the same types. For more information on user-defined types, read the Microsoft article Working with User-Defined Types in SQL Server.

SQL table mappings are located at the following path: /sitecore/aggregation/routines/ExecRoutineStatementBuilder/mappings and defined in the Sitecore.Analytics.Processing.Aggregation.config file.

It is recommended that you create custom aggregations that support for TVP. See the example for a description of how to configure Sitecore to do this.

Configuration patch file examples

You can customize the following examples to change your default batch aggregation configuration.

Example 1: Changing the default configuration settings

This is an example of a configuration patch file that you can use to change the default batch aggregation configuration file settings. It shows how to change the MaximumBatchSize, MultiplexingTimeout, Interval and MaxThreads parameters for both the live and history aggregators.

RequestResponse
<configuration xmlns:patch="http://www.sitecore.net/xmlconfig/">
  <sitecore>
    <aggregation>
      <!-- Configure the aggregator agent: -->
      <aggregator>
        <Aggregator>
          <MultiplexingTimeout>0.00:00:01</MultiplexingTimeout>
        </Aggregator>
        <MaximumBatchSize>128</MaximumBatchSize>
      </aggregator>
      <!-- Configure the historyWorker agent: -->
      <historyWorker>
        <Aggregator>
          <MultiplexingTimeout>0.00:00:01</MultiplexingTimeout>
        </Aggregator>
        <MaximumBatchSize>128</MaximumBatchSize>
      </historyWorker>
      <!-- Aggregation Module: -->
      <module type="Sitecore.Analytics.Aggregation.AggregationModule">
        <BackgroundServices>
          <aggregator>
            <Interval>0.00:00:15</Interval>
            <MaxThreads>1</MaxThreads>
          </aggregator>
          <history>
            <Interval>0.00:00:15</Interval>
            <MaxThreads>1</MaxThreads>
          </history>
        </BackgroundServices>
      </module>
  </sitecore>
</configuration>

Example 2: Enabling TVP support for a custom aggregation

This is a code sample that you can use to create your own configuration file to enable a custom aggregation to work with batch aggregation. It shows how to integrate a custom processor into the aggregation pipeline for interactions and how to configure mappings for a custom fact table with support for SQL Server TVP.

RequestResponse
<configuration xmlns:patch="http://www.sitecore.net/xmlconfig/">
  <sitecore>
    <aggregation>
      <routines>
        <ExecRoutineStatementBuilder>
          <mappings>
            <!-- Reference to the fact table and associated stored procedure and table type -->
            <SqlMappingEntity uid="MyCustomFact" 
               type="Sitecore.Analytics.Aggregation.SqlMappingEntity, Sitecore.Analytics.Sql">
              <Table>Fact_CustomFact</Table>
              <Routine>Add_CustomFact_Tvp</Routine>
              <IsMultiRow>True</IsMultiRow>
              <TableType>CustomFact_Type</TableType>
            </SqlMappingEntity>
          </mappings>
        </ExecRoutineStatementBuilder>
      </routines>
    </aggregation>
    <pipelines>
      <group groupName="analytics.aggregation">
        <pipelines>
          <interactions>
            <!-- Registration of the processor that will write to the new fact table -->
            <processor uid="MyCustomFact" type="Customization.CustomProcessor, Customization" />
          </interactions>
        </pipelines>
      </group>
    </pipelines>
  </sitecore>
</configuration>

Batch aggregation performance counters

If you want to adjust the performance of your batch aggregation processing configuration, or you want to monitor and troubleshoot the aggregation process, you can use the following counters to measure performance.

Counter

Description

Average batch size – live

Displays the average number of interactions contained in each batch processed during live aggregation.

Average batch size – history

Displays the average number of interactions contained in each batch processed during history aggregation. Under normal operation, this counter is equal to MaxBatchSize for most of the time while the history aggregation is active.

Number of batches containing failing items – live

Displays the number of live aggregation batches (per second) containing interactions that have failed to be aggregated or saved to the reporting database. Under normal operation, this counter is always zero.

Number of batches containing failing items – history

Displays the number of history aggregation batches (per second) containing interactions that have failed to be aggregated or saved to the reporting database. Under normal operation, this counter is always zero.

Average write time – live

Displays the average time (in milliseconds) that it takes during live aggregation for a batch to be saved to the reporting database. A number larger than the configured multiplexing timeout indicates a bottleneck.

Do you have some feedback for us?

If you have suggestions for improving this article,