Configuring the xConnect Search Indexer

Abstract

An overview of Search Indexer configuration settings, including settings for different search collection providers.

This topic describes the configuration settings that affect the xConnect Search Indexer. The xConnect Search Indexer has six XML configuration files. Going from generic configuration to provider-specific configuration, these are: sc.XConnect.SearchIndexer.xml, sc.Xdb.Collection.IndexerSettings.xml, sc.Xdb.Collection.Data.Sql.xml, sc.Xdb.Collection.Data.MongoDb.xml, sc.Xdb.Collection.IndexWriter.AzureSearch.xml, and sc.Xdb.Collection.IndexWriter.SOLR.xml.

The xConnect Search Indexer is affected by the underlying provider’s change tracking retention period configuration.

In a single-machine on-prem deployment, indexer configuration is located under C:\<Path to xConnect>\App_data\jobs\continuous\IndexWorker\App_data\Config.

Important

The xConnect Search Indexer does not use all of the configuration settings available in the \App_data\Config sub-folder. For example, the indexer does not use the settings in sc.Xdb.Collection.RepositorySettings.xml.

The following settings control how often the indexer checks for changes.

File path: C:\<Path to xConnect>\App_data\jobs\continuous\IndexWorker\App_data\Config\Sitecore\SearchIndexer\sc.XConnect.SearchIndexer.xml

Setting

Description

Frequency

Controls how often the indexer checks for changes. For example, with a Frequency of 0.25 seconds:

If the indexer is busy for 0.20 seconds, it waits 0.05 seconds before checking again.

If the indexer is busy for longer than 0.25 seconds, it checks for changes immediately after indexing the previous set.

DelayAfterError

Controls how long the indexer waits before retrying after encountering an error.

DelayAfterRecurringError

Controls how long the indexer waits before retrying after hitting the RecurringErrorThreshold.

If an error keeps recurring, increase this value to reduce the number of log entries.

RecurringErrorThreshold

Controls how many times an error can occur before the DelayAfterRecurringError setting takes effect.
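The timing rules above can be sketched in a few lines. This is only an illustration of the described behavior, not the indexer's actual code: the function names are hypothetical, the 0.25-second Frequency is the example value used above, and the sketch assumes that the recurring-error delay applies once the number of consecutive errors reaches RecurringErrorThreshold.

```python
from datetime import timedelta


def wait_before_next_check(frequency: timedelta, busy_for: timedelta) -> timedelta:
    """Return the time left in the current Frequency window, never negative."""
    return max(timedelta(0), frequency - busy_for)


def delay_after_failure(
    consecutive_errors: int,
    recurring_error_threshold: int,
    delay_after_error: timedelta,
    delay_after_recurring_error: timedelta,
) -> timedelta:
    """Pick the retry delay (assumption: threshold compares consecutive errors)."""
    if consecutive_errors >= recurring_error_threshold:
        return delay_after_recurring_error
    return delay_after_error


# Example with a Frequency of 0.25 seconds and 0.20 seconds of indexing work.
frequency = timedelta(milliseconds=250)
print(wait_before_next_check(frequency, timedelta(milliseconds=200)))
```

The delay values and threshold you pass to delay_after_failure are illustrative inputs, not the shipped defaults.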

The following settings control memory usage during index rebuild.

File path: C:\<Path to xConnect>\App_data\jobs\continuous\IndexWorker\App_data\Config\Sitecore\SearchIndexer\sc.Xdb.Collection.IndexerSettings.xml

Setting

Description

IncomingDataLagOnCompletion

At the end of an index rebuild, the indexer must catch up with incoming data. When the indexer is behind by less than the value specified by IncomingDataLagOnCompletion, the cores are swapped.

ParallelizationDegree

Controls how many parallel streams of data are processed during an index rebuild. Use together with BatchSize to tune memory usage during an index rebuild.

If the indexer is consuming too much memory, reduce this value.

BatchSize

Controls how many contacts or interactions are loaded at once per parallel stream during an index rebuild. To tune memory usage during an index rebuild, use this together with ParallelizationDegree.

If an index rebuild is consuming too much memory, reduce this value.

If the indexer is not consuming very much memory, you can try increasing this value.

SplitRecordsThreshold

Controls the limit for the number of records loaded into memory (per core) by the indexer at any one time.

If the indexer is using too much memory, decrease this value.

To disable splitting, set this value to 0, a negative value, or remove the element completely.
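As a rough way to reason about these settings together, the sketch below estimates how many records a rebuild might hold in memory at once and models when splitting is disabled. It assumes that each parallel stream holds one batch at a time, which the settings suggest but this topic does not state outright. The function names are hypothetical.

```python
from typing import Optional


def records_in_flight(parallelization_degree: int, batch_size: int) -> int:
    """Estimate records loaded at once (assumes one batch per parallel stream)."""
    return parallelization_degree * batch_size


def splitting_enabled(split_records_threshold: Optional[int]) -> bool:
    """Splitting is off for 0, negative values, or a missing element (None)."""
    return split_records_threshold is not None and split_records_threshold > 0


# Halving either setting halves the estimate, so tune them together.
print(records_in_flight(parallelization_degree=4, batch_size=1000))
print(records_in_flight(parallelization_degree=2, batch_size=1000))
```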

The following settings apply if you are using the SQL xDB Collection provider.

File path: C:\<Path to xConnect>\App_data\jobs\continuous\IndexWorker\App_data\Config\Sitecore\Collection\sc.Xdb.Collection.Data.Sql.xml

Setting

Description

NumberOfChangeVersions

Controls the number of transactions (not records) that the change table returns. For example, 500 transactions might include many more than 500 changes.

To prevent the indexer from falling behind and to keep the number of pending changes low, the default value is high (5000). However, high values increase the indexer's memory requirements.

If the indexer is consuming too much memory, try decreasing the value of SplitRecordsThreshold. Reducing the NumberOfChangeVersions setting does not work well, because only a small amount of data (IDs, facet keys, and types of change) is loaded from the change table.

GetChangesCommandTimeoutInSeconds

Controls the timeout for the get changes command. If you experience timeouts during indexing, increase this value.

CommandTimeoutInSeconds

Controls the timeout for collection provider commands such as GET and SAVE.

AbsoluteExpiration

The SQL provider caches meta-data about sharded cluster configuration. This setting controls the expiration of the cache.
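To make the difference between transactions and changes concrete, the toy model below counts how many changes come back when NumberOfChangeVersions limits the number of transactions. It does not reflect the provider's actual change tracking query. The function name and the sample data are hypothetical.

```python
from typing import List


def changes_returned(changes_per_transaction: List[int], number_of_change_versions: int) -> int:
    """Sum the changes contained in the first N pending transactions."""
    return sum(changes_per_transaction[:number_of_change_versions])


# 800 pending transactions that each touch 3 records (made-up data).
pending = [3] * 800
print(changes_returned(pending, number_of_change_versions=500))
```

In this model a limit of 500 transactions yields 1500 changes, which is why the setting bounds transactions rather than records.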

The following settings apply if you are using the MongoDB xDB Collection provider.

File path: C:\<Path to xConnect>\App_data\jobs\continuous\IndexWorker\App_data\Config\Sitecore\Collection\sc.Xdb.Collection.Data.MongoDb.xml

Setting

Default

Description

ContactIdentifierIndexLockTimeInSeconds

60

The maximum identifier lock time, in seconds. An assigned lock is released within this time.

NumberOfRecordChangesToReadInRequest

100

Change tracking stores each batch of entity changes as a document in a change tracking collection. The provider reads the given number of documents from the change tracking collection in a single batch. This setting influences the performance of reading data and changes. You can increase the value if you have a significant number of small batches of entity changes.

NumberOfChangesPerBatch

100000

The maximum number of changes returned by the provider per request.

WaitTimeForBatchToCompleteInSeconds

100

The maximum time, in seconds, spent saving the changes of a single batch.

MaximumNumberOfRetriesForReadOperations

6

The maximum number of times to retry a data read operation after a failure.

DelayBetweenRetriesForReadOperationsInMilliseconds

5000

The delay time (milliseconds) between two database read attempts.

MaximumNumberOfRetriesForInitializationOperations

6

The maximum number of times to retry a provider initialization operation after a failure.

DelayBetweenRetriesForInitializationOperationsInMilliseconds

5000

The delay time (milliseconds) between two provider initialization operation attempts.

ExpireAfterSeconds

432000

The expireAfterSeconds option for the TTL index on the changes collection. Change tracking data is removed after this time. The default of 432000 seconds equals 5 days.
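The retry settings above describe a fixed-delay retry loop. The following is a minimal sketch of that pattern, using the default values from the table. It assumes that the retry limit counts attempts made after the first failure, which this topic does not confirm. The function read_with_retries and its sleep parameter are hypothetical and not part of the MongoDB provider.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def read_with_retries(
    operation: Callable[[], T],
    max_retries: int = 6,       # MaximumNumberOfRetriesForReadOperations
    delay_ms: int = 5000,       # DelayBetweenRetriesForReadOperationsInMilliseconds
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run operation, retrying up to max_retries times with a fixed delay."""
    retries = 0
    while True:
        try:
            return operation()
        except Exception:
            if retries >= max_retries:
                raise  # retries exhausted: surface the last error
            retries += 1
            sleep(delay_ms / 1000)
```

With the defaults, a read that never succeeds is attempted 7 times and spends 30 seconds waiting between attempts before the error is raised.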

The following settings are specific to the Azure Search provider.

File path: C:\<Path to xConnect>\App_data\jobs\continuous\IndexWorker\App_data\Config\Sitecore\SearchIndexer\sc.Xdb.Collection.IndexWriter.AzureSearch.xml

Setting

Description

MaximumUpdateBatchSize

Controls the number of contacts or interactions that are sent in a single post. Cannot be larger than 1000. If large contacts or interactions lead to rejected posts, decrease this value.

ParallelizationDegree

If you use multiple Azure Search partitions, we recommend that you increase this value. Increasing this value causes higher memory consumption.

We recommend that you monitor memory usage of a busy indexer when changing this value. To monitor the indexer, use the performance counters IndexWriteAvgTime and IndexWriteAvgBatchSize.

Note

If you remove this setting from the configuration, the provider falls back to the number of cores available on the host machine.

MaximumRetryDelayMilliseconds

Controls the maximum amount of time the provider waits when Azure Search fails temporarily due to load.

If Azure Search consistently returns a 503 error, try increasing this value. A longer delay might give the service time to recover before the next attempt.

We recommend that you analyze your Azure Search traffic early and frequently.

RetryCount

Controls how many times the provider retries before reporting an error to the indexer. When the provider exceeds this value, it logs an error. Modify together with MaximumRetryDelayMilliseconds.

MaximumWaitTimeoutMilliseconds

Controls the maximum time the indexer waits for changes to be indexed. If the limit is hit, the indexer retries indexing, and adds the following entry to the logs: "Waiting documents in index is failed by timeout." Recurring timeout issues indicate that Azure Search is under heavy load. If this happens, consider using Azure Search partitions or increasing the value of the setting.

DataReplicationTimeoutMilliseconds

If you are using Azure Search replicas, this setting controls the amount of time to wait after initially detecting changes. When this wait time has passed, the indexer signals to the xConnect Search service that changes are available to be searched. The delay allows time for data to propagate across replicas. There is currently no way to determine if data is available in all replicas, and it is therefore difficult to tune this setting.

If you are not using replicas, you can set this to 0.
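The sketch below shows how MaximumUpdateBatchSize, RetryCount, and MaximumRetryDelayMilliseconds might interact. The batching follows the 1000-document limit stated above. The retry schedule uses exponential backoff capped at the maximum delay, which is an assumption: this topic does not describe the provider's actual backoff strategy. All function names and the initial_delay_ms parameter are hypothetical.

```python
from typing import Iterator, List, Sequence, TypeVar

T = TypeVar("T")

AZURE_SEARCH_MAX_BATCH = 1000  # upper limit for MaximumUpdateBatchSize


def update_batches(documents: Sequence[T], maximum_update_batch_size: int) -> Iterator[List[T]]:
    """Split documents into posts of at most maximum_update_batch_size items."""
    if not 1 <= maximum_update_batch_size <= AZURE_SEARCH_MAX_BATCH:
        raise ValueError("MaximumUpdateBatchSize must be between 1 and 1000")
    for start in range(0, len(documents), maximum_update_batch_size):
        yield list(documents[start:start + maximum_update_batch_size])


def retry_delays_ms(retry_count: int, maximum_retry_delay_ms: int, initial_delay_ms: int = 500) -> List[int]:
    """Assumed schedule: double the delay on each retry, capped at the maximum."""
    return [
        min(initial_delay_ms * 2 ** attempt, maximum_retry_delay_ms)
        for attempt in range(retry_count)
    ]


print([len(batch) for batch in update_batches(list(range(2500)), 1000)])
print(retry_delays_ms(retry_count=5, maximum_retry_delay_ms=5000))
```

Under this assumed schedule, raising MaximumRetryDelayMilliseconds only changes the later retries, which is one reason to adjust it together with RetryCount.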

The following settings are specific to the Solr search provider.

File path: C:\<Path to xConnect>\App_data\jobs\continuous\IndexWorker\App_data\Config\Sitecore\SearchIndexer\sc.Xdb.Collection.IndexWriter.SOLR.xml

Setting

Description

MaximumUpdateBatchSize

Controls the number of contacts or interactions in a single post.

MaximumDeleteBatchSize

Controls the number of contact or interaction deletes in a single post. Although deletes usually concern IDs, Solr's OR clause limit might also affect deleting of child documents.

MaximumCommitMilliseconds

Controls how fast data is soft committed in Solr with commitWithin.

ParallelizationDegree

If you are using multiple Solr replicas, consider increasing this value. Increasing this value causes higher memory consumption.

We recommend that you monitor memory usage of a busy indexer when changing this value. To monitor the indexer, use the performance counters IndexWriteAvgTime and IndexWriteAvgBatchSize.

Note

If you remove this setting from the configuration, the provider falls back to the number of cores available on the host machine.

MaximumRetryDelayMilliseconds

Controls the maximum amount of time the provider waits when Solr fails temporarily due to load.

If Solr consistently returns a 503 error, consider increasing this value. A longer delay might give Solr time to recover before the next attempt.

RetryCount

Controls how many times the provider retries before reporting an error to the indexer. When the provider hits the retry count, the indexer logs an error. Modify together with MaximumRetryDelayMilliseconds.

Encoding

Controls the encoding for Solr posts. Only utf-8 has been tested.
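For orientation, commitWithin is a standard parameter of the Solr update handler, and the sketch below shows one way a client could pass it together with a utf-8 content type. How the xConnect provider builds its requests is not described in this topic, so treat this purely as an illustration. The function names, the base URL, and the core name are hypothetical.

```python
from typing import Dict
from urllib.parse import urlencode


def solr_update_url(base_url: str, core: str, maximum_commit_milliseconds: int) -> str:
    """Build an update URL that asks Solr to commit within the given time."""
    query = urlencode({"commitWithin": maximum_commit_milliseconds})
    return f"{base_url.rstrip('/')}/{core}/update?{query}"


def solr_post_headers(encoding: str = "utf-8") -> Dict[str, str]:
    """Headers for a JSON post using the configured Encoding."""
    return {"Content-Type": f"application/json; charset={encoding}"}


# Placeholder host and core name, chosen only for the example.
print(solr_update_url("https://localhost:8983/solr", "xdb", 1000))
print(solr_post_headers())
```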