Index update strategies
You use index update strategies to maintain indexes. You can configure each index with a unique set of index update strategies. For performance reasons, we recommend that you do not specify more than three update strategies per index. You can also update indexes manually, for example from the Indexing Manager or from custom code.
Sitecore provides a varied set of index update strategies, and you can extend this set with more strategies. All the strategies that are delivered with Sitecore are defined under the following node in the Sitecore.ContentSearch configuration files:
sitecore/contentSearch/indexConfigurations/indexUpdateStrategies
<manual type="Sitecore.ContentSearch.Maintenance.Strategies.ManualStrategy,Sitecore.ContentSearch" />
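Sitecore comes with the following strategies:
- RebuildAfterFullPublish
- OnPublishEndAsync
- OnPublishEndAsyncSingleInstance
- IntervalAsynchronous
- Synchronous
- RemoteRebuild
- Manual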
Some of these strategies use the CrawlingLog file. To enable messages in the CrawlingLog file, you must use a patch file to enable the DEBUG level in the Sitecore.Diagnostics.Crawling logger. For example:
<logger name="Sitecore.Diagnostics.Crawling" additivity="false">
<level value="DEBUG"/>
<appender-ref ref="CrawlingLogFileAppender"/>
</logger>
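As a minimal sketch of such a patch file, assuming that the log4net section is located under the sitecore node in your Sitecore version and that the logger is already defined with a different level, the level could be overridden as follows. The file name and its location in the include folder are assumptions that you should adapt to your solution:

```xml
<configuration xmlns:patch="http://www.sitecore.net/xmlconfig/">
  <sitecore>
    <log4net>
      <!-- Matches the existing logger by its name attribute -->
      <logger name="Sitecore.Diagnostics.Crawling">
        <level>
          <!-- Overrides only the level value; the appender reference stays untouched -->
          <patch:attribute name="value">DEBUG</patch:attribute>
        </level>
      </logger>
    </log4net>
  </sitecore>
</configuration>
```

Patching only the value attribute keeps the rest of the logger definition as delivered, which makes the change easy to revert.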
RebuildAfterFullPublish strategy
This strategy is defined in the following way in the configuration file:
<rebuildAfterFullPublish type="Sitecore.ContentSearch.Maintenance.Strategies.RebuildAfterFullPublishStrategy,Sitecore.ContentSearch" />
During initialization, this strategy subscribes to the OnFullPublishEnd event. When the event is raised, the strategy triggers a full index rebuild.
In a distributed environment, the index rebuild is triggered on all remote servers where this strategy is configured. In this case, you must enable the event queue.
In environments where a full publish is required to run regularly, we recommend that you do not trigger incremental index rebuilds because this uses a lot of resources. Instead, this strategy triggers a full index rebuild when a full publish process has completed.
When you attach this strategy to an index, you see the following message in the CrawlingLog file when it is initialized:
Initializing RebuildAfterFullPublishStrategy for index '<index_name>'
When this strategy is triggered, you see the following message in the CrawlingLog file:
RebuildAfterFullPublishStrategy triggered on index '<index_name>'
Attaching the RebuildAfterFullPublish strategy to an index
Attach this strategy to an index in the following way:
<index id="sitecore_index" type="Sitecore.ContentSearch.SolrProvider.SolrSearchIndex, Sitecore.ContentSearch.SolrProvider">
<param desc="name">$(id)</param>
<param desc="folder">$(id)</param>
<strategies hint="list:AddStrategy">
<strategy ref="contentSearch/indexConfigurations/indexUpdateStrategies/rebuildAfterFullPublish" />
</strategies>
<Analyzer ref="search/analyzer" />
</index>
Best practice
You must not combine this strategy with the Synchronous Strategy, but you can combine it with any of the other strategies.
Because this strategy causes a full index rebuild, you must combine it with SwitchOnRebuildSolrSearchIndex.
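A minimal sketch of such a combination follows. It assumes that the SwitchOnRebuildSolrSearchIndex type is in the Sitecore.ContentSearch.SolrProvider assembly and accepts a rebuildcore parameter for the secondary core. The index ID and the _rebuild core name are illustrative assumptions, and the rest of the index definition is omitted:

```xml
<index id="sitecore_web_index" type="Sitecore.ContentSearch.SolrProvider.SwitchOnRebuildSolrSearchIndex, Sitecore.ContentSearch.SolrProvider">
  <param desc="name">$(id)</param>
  <param desc="core">$(id)</param>
  <!-- Assumed name of the secondary core that is rebuilt and then swapped in -->
  <param desc="rebuildcore">$(id)_rebuild</param>
  <param desc="propertyStore" ref="contentSearch/indexConfigurations/databasePropertyStore" param1="$(id)" />
  <strategies hint="list:AddStrategy">
    <strategy ref="contentSearch/indexConfigurations/indexUpdateStrategies/rebuildAfterFullPublish" />
  </strategies>
  <!-- Remaining index configuration omitted -->
</index>
```

With a setup along these lines, searches continue to be served from the primary core while the full rebuild runs against the secondary one.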
OnPublishEndAsync strategy
This strategy is defined in the following way in the configuration file:
<onPublishEndAsync type="Sitecore.ContentSearch.Maintenance.Strategies.OnPublishEndAsynchronousStrategy, Sitecore.ContentSearch">
<param desc="database">web</param>
<CheckForThreshold>true</CheckForThreshold>
</onPublishEndAsync>
During initialization, this strategy subscribes to the OnPublishEnd event. When the event is raised, the strategy triggers an incremental index rebuild.
If you have separate CM and CD servers, this event is triggered via the EventQueue object. This means that you must enable the EventQueue object for this strategy to work in this kind of environment.
There is an additional database parameter that is passed to the constructor of the OnPublishEndAsynchronousStrategy class. This parameter defines the database from which the item changes are looked up.
When you attach this strategy to an index and it is initialized, you see the following message in the CrawlingLog file:
Initializing OnPublishEndAsynchronousStrategy for index '<index_name>'.
When this strategy is triggered, you see the following message in the CrawlingLog file:
"<index_name> OnPublishEndAsynchronousStrategy executing."
Processing
This strategy uses the EventQueue object from the database it was initialized with:
<param desc="database">web</param>
This means that this strategy depends on a number of things:
- This database must be specified in the <databases /> section of the configuration file.
- The EnableEventQueues setting must be true.
- The EventQueue table within the preconfigured database must have entries that are dated later than the last update timestamp of the index.
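The EnableEventQueues precondition could be met with a patch file along the following lines. This is a sketch that assumes the setting is already defined in the delivered configuration and only needs its value changed:

```xml
<configuration xmlns:patch="http://www.sitecore.net/xmlconfig/">
  <sitecore>
    <settings>
      <!-- Matches the existing setting by name and overrides only its value -->
      <setting name="EnableEventQueues">
        <patch:attribute name="value">true</patch:attribute>
      </setting>
    </settings>
  </sitecore>
</configuration>
```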
If the number of unprocessed events related to item changes exceeds a threshold, then a full index rebuild is triggered instead of an incremental update.
Events related to item changes are:
- RemovedVersionRemoteEvent
- SavedItemRemoteEvent
- DeletedItemRemoteEvent
- MovedItemRemoteEvent
- AddedVersionRemoteEvent
- CopiedItemRemoteEvent
- RestoreItemCompletedEvent
Unprocessed events are item change events that have a stamp value higher than the value of the LAST_UPDATED_TIMESTAMP property. This property is stored in the system properties table that is determined by the defaultStore attribute of the PropertyStoreProvider setting. The property is unique per search index and instance, for example: CORE_SITECORE_MASTER_INDEX_MyMachineName-MySite.local_LAST_UPDATED_TIMESTAMP.
The threshold value is set by the ContentSearch.FullRebuildItemCountThreshold setting and is shared by all index update strategies. The setting is hidden: it is not available in the configuration, but you can add it manually. The default value of the setting is 100,000.
The optimal value for the threshold depends on:
- The total number of documents in a search index. For example, if a search index contains 50,000 documents then the threshold value can be set to 25,000.
- The ratio between add and remove operations (an update is equivalent to remove and then add). You can lower the threshold if remove operations are more frequent than add or update operations.
If there are many operations, consider whether it is faster to build the index from scratch (using add operations) than to process all the delete, add, and update operations separately.
The check for the threshold value can be disabled for each strategy: <CheckForThreshold>false</CheckForThreshold>. If you set this setting to true, we recommend that you also use the SwitchOnRebuildSolrSearchIndex implementation for any index that uses this strategy.
The ContentSearch.FullRebuildItemCountThreshold setting defaults to 100,000.
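Because the threshold setting is hidden, you have to add it yourself. The following sketch shows one way to do that in a patch file. The value 25000 is taken from the 50,000-document example above and is only illustrative:

```xml
<configuration xmlns:patch="http://www.sitecore.net/xmlconfig/">
  <sitecore>
    <settings>
      <!-- Hidden setting added manually; shared by all index update strategies -->
      <setting name="ContentSearch.FullRebuildItemCountThreshold" value="25000" />
    </settings>
  </sitecore>
</configuration>
```

Because the setting applies to every strategy, pick a value that suits your largest frequently updated index, or disable the check per strategy with CheckForThreshold.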
Attaching the OnPublishEndAsync strategy to an index
Attach this strategy to an index in the following way:
<index id="sitecore_index" type="Sitecore.ContentSearch.SolrProvider.SolrSearchIndex,
Sitecore.ContentSearch.SolrProvider">
<param desc="name">$(id)</param>
<param desc="core">$(id)</param>
<param desc="propertyStore"
ref="contentSearch/indexConfigurations/databasePropertyStore"
param1="$(id)" />
<strategies hint="list:AddStrategy">
<strategy ref="contentSearch/indexConfigurations/indexUpdateStrategies/onPublishEndAsync" />
</strategies>
</index>
Best practice
Do not combine this strategy with any of these strategies:
- Synchronous
- IntervalAsynchronous
- OnPublishEndAsyncSingleInstance
You can combine it with these strategies:
- RebuildAfterFullPublish
- RemoteRebuild
You can use this strategy for multiserver/multi-instance environments, where you have already enabled the EventQueue.
OnPublishEndAsyncSingleInstance strategy
This strategy is defined in the following way in the configuration file:
<onPublishEndAsyncSingleInstance type="Sitecore.ContentSearch.Maintenance.Strategies.OnPublishEndAsynchronousSingleInstanceStrategy, Sitecore.ContentSearch" singleInstance="true">
<param desc="database">web</param>
<CheckForThreshold>true</CheckForThreshold>
</onPublishEndAsyncSingleInstance>
Processing
Like the OnPublishEndAsync strategy, this strategy is triggered by the OnPublishEnd event. It launches an incremental update operation for item modifications as determined by the event queue.
The key difference between these strategies is that when the OnPublishEndAsyncSingleInstance strategy is triggered, it retrieves the event records only once and reuses them for all indexes it is attached to, while the OnPublishEndAsync strategy retrieves the records individually for each index.
This different behavior reduces the load on the database and decreases the resource consumption by the Sitecore instance.
Attaching the OnPublishEndAsyncSingleInstance strategy to an index
You attach this strategy to an index in the following way:
<index id="sitecore_web_index" type="Sitecore.ContentSearch.SolrProvider.SolrSearchIndex, Sitecore.ContentSearch.SolrProvider">
...
<strategies hint="list:AddStrategy">
<strategy ref="contentSearch/indexConfigurations/indexUpdateStrategies/onPublishEndAsyncSingleInstance" />
</strategies>
...
</index>
Best practice
We recommend that you use the OnPublishEndAsyncSingleInstance strategy if you have multiple indexes that cover the same database with published content.
Do not combine this strategy with any of these strategies:
- Synchronous
- IntervalAsynchronous
- OnPublishEndAsync
You can combine it with these strategies:
- RebuildAfterFullPublish
- RemoteRebuild
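As a sketch of the recommendation above, two indexes that cover the web database could reference the same strategy definition, so that the event records are retrieved once and reused for both. The custom_web_index ID is a hypothetical name used for illustration, and the rest of each index definition is omitted:

```xml
<index id="sitecore_web_index" type="Sitecore.ContentSearch.SolrProvider.SolrSearchIndex, Sitecore.ContentSearch.SolrProvider">
  <!-- Parameters and remaining index configuration omitted -->
  <strategies hint="list:AddStrategy">
    <strategy ref="contentSearch/indexConfigurations/indexUpdateStrategies/onPublishEndAsyncSingleInstance" />
  </strategies>
</index>
<!-- Hypothetical second index that covers the same database -->
<index id="custom_web_index" type="Sitecore.ContentSearch.SolrProvider.SolrSearchIndex, Sitecore.ContentSearch.SolrProvider">
  <!-- Parameters and remaining index configuration omitted -->
  <strategies hint="list:AddStrategy">
    <strategy ref="contentSearch/indexConfigurations/indexUpdateStrategies/onPublishEndAsyncSingleInstance" />
  </strategies>
</index>
```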
IntervalAsynchronous strategy
This strategy is defined in the following way in the configuration file:
<intervalAsyncMaster type="Sitecore.ContentSearch.Maintenance.Strategies.IntervalAsynchronousStrategy, Sitecore.ContentSearch">
<param desc="database">master</param>
<param desc="interval">00:00:10</param>
<CheckForThreshold>true</CheckForThreshold>
</intervalAsyncMaster>
- You specify the database from which item changes are looked up with the database parameter.
- You specify the frequency of the strategy trigger with the interval parameter.
When you attach this strategy to an index and it is initialized, you can see the following message in the CrawlingLog file:
Initializing IntervalAsynchronousUpdateStrategy for index '<index_name>'.
When this strategy is triggered, you can see the following message in the CrawlingLog file:
IntervalAsynchronousUpdateStrategy triggered on index '<index_name>'
Processing
This strategy is triggered by a time interval and not the OnPublishEnd event. It uses the EventQueue table of the source database. The source database is specified by the database parameter of the strategy. For example:
<param desc="database">web</param>
The preconditions for using this strategy are:
- The EnableEventQueues setting must be true.
- The referenced database must be defined in the <databases> configuration section.
- The referenced database must match at least one database that is defined in a search index to be crawled.
The strategy uses an internal timer that is initialized with a predefined interval value. The strategy is triggered when the timer fires. In this example, the timer is set to fire every 10 seconds:
<intervalAsync type="Sitecore.ContentSearch.Maintenance.Strategies.IntervalAsynchronousStrategy, Sitecore.ContentSearch">
<param desc="database">web</param>
<param desc="interval">00:00:10</param>
<CheckForThreshold>true</CheckForThreshold>
</intervalAsync>
The threshold value is set by the ContentSearch.FullRebuildItemCountThreshold setting and is shared by all index update strategies. The setting is hidden: it is not available in the configuration, but you can add it manually. The default value of the setting is 100,000.
The optimal value for the threshold depends on:
- The total number of documents in a search index. For example, if a search index contains 50,000 documents then the threshold value can be set to 25,000.
- The ratio between add and remove operations (an update is equivalent to remove and then add). You can lower the threshold if remove operations are more frequent than add or update operations.
If there are many operations, consider whether it is faster to build the index from scratch (using add operations) than to process all the delete, add, and update operations separately.
The check for the threshold value can be disabled for each strategy: <CheckForThreshold>false</CheckForThreshold>. If you set this setting to true, we recommend that you also use the SwitchOnRebuildSolrSearchIndex implementation for any index that uses this strategy.
The ContentSearch.FullRebuildItemCountThreshold setting is not enabled in the configuration files that Sitecore delivers. It defaults to 100,000.
Attaching the IntervalAsynchronous strategy to an index
Attach this strategy to an index in the following way:
<index id="sitecore_index" type="Sitecore.ContentSearch.SolrProvider.SolrSearchIndex,
Sitecore.ContentSearch.SolrProvider">
<param desc="name">$(id)</param>
<param desc="core">$(id)</param>
<param desc="propertyStore"
ref="contentSearch/indexConfigurations/databasePropertyStore"
param1="$(id)" />
<strategies hint="list:AddStrategy">
<strategy ref="contentSearch/indexConfigurations/indexUpdateStrategies/intervalAsync" />
</strategies>
</index>
Best practice
Do not combine this strategy with these strategies:
- Synchronous
- OnPublishEndAsync
- OnPublishEndAsyncSingleInstance
You can combine it with these strategies:
- RebuildAfterFullPublish
- RemoteRebuild
We recommend that you use this strategy for the master database indexes and for single-server environments where you want to use as few resources as possible.
This strategy is also useful for less critical indexes that you do not need to update frequently. You can adjust the interval to fit your needs.
This strategy is created for the core and master databases in the setup that Sitecore delivers:
<intervalAsyncCore type="Sitecore.ContentSearch.Maintenance.Strategies.IntervalAsynchronousStrategy, Sitecore.ContentSearch">
<param desc="database">core</param>
<param desc="interval">00:01:00</param>
<CheckForThreshold>true</CheckForThreshold>
</intervalAsyncCore>
<intervalAsyncMaster type="Sitecore.ContentSearch.Maintenance.Strategies.IntervalAsynchronousStrategy, Sitecore.ContentSearch">
<param desc="database">master</param>
<param desc="interval">00:00:10</param>
<CheckForThreshold>true</CheckForThreshold>
</intervalAsyncMaster>
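For a less critical index, you could define an additional strategy with a longer interval next to the delivered ones. The following is a sketch only: the intervalAsyncMasterSlow element name and the 10-minute interval are hypothetical choices, and the definition is assumed to be placed under the indexUpdateStrategies node so that an index can reference it:

```xml
<!-- Hypothetical strategy definition; the element name is an assumption -->
<intervalAsyncMasterSlow type="Sitecore.ContentSearch.Maintenance.Strategies.IntervalAsynchronousStrategy, Sitecore.ContentSearch">
  <param desc="database">master</param>
  <!-- The timer fires every 10 minutes instead of every 10 seconds -->
  <param desc="interval">00:10:00</param>
  <CheckForThreshold>true</CheckForThreshold>
</intervalAsyncMasterSlow>
```

An index would then reference it with contentSearch/indexConfigurations/indexUpdateStrategies/intervalAsyncMasterSlow in its strategies list.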
Synchronous strategy
This strategy is the index update strategy closest to real-time. It is also the most expensive strategy in terms of CPU and I/O.
Before you use this strategy, you must be familiar with the best practices.
You specify this strategy in the following way:
<sync type="Sitecore.ContentSearch.Maintenance.Strategies.SynchronousStrategy, Sitecore.ContentSearch" />
When you attach this strategy to an index and it is initialized, you see the following message in the CrawlingLog file:
Initializing SynchronousStrategy for index '<index_name>'.
When this strategy is triggered, you see this message in the CrawlingLog file:
SynchronousStrategy triggered on index '<index_name>'
Processing
This strategy subscribes to low-level DataEngine events, such as ItemSaved and ItemSavedRemote. When you use it on a single-server instance, it guarantees an index update immediately after an item update.
In a multiserver environment, the strategy uses the EventQueue that broadcasts remote ItemSavedRemote events. When an item is published and the ItemSavedRemote event is raised, the strategy is triggered.
Attaching the Synchronous strategy to an index
Attach this strategy to an index in the following way:
<index id="sitecore_index" type="Sitecore.ContentSearch.SolrProvider.SolrSearchIndex, Sitecore.ContentSearch.SolrProvider">
<param desc="name">$(id)</param>
<param desc="core">$(id)</param>
<param desc="propertyStore"
ref="contentSearch/indexConfigurations/databasePropertyStore"
param1="$(id)" />
<strategies hint="list:AddStrategy">
<strategy ref="contentSearch/indexConfigurations/indexUpdateStrategies/sync" />
</strategies>
</index>
Best practice
Use this strategy if you need immediate index updates and you have a dedicated indexing server infrastructure that has plenty of processing resources. Only use the Synchronous strategy on CM servers for the indexes that process the master database and where the timing of the index update is critical.
If you use this strategy on a CM server where many entries are added and changed, it can degrade system performance severely. In most cases, the IntervalAsynchronous strategy configured for the master database is sufficient.
Any changes that occur in the BulkUpdateContext are not processed by this strategy, and a full index rebuild is required to bring the search index back in sync. If you use BulkUpdateContext on a regular basis, we recommend that you use asynchronous strategies.
You can only combine this strategy with the following strategy:
- RemoteRebuild
The strategy has these prerequisites:
- This strategy does not require the EventQueue to be enabled when the strategy is used on the same instance that the item changes occur on. For example, if your solution only has a single CM instance, the Synchronous strategy can be used to process changes in the master database. However, if you have multiple CM instances, the EventQueue must be enabled to share events across the different instances.
RemoteRebuild strategy
This strategy subscribes to the OnIndexingEndedRemote event. This event is triggered when a particular index is rebuilt. The strategy is only activated when a full index rebuild takes place.
You use this mechanism to rebuild remote indexes when you force an index rebuild. You specify this strategy like this:
<remoteRebuild type="Sitecore.ContentSearch.Maintenance.Strategies.RemoteRebuildStrategy, Sitecore.ContentSearch" />
Attaching the RemoteRebuild strategy to an index
Attach this strategy to an index in the following way:
<index id="sitecore_index" type="Sitecore.ContentSearch.SolrProvider.SolrSearchIndex,
Sitecore.ContentSearch.SolrProvider">
<param desc="name">$(id)</param>
<param desc="core">$(id)</param>
<param desc="propertyStore"
ref="contentSearch/indexConfigurations/databasePropertyStore"
param1="$(id)" />
<strategies hint="list:AddStrategy">
<strategy ref="contentSearch/indexConfigurations/indexUpdateStrategies/remoteRebuild" />
</strategies>
</index>
Best practice
You can combine this strategy with any other strategy. You use it in multiserver environments, where each Sitecore instance maintains its own copy of the index. You can then trigger a full rebuild from one CM server, and all remote servers where the index is configured with this strategy will rebuild.
The strategy has these prerequisites:
- The name of the index on the remote server must be identical to the name of the index that you forced to rebuild.
- You must enable the EventQueue.
- The database you assign for system event queue storage (core by default) must be shared between the Sitecore instance where the rebuild takes place and the other instances.
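Because this strategy can be combined with any other strategy, an index on a remote server could list it next to its regular update strategy. The following sketch assumes an index that is otherwise updated by the OnPublishEndAsync strategy. The index ID is illustrative and the rest of the index definition is omitted:

```xml
<index id="sitecore_web_index" type="Sitecore.ContentSearch.SolrProvider.SolrSearchIndex, Sitecore.ContentSearch.SolrProvider">
  <!-- Parameters and remaining index configuration omitted -->
  <strategies hint="list:AddStrategy">
    <!-- Incremental updates after each publish -->
    <strategy ref="contentSearch/indexConfigurations/indexUpdateStrategies/onPublishEndAsync" />
    <!-- Full rebuild when the same index is rebuilt on another instance -->
    <strategy ref="contentSearch/indexConfigurations/indexUpdateStrategies/remoteRebuild" />
  </strategies>
</index>
```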
Manual strategy
This strategy disables any automatic index updates. When you use this strategy for an index, you must rebuild this index manually.
You specify this strategy like this:
<manual type="Sitecore.ContentSearch.Maintenance.Strategies.ManualStrategy,
Sitecore.ContentSearch" />
When you attach this strategy to an index and it is initialized, you see the following message in the CrawlingLog file:
Initializing ManualStrategy for index '<index_name>'.
Index will have to be rebuilt manually
Attaching the Manual strategy to an index
Attach this strategy to an index in the following way:
<index id="sitecore_index" type="Sitecore.ContentSearch.SolrProvider.SolrSearchIndex,
Sitecore.ContentSearch.SolrProvider">
<param desc="name">$(id)</param>
<param desc="core">$(id)</param>
<param desc="propertyStore"
ref="contentSearch/indexConfigurations/databasePropertyStore"
param1="$(id)" />
<strategies hint="list:AddStrategy">
<strategy ref="contentSearch/indexConfigurations/indexUpdateStrategies/manual" />
</strategies>
</index>
Best practice
Do not combine this strategy with any other strategy. It is reserved for special situations where you have to outsource the whole indexing process to a dedicated server and you do not want any index updates on other Sitecore instances.