Sitecore Content Tagging architecture
This topic describes the architecture of the Sitecore Content Tagging feature.
Overview
The Sitecore Content Tagging feature in Sitecore consists of the following:
- Providers (IContentProvider, IDiscoveryProvider, ITaxonomyProvider, and ITagger) – contain business logic that performs content tagging operations.
- Configuration services (IItemContentTaggingProviderSetBuilder, IItemContentTaggingConfigurationService) – enable you to build a combination of providers that provide content tagging operations, based on the configuration.
- Pipelines (getTaggingConfiguration, tagContent, normalizeContent) – give you extension points to inject custom logic into the content tagging process.
Providers
The process of content tagging consists of four steps. For each step, there is an abstraction:
- IContentProvider – takes as input objects of type T (for example, a Sitecore item) and returns TaggableContent objects. You can implement a custom version of the IContentProvider.
- IDiscoveryProvider – takes as input TaggableContent objects and returns TagData objects. You can implement a custom version of the IDiscoveryProvider.
- ITaxonomyProvider – takes as input TagData objects and returns Tags objects. Can also return the parent and/or children of a tag if you have implemented structured taxonomy in the provider. You can implement a custom version of the ITaxonomyProvider.
- ITagger – takes as input an object of generic type T (for example, a Sitecore item) and a collection of Tags objects and assigns tags to the type T object. You can implement a custom version of the ITagger.
The following diagram shows the dependencies between all provider types:
Configuration services
You can configure each part of the content tagging process. When a user triggers the tagging process, the getTaggingConfiguration pipeline reads the Sitecore configuration and builds a named set of providers based on the configuration.
- The IItemContentTaggingConfigurationService service reads the names of providers that are specified in the content tagging configuration and returns the ItemContentTaggingConfiguration object.
- The IItemContentTaggingProviderSetBuilder service uses the ItemContentTaggingConfiguration object to build a set of providers that will be used for content tagging.
Pipelines
The getTaggingConfiguration pipeline reads the configuration name and then builds a provider set for this configuration.
The tagContent pipeline uses a set of providers created by the getTaggingConfiguration pipeline to provide content tagging. The tagContent pipeline consists of the following pipeline processors:
- RetrieveContent – uses the configured content provider to get taggable content from the context item.
- Normalize – takes TaggableContent objects and normalizes their content before passing them to the GetTags pipeline processor.
- GetTags – gets TagData objects for TaggableContent objects, using the configured discovery provider. The output is the list of TagData objects related to the input content.
- StoreTags – stores the received tags, using the configured taxonomy provider. The default implementation creates tag items in the Sitecore tag repository.
- ApplyTags – marks the context item with tags, using the configured tagger provider. It adds the IDs of the tag items created by the StoreTags pipeline processor to the context item’s Semantics field, in the Tagging section of the item.
The normalizeContent pipeline is a separate pipeline to prepare TaggableContent objects for tagging. It is triggered by the Normalize pipeline processor in the tagContent pipeline.
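Because the pipelines are the extension points, custom logic can be added by patching a processor into one of them. The following patch file is a minimal sketch that appends a processor to the normalizeContent pipeline. The processor type MyCompany.Tagging.StripMarkup and its assembly are hypothetical. The nesting of the pipeline inside a pipeline group named ContentTagging is also an assumption, so check the actual element path in your merged configuration before relying on it.

```xml
<configuration xmlns:patch="http://www.sitecore.net/xmlconfig/">
  <sitecore>
    <pipelines>
      <!-- Assumption: the content tagging pipelines are registered in a pipeline group
           with these attributes; the path must mirror the shipped configuration exactly -->
      <group groupName="ContentTagging" name="ContentTagging">
        <pipelines>
          <normalizeContent>
            <!-- Hypothetical custom processor. With no patch:before or patch:after
                 attribute, it is appended to the end of the pipeline -->
            <processor type="MyCompany.Tagging.StripMarkup, MyCompany.Tagging" />
          </normalizeContent>
        </pipelines>
      </group>
    </pipelines>
  </sitecore>
</configuration>
```

If the element path in the patch file does not match the shipped configuration, the patch creates a new, unused node instead of extending the existing pipeline.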
DLLs
The code for Sitecore Content Tagging is broken down into three DLLs.
The Sitecore.ContentTagging.Core DLL contains abstractions, default implementations, and infrastructural code. You can reference this DLL to run parts of Sitecore Content Tagging. For example, in order to get tags for some text without storing the tags, you can use the IDiscoveryProvider CreateDiscoveryProvider(string providerName) method to instantiate a discovery provider that is registered by name in the config file. You can use the IContentTaggingProviderFactory interface to get an instance of any of the four types of provider by name.
The Sitecore.ContentTagging DLL integrates Sitecore Content Tagging with Sitecore. This DLL contains the infrastructure to run content tagging from the Sitecore UI. It contains extension points (pipelines).
The Sitecore.ContentTagging.OpenCalais DLL implements the discovery provider for Refinitiv Intelligent Tagging Open Calais. This allows Sitecore to use Open Calais for content tagging.
Configuration
The configuration file contains the <contentTagging> section. This contains the following:
- <providers> contains all registered providers grouped into the following sections:
  - <content> aggregates IContentProvider implementations
  - <discovery> aggregates IDiscoveryProvider implementations
  - <tagger> aggregates ITagger implementations
  - <taxonomy> aggregates ITaxonomyProvider implementations
- <configurations> defines different configuration sets using providers defined in the <providers> section.
<contentTagging>
  <providers>
    <content>
      <add name="DefaultContentProvider" type="Sitecore.ContentTagging.Core.Providers.DefaultContentProvider, Sitecore.ContentTagging.Core" />
    </content>
    <discovery>
      <add name="DefaultDiscoveryProvider" type="Sitecore.ContentTagging.Core.Providers.DummyDiscoveryProvider, Sitecore.ContentTagging.Core" />
    </discovery>
    <tagger>
      <add name="DefaultTagger" type="Sitecore.ContentTagging.Core.Providers.DefaultTagger, Sitecore.ContentTagging.Core" />
    </tagger>
    <taxonomy>
      <add name="DefaultTaxonomyProvider" type="Sitecore.ContentTagging.Core.Providers.DefaultTaxonomyProvider, Sitecore.ContentTagging.Core" />
    </taxonomy>
  </providers>
  <configurations>
    <config name="Default">
      <content>
        <provider name="DefaultContentProvider"/>
      </content>
      <tagger>
        <provider name="DefaultTagger"/>
      </tagger>
      <taxonomy>
        <provider name="DefaultTaxonomyProvider"/>
      </taxonomy>
      <discovery>
        <provider name="DefaultDiscoveryProvider"/>
      </discovery>
    </config>
  </configurations>
</contentTagging>
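As a sketch of how a custom provider might be wired in, the following fragment registers an additional discovery provider and defines a second configuration set that uses it, while reusing the default content, tagger, and taxonomy providers. The names MyDiscoveryProvider, MyCompany.Tagging, and Custom are hypothetical placeholders, not types or configuration sets that ship with Sitecore.

```xml
<contentTagging>
  <providers>
    <discovery>
      <!-- Hypothetical custom IDiscoveryProvider implementation;
           replace the type and assembly names with your own -->
      <add name="MyDiscoveryProvider" type="MyCompany.Tagging.MyDiscoveryProvider, MyCompany.Tagging" />
    </discovery>
  </providers>
  <configurations>
    <!-- Hypothetical configuration set that swaps in the custom discovery provider -->
    <config name="Custom">
      <content>
        <provider name="DefaultContentProvider"/>
      </content>
      <tagger>
        <provider name="DefaultTagger"/>
      </tagger>
      <taxonomy>
        <provider name="DefaultTaxonomyProvider"/>
      </taxonomy>
      <discovery>
        <provider name="MyDiscoveryProvider"/>
      </discovery>
    </config>
  </configurations>
</contentTagging>
```

Each provider name referenced inside a config element must match the name of a provider registered in the corresponding <providers> section. In a real solution, a fragment like this would typically be applied through a separate patch file rather than by editing the shipped configuration.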
Video: Sitecore Content Tagging - Architecture
You can watch this video to see the customization and extension points included in the Sitecore Content Tagging feature. The video demonstrates how to configure new providers and configuration sets.