Walkthrough: Configure Azure Cognitive Search

Current version: 10.0
Caution

Azure Cognitive Search will be discontinued in the future and Sitecore will no longer provide support for this service in future releases.

To use Azure Search with Sitecore, you must first configure your Azure Search service with your Sitecore instance. Use the procedures in this walkthrough to do so.

Format the connection string

The default connection string name for Cloud Search is cloud.search and it contains the following information:

  • serviceUrl – the HTTPS URL of the search service API (for example, https://dk-test.search.windows.net).

  • apiVersion – follows a date format, for example, 2017-11-11 (see more information on API versions).

  • apiKey – the admin key to the service that you obtain from the Azure management portal.

Note

For Sitecore version 9.1 and later, set the API version to 2017-11-11. For information about other supported versions, refer to the Azure Search compatibility table.

The connection string format is:

<add name="cloud.search" connectionString="serviceUrl=<url>;apiVersion=<apiVersion>;apiKey=<apiKey>" /> 
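As a rough illustration of the format above, the following Python sketch assembles the connectionString value from its three parts. The function name build_cloud_search_connection_string is hypothetical (it is not part of Sitecore), and the checks it applies (an HTTPS URL and a yyyy-MM-dd apiVersion) are assumptions drawn only from the descriptions in this section.

```python
from datetime import datetime
from urllib.parse import urlparse


def build_cloud_search_connection_string(service_url, api_version, api_key):
    """Return the connectionString attribute value (hypothetical helper)."""
    parsed = urlparse(service_url)
    # The section describes serviceUrl as the HTTPS URL of the search service API.
    if parsed.scheme != "https" or not parsed.netloc:
        raise ValueError(f"serviceUrl must be an HTTPS URL: {service_url!r}")
    # Assumption: apiVersion is a plain date such as 2017-11-11.
    datetime.strptime(api_version, "%Y-%m-%d")  # raises ValueError if malformed
    if not api_key:
        raise ValueError("apiKey must not be empty")
    return f"serviceUrl={service_url};apiVersion={api_version};apiKey={api_key}"


if __name__ == "__main__":
    # "<apiKey>" is a placeholder, not a real key.
    print(build_cloud_search_connection_string(
        "https://dk-test.search.windows.net", "2017-11-11", "<apiKey>"))
```

The returned string is the value that goes inside the connectionString attribute of the cloud.search entry.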
Note

Azure Cognitive Search for xConnect does not support this feature.

Geo-replicated scenarios

Sitecore supports a Search service with geo-replicated scenarios. To use this type of scenario:

  1. Create two or more Search service instances.

  2. Add connection strings with a pipe separator (|). If you have two search services, for example, searchservice1 and searchservice2, and you want to use them in a geo-replicated scenario, you must use the following connection string:

    <add name="cloud.search" 
    connectionString="serviceUrl=https://searchservice1.search.windows.net;apiVersion=2015-02-28;apiKey=AdminKey1|serviceUrl=https://searchservice2.search.windows.net;apiVersion=2015-02-28;apiKey=AdminKey2" /> 
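To make the pipe-separated layout easier to see, here is a minimal Python sketch that splits such a value into one set of settings per search service. The function name parse_cloud_search_connection_string is hypothetical, and the sketch only mirrors the format shown above; it does not reproduce how Sitecore itself reads the string.

```python
REQUIRED_KEYS = {"serviceUrl", "apiVersion", "apiKey"}


def parse_cloud_search_connection_string(value):
    """Split a (possibly geo-replicated) connection string into dictionaries."""
    services = []
    # Each pipe-separated segment describes one search service.
    for segment in value.split("|"):
        settings = {}
        for pair in segment.strip().split(";"):
            if not pair:
                continue  # tolerate a trailing semicolon
            # Split on the first "=" only, so values may contain "=".
            key, _, setting_value = pair.partition("=")
            settings[key.strip()] = setting_value.strip()
        missing = REQUIRED_KEYS - settings.keys()
        if missing:
            raise ValueError(f"missing {sorted(missing)} in segment {segment!r}")
        services.append(settings)
    return services


if __name__ == "__main__":
    example = (
        "serviceUrl=https://searchservice1.search.windows.net;"
        "apiVersion=2015-02-28;apiKey=AdminKey1|"
        "serviceUrl=https://searchservice2.search.windows.net;"
        "apiVersion=2015-02-28;apiKey=AdminKey2"
    )
    for service in parse_cloud_search_connection_string(example):
        print(service["serviceUrl"])
```

A check like this can help catch a missing key or a stray separator before the value is placed in the configuration file.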

Advanced scaling scenarios

For advanced scaling scenarios, it is best practice to use a dedicated search service for your index. To set this up:

  1. Add a new connection string (for example, cloud.search.analytics).

  2. Configure the corresponding index with this connection string name.

Specify Azure as your search provider

By default, Sitecore is distributed with the Solr search provider enabled.

  • To use the Azure Search provider, in the web.config file, specify Azure as your search provider (in a default installation, this is the value of the search:define key under appSettings).

Rebuild the indexes

You must rebuild the indexes to ensure Sitecore is fully operational.

To rebuild the indexes:

  1. Go to the Sitecore login page (http://{your_instance}/sitecore/login) and log in with your admin credentials.

  2. On the Sitecore Launchpad, click Control Panel, and then click Indexing Manager.

  3. To select and rebuild all of the indexes, on the Indexing Manager page, click Select all, and then click Rebuild.

    Note

    Rebuilding the index is a time-consuming operation and can take 15 minutes or more.

    When the indexes are rebuilt, the Sitecore search indexes appear in the Search service window of the Azure Portal.


Map the Azure field types

To see how to define the different types of mapping between .NET and Sitecore, go to the sitecore\contentsearch\indexConfigurations\defaultCloudIndexConfiguration\CloudTypeMapper node in the App_Config\Sitecore\ContentSearch.Azure\Sitecore.ContentSearch.Azure.DefaultIndexConfiguration.config file.

Note

You must define the map element for all custom fields like this:

<map type="<Field type>" cloudType="<Edm type from list of supported types>" />

Azure Search uses the following Entity Data Model (EDM) field types. To map the field types correctly, refer to the following table:

Field type

Description

Edm.String

Text that can optionally be tokenized for full-text search, for example, word-breaking, stemming, and so on.

Collection(Edm.String)

A list of strings that can be tokenized for full-text search. There is no upper limit on the number of items in a collection, but remember that the 16 MB upper limit on payload size applies to collections.

Edm.Boolean

Contains true or false values.

Edm.Int32

Contains 32-bit integer values.

Edm.Int64

Contains 64-bit integer values.

Edm.Double

Uses double-precision numeric data.

Edm.DateTimeOffset

The date and time values are represented in the OData V4 format, for example:

yyyy-MM-ddTHH:mm:ss.fffZ

yyyy-MM-ddTHH:mm:ss.fff[+|-]HH:mm.

Note

The precision of DateTime fields is limited to milliseconds. If you upload DateTime values that have sub-millisecond precision, the returned value is rounded to millisecond precision, for example:

2015-04-15T10:30:09.7552052Z is returned as:

2015-04-15T10:30:09.7550000Z.
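The millisecond limit can be illustrated with a short Python sketch that renders a timestamp in the yyyy-MM-ddTHH:mm:ss.fffZ form. The helper name to_azure_datetime is hypothetical. Python's datetime only carries microseconds, so the sample value uses .755205 rather than .7552052, and the sketch simply drops the sub-millisecond digits, which matches the example above; how the service treats other values is not confirmed here.

```python
from datetime import datetime, timedelta, timezone


def to_azure_datetime(value):
    """Format an aware datetime as yyyy-MM-ddTHH:mm:ss.fffZ (UTC)."""
    if value.tzinfo is None:
        raise ValueError("expected a timezone-aware datetime")
    utc_value = value.astimezone(timezone.utc)
    # Keep three fractional digits: microseconds // 1000 gives milliseconds.
    milliseconds = utc_value.microsecond // 1000
    return utc_value.strftime("%Y-%m-%dT%H:%M:%S.") + f"{milliseconds:03d}Z"


if __name__ == "__main__":
    sample = datetime(2015, 4, 15, 10, 30, 9, 755205, tzinfo=timezone.utc)
    print(to_azure_datetime(sample))
    # The same instant expressed with a +02:00 offset yields the same UTC string.
    shifted = sample.astimezone(timezone(timedelta(hours=2)))
    print(to_azure_datetime(shifted))
```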


Map fields

It is best practice to list all of the fields that the components use under the following index configuration section:

  • sitecore\contentSearch\configuration\indexes\index\configuration\fieldMap

The following table describes the supported attributes:

Attribute

Description

boost

Use to give one field more importance than another.

cloudAnalyzer

Use to set up an analyzer for both search and indexing operations.

Note

If you use the searchAnalyzer and indexAnalyzer attributes, you must specify them as a pair that replaces the single cloudAnalyzer attribute.

cloudFieldName

The field name as defined for Cloud. It can only contain letters, numbers, and underscores.

Note

The first character must be a letter.

fieldName

The name of the field, as defined for Solr.

format

You must configure this attribute for DateTime type fields. The supported value is yyyy-MM-ddTHH:mm:ss.fffZ.

indexAnalyzer

Use to set up an analyzer for indexing operations.

Note

If you use this attribute, you must specify the searchAnalyzer and indexAnalyzer attributes as a pair that replaces the single cloudAnalyzer attribute.

searchAnalyzer

Use this attribute to set up an analyzer for search operations.

Note

If you use this attribute, you must specify the searchAnalyzer and indexAnalyzer attributes as a pair that replaces the single cloudAnalyzer attribute.

settingType

The settingType property must be: Sitecore.ContentSearch.Azure.CloudSearchFieldConfiguration, Sitecore.ContentSearch.Azure

You must configure query support and facet support correctly. To configure support for Azure Search, use the support reference for Azure Search.
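Two of the rules in the attribute table lend themselves to a quick automated check: the cloudFieldName naming rule and the requirement that searchAnalyzer and indexAnalyzer appear as a pair in place of cloudAnalyzer. The Python sketch below is one possible way to express them. The function name check_field_attributes and the sample analyzer names are hypothetical, and "letters" is assumed to mean ASCII letters.

```python
import re

# Letters, numbers, and underscores only; the first character must be a letter.
CLOUD_FIELD_NAME = re.compile(r"[A-Za-z][A-Za-z0-9_]*")


def check_field_attributes(attributes):
    """Return a list of problems found in one field's attribute dictionary."""
    problems = []
    name = attributes.get("cloudFieldName")
    if name is not None and not CLOUD_FIELD_NAME.fullmatch(name):
        problems.append(f"invalid cloudFieldName: {name!r}")
    has_search = "searchAnalyzer" in attributes
    has_index = "indexAnalyzer" in attributes
    if has_search != has_index:
        problems.append("searchAnalyzer and indexAnalyzer must be specified as a pair")
    if (has_search or has_index) and "cloudAnalyzer" in attributes:
        problems.append("searchAnalyzer/indexAnalyzer replace cloudAnalyzer; do not combine them")
    return problems


if __name__ == "__main__":
    # "analyzer_a" is a made-up analyzer name used only for illustration.
    print(check_field_attributes({"cloudFieldName": "title_t", "cloudAnalyzer": "analyzer_a"}))
    print(check_field_attributes({"cloudFieldName": "_title", "searchAnalyzer": "analyzer_a"}))
```

An empty list means that none of the checked rules were violated; it does not confirm that the field is otherwise configured correctly.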

Configure the number of open HTTP connections

You can manage the number of connections that the Azure Search provider uses with the following setting:

<setting name="ContentSearch.Azure.ServicePoint.ConnectionLimit" value="100" />

In the following scenarios, you must adjust the setting value:

  • If there are other applications that are using HTTP intensively, you must decrease the value.

  • If the application is under high load and creating new connections takes too long or fails, the connection limit is set too low. You must increase the value.

Configure the Token Analyzer

The Token Analyzer uses the Azure Search Analyze API to divide text into tokens. If a search query contains non-alphanumeric characters such as a hyphen ("-"), the text is separated into multiple tokens. For example, if you search in the Name field with the query test-index, the text in the query is tokenized into two separate tokens: test and index (regardless of whether the field contains any language analyzers).

Query results depend on which analyzer is configured. If a field does not contain any analyzers, the ContentSearch.Azure.DefaultTokenAnalyzer is used to send requests.

Use the following settings to configure analyzers:

Setting

Description

ContentSearch.Azure.UseTokenAnalyzer

Controls whether requests are sent to the Analyze API. Because these requests can affect the performance of your environment, you can disable this setting. If the setting is disabled, or the Analyze API returns an error, the text is split into tokens at the following characters:

" " (space) - # @ . , ` ~ ! $ % ^ & ( ) < > + /

ContentSearch.Azure.DefaultTokenAnalyzer

Specifies the default analyzer to use for requests to the Analyze API. By default, the standard Lucene analyzer is used, unless you specify otherwise.

ContentSearch.Azure.TokenCacheSlidingExpiration

Controls how long items can stay in the cache. This is important because every Analyze API request affects the performance of your environment, so the results of Analyze API requests are stored in the cache.
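The fallback splitting described for ContentSearch.Azure.UseTokenAnalyzer can be sketched in a few lines of Python. This is only an approximation built from the character list in the table above; the name fallback_tokenize is hypothetical, and details such as how empty tokens or casing are handled by Sitecore are assumptions.

```python
import re

# Separator characters listed for the fallback case: space - # @ . , ` ~ ! $ % ^ & ( ) < > + /
FALLBACK_SEPARATORS = " -#@.,`~!$%^&()<>+/"

# A character class that matches one or more consecutive separators.
_SPLIT_PATTERN = re.compile("[" + re.escape(FALLBACK_SEPARATORS) + "]+")


def fallback_tokenize(text):
    """Split text at the listed separators and drop empty tokens (assumption)."""
    return [token for token in _SPLIT_PATTERN.split(text) if token]


if __name__ == "__main__":
    print(fallback_tokenize("test-index"))
    print(fallback_tokenize("john.doe@example.com"))
```

With this sketch, the query test-index yields the tokens test and index, in line with the example given earlier in this section.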

Control fields with whitelisting

You use whitelisting to control which fields are included in the index schema.

To configure Azure Search whitelisting:

  1. In the \App_Config\Sitecore\ContentSearch.Azure\Sitecore.ContentSearch.Azure.DefaultIndexConfiguration.config configuration file, set the value of the indexAllFields setting to false.

  2. In the configuration file, add the minimum recommended fields to the include list in the configuration section. The minimum recommended list of fields is:

    <include hint="list:AddIncludedField">
      <__Boost>{93D1B217-B8F4-462E-BABF-68298C9CE667}</__Boost>
      <__Bucketable>{C9283D9E-7C29-4419-9C28-5A5C8FF53E84}</__Bucketable>
      <__Created_By>{5DD74568-4D4B-44C1-B513-0AF5F4CDA34F}</__Created_By>
      <__Enable_Item_Fallback>{FD4E2050-186C-4375-8B99-E8A85DD7436E}</__Enable_Item_Fallback>
      <__Hidden>{39C4902E-9960-4469-AEEF-E878E9C8218F}</__Hidden>
      <__Icon>{06D5295C-ED2F-4A54-9BF2-26228D113318}</__Icon>
      <__Is_Bucket>{D312103C-B36C-4CA5-864A-C85F9ABDA503}</__Is_Bucket>
      <__Semantics>{A14F1B0C-4384-49EC-8790-28A440F3670C}</__Semantics>
      <Date_Range>{7146F1A4-45FB-4CEC-9855-C95E9E595827}</Date_Range>
      <Extension>{C06867FE-9A43-4C7D-B739-48780492D06F}</Extension>
      <Facets_Location>{96ABAC42-67D6-46E0-91A9-8F46FF9EAC25}</Facets_Location>
      <Facets_Template>{154DC6DA-89C4-4704-8CDA-95994D794BEA}</Facets_Template>
      <File_Size>{E344C026-8575-496D-8CDF-7741891D0786}</File_Size>
      <ID>{5A531AF0-C44C-4141-A0D3-09C5CDC3D654}</ID>
      <Image_Dimensions>{05EF282C-54DE-49B5-9EF3-0EB3008080C6}</Image_Dimensions>
      <Language>{BC06ED64-C4A1-4EE2-9835-541E1CC4CCC9}</Language>
      <Mime_Type>{6F47A0A5-9C94-4B48-ABEB-42D38DEF6054}</Mime_Type>
      <Parent_ID>{1F4412CC-609C-4D3C-AF8C-D5C849202916}</Parent_ID>
      <Search_Types_Location>{A34F2EFE-CF7A-4BF3-86DC-589B33C1B236}</Search_Types_Location>
      <Search_Types_Template>{473454F9-5184-4BDD-9A04-0F567641407A}</Search_Types_Template>
      <Sitecore_Items>{BBDD760F-6C5F-467A-83B4-095CC9CFC2DF}</Sitecore_Items>
      <Size>{6954B7C7-2487-423F-8600-436CB3B6DC0E}</Size>
      <Table_View>{68DA2D37-ABC0-4001-BF01-A3FC8D2F1BF9}</Table_View>
      <Tag>{FE6DB0A6-09BD-4FEB-8D82-0F1C14183E18}</Tag>
      <Tags>{56EF9816-35AD-4160-B5DC-ECA7FE7DCFC2}</Tags>
      <Text>{A60ACD61-A6DB-4182-8329-C957982CEC74}</Text>
      <Title>{75577384-3C97-45DA-A847-81B00500E250}</Title>
      <Search_Types_Text>{E600C190-3F61-4776-B2F5-03AD7AEB351C}</Search_Types_Text>
      <Updated_Date>{87A830FB-4E2F-4F76-896B-F20CFA2374DD}</Updated_Date>
      <Workflow_State>{49D86313-493D-4054-ACC9-D68AD6B09332}</Workflow_State>
    </include>
  3. Extend the configuration section with the fields you want to index.
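When the include list is extended by hand, a stray line break or a mistyped GUID is easy to miss. The Python sketch below shows one way to read such a list and check that every entry holds a well-formed GUID. The function name read_included_fields is hypothetical, the sample contains only two of the entries listed above, and the check says nothing about whether a GUID refers to an existing Sitecore field.

```python
import re
import xml.etree.ElementTree as ET

# A GUID in braces, for example {75577384-3C97-45DA-A847-81B00500E250}.
GUID_PATTERN = re.compile(
    r"\{[0-9A-Fa-f]{8}(-[0-9A-Fa-f]{4}){3}-[0-9A-Fa-f]{12}\}")


def read_included_fields(include_xml):
    """Map each element name in an include list to its GUID."""
    root = ET.fromstring(include_xml)
    fields = {}
    for element in root:
        # Strip surrounding whitespace, such as a line break before the closing tag.
        value = (element.text or "").strip()
        if not GUID_PATTERN.fullmatch(value):
            raise ValueError(f"{element.tag}: not a GUID: {value!r}")
        fields[element.tag] = value
    return fields


SAMPLE = """
<include hint="list:AddIncludedField">
  <__Boost>{93D1B217-B8F4-462E-BABF-68298C9CE667}</__Boost>
  <Title>{75577384-3C97-45DA-A847-81B00500E250}
  </Title>
</include>
"""

if __name__ == "__main__":
    for name, guid in read_included_fields(SAMPLE).items():
        print(name, guid)
```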
