Walkthrough: Configure Azure Cognitive Search

Abstract

Configure Sitecore to use Azure Cognitive Search.

Caution

Azure Cognitive Search will be discontinued in the future and Sitecore will no longer provide support for this service in future releases.

To use Azure Search with Sitecore, you must first configure your Sitecore instance to connect to your Azure Search service.

The default connection string name for Cloud Search is cloud.search and it contains the following information:

  • serviceUrl – the HTTPS URL of the search service API (for example, https://dk-test.search.windows.net).

  • apiVersion – follows a date format, for example, 2017-11-11 (see more information on API versions).

  • apiKey – the admin key to the service that you obtain from the Azure management portal.

Note

For Sitecore version 9.1 and later, set the API version to 2017-11-11. For information about other supported versions, refer to the Azure Search compatibility table.

The connection string format is:

<add name="cloud.search" connectionString="serviceUrl=<url>;apiVersion=<apiVersion>;apiKey=<apiKey>" /> 
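As an illustration, the following Python sketch assembles the connection string entry from its three parts and checks the two constraints stated above (an HTTPS service URL and a date-formatted API version). The helper name build_cloud_search_entry and the placeholder key are hypothetical; the sketch is not part of Sitecore.

```python
import re
from xml.sax.saxutils import quoteattr


def build_cloud_search_entry(service_url, api_version, api_key, name="cloud.search"):
    """Return an <add .../> connection string element as text (hypothetical helper)."""
    # The service URL must be the HTTPS endpoint of the search service.
    if not service_url.startswith("https://"):
        raise ValueError("serviceUrl must be an HTTPS URL")
    # The API version follows a date format such as 2017-11-11.
    if not re.fullmatch(r"\d{4}-\d{2}-\d{2}", api_version):
        raise ValueError("apiVersion must look like YYYY-MM-DD")
    value = f"serviceUrl={service_url};apiVersion={api_version};apiKey={api_key}"
    # quoteattr escapes the value and wraps it in quotes for use as an XML attribute.
    return f"<add name={quoteattr(name)} connectionString={quoteattr(value)} />"


entry = build_cloud_search_entry(
    "https://dk-test.search.windows.net", "2017-11-11", "PLACEHOLDER_ADMIN_KEY"
)
print(entry)
```

Replace the placeholder with the admin key from the Azure management portal before you use the generated line.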

Note

Azure Cognitive Search for xConnect does not support this feature.

Geo-replicated scenarios

Sitecore supports geo-replicated Search service scenarios. To use this type of scenario:

  1. Create two or more Search service instances.

  2. Add connection strings with a pipe separator (|). If you have two search services, for example, searchservice1 and searchservice2, and you want to use them in a geo-replicated scenario, you must use the following connection string:

    <add name="cloud.search" 
    connectionString="serviceUrl=https://searchservice1.search.windows.net;apiVersion=2015-02-28;apiKey=AdminKey1|serviceUrl=https://searchservice2.search.windows.net;apiVersion=2015-02-28;apiKey=AdminKey2" /> 
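To make the pipe-separated format easier to check, here is a minimal Python sketch that splits a geo-replicated value into one dictionary per search service. The helper name parse_cloud_search is hypothetical, and the parsing rules are inferred from the format shown above, not taken from Sitecore's own parser.

```python
def parse_cloud_search(connection_string):
    """Split a cloud.search value into one dict per search service (hypothetical helper)."""
    services = []
    # Each geo-replicated service is separated by a pipe character.
    for part in connection_string.split("|"):
        settings = {}
        # Within one service, key=value pairs are separated by semicolons.
        for pair in part.split(";"):
            if not pair.strip():
                continue
            # Split at the first "=" only, so keys that contain "=" stay intact.
            key, _, value = pair.partition("=")
            settings[key.strip()] = value.strip()
        services.append(settings)
    return services


value = (
    "serviceUrl=https://searchservice1.search.windows.net;apiVersion=2015-02-28;apiKey=AdminKey1"
    "|serviceUrl=https://searchservice2.search.windows.net;apiVersion=2015-02-28;apiKey=AdminKey2"
)
for service in parse_cloud_search(value):
    print(service["serviceUrl"], service["apiVersion"])
```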
    

Advanced scaling scenarios

For advanced scaling scenarios, it is best practice to use a dedicated search service for your index. To set this up:

  1. Add a new connection string (for example, cloud.search.analytics).

  2. Configure the corresponding index with this connection string name.

By default, Sitecore is distributed with the Solr search provider enabled.

To use Azure Search instead, you must specify Azure as your search provider in the web.config file.
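The text does not show the web.config entry itself. The following Python sketch assumes that the provider is selected through an appSettings entry with the key search:define, as in a typical Sitecore 9 web.config; verify the key name against your own file before relying on it. The helper name set_search_provider is hypothetical.

```python
import xml.etree.ElementTree as ET


def set_search_provider(web_config_xml, provider="Azure"):
    """Return web.config text with the search provider set (hypothetical helper).

    Assumption: the provider is stored as <add key="search:define" value="..." />
    under <appSettings>.
    """
    root = ET.fromstring(web_config_xml)
    app_settings = root.find("appSettings")
    if app_settings is None:
        app_settings = ET.SubElement(root, "appSettings")
    for entry in app_settings.findall("add"):
        if entry.get("key") == "search:define":
            entry.set("value", provider)
            break
    else:
        # No existing entry was found, so add one.
        ET.SubElement(app_settings, "add", {"key": "search:define", "value": provider})
    return ET.tostring(root, encoding="unicode")


sample = (
    '<configuration><appSettings>'
    '<add key="search:define" value="Solr" />'
    '</appSettings></configuration>'
)
print(set_search_provider(sample))
```

ElementTree drops XML comments when it parses a file, so for a real web.config it is usually safer to edit the value by hand and use a script like this only to check the result.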

You must rebuild the indexes to ensure Sitecore is fully operational.

To rebuild the indexes:

  1. Go to the Sitecore login page (http://{your_instance}/sitecore/login) and log in with your admin credentials.

  2. On the Sitecore Launchpad, click Control Panel, and then click Indexing Manager.

  3. To select and rebuild all of the indexes, on the Indexing Manager page, click Select all, and then click Rebuild.

    Note

    Rebuilding the indexes is a time-consuming operation and can take 15 minutes or more.

    When the indexes are rebuilt, the Sitecore search indexes appear in the Search service window of the Azure Portal.


Azure Search uses the following Entity Data Model (EDM) field types. Refer to this list to ensure that you map the field types correctly:

  • Edm.String – text that can optionally be tokenized for full-text search (for example, word-breaking and stemming).

  • Collection(Edm.String) – a list of strings that can be tokenized for full-text search. There is no upper limit on the number of items in a collection, but the 16 MB upper limit on payload size applies to collections.

  • Edm.Boolean – contains true or false values.

  • Edm.Int32 – contains 32-bit integer values.

  • Edm.Int64 – contains 64-bit integer values.

  • Edm.Double – contains double-precision numeric data.

  • Edm.DateTimeOffset – date and time values represented in the OData V4 format, for example, yyyy-MM-ddTHH:mm:ss.fffZ or yyyy-MM-ddTHH:mm:ss.fff[+|-]HH:mm.

Note

The precision of DateTime fields is limited to milliseconds. If you upload DateTime values that have sub-millisecond precision, the returned value is rounded to millisecond precision. For example, 2015-04-15T10:30:09.7552052Z is returned as 2015-04-15T10:30:09.7550000Z.
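To avoid surprises from this precision limit, you can reduce values to milliseconds before you upload them. The Python sketch below formats a timezone-aware value as yyyy-MM-ddTHH:mm:ss.fffZ. The helper name to_index_datetime is hypothetical, and the sketch truncates the extra digits, which matches the example above; whether the service truncates or rounds in other cases is not stated here.

```python
from datetime import datetime, timezone


def to_index_datetime(value):
    """Format an aware datetime as yyyy-MM-ddTHH:mm:ss.fffZ (hypothetical helper)."""
    if value.tzinfo is None:
        raise ValueError("expected a timezone-aware datetime")
    utc = value.astimezone(timezone.utc)
    # Python stores microseconds; keep only whole milliseconds (three digits).
    millis = utc.microsecond // 1000
    return f"{utc:%Y-%m-%dT%H:%M:%S}.{millis:03d}Z"


stamp = datetime(2015, 4, 15, 10, 30, 9, 755205, tzinfo=timezone.utc)
print(to_index_datetime(stamp))  # 2015-04-15T10:30:09.755Z
```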

To see how to define the different types of mapping between .NET and Sitecore, go to the sitecore\contentsearch\indexConfigurations\defaultCloudIndexConfiguration\CloudTypeMapper node in the App_Config\Sitecore\ContentSearch.Azure\Sitecore.ContentSearch.Azure.DefaultIndexConfiguration.config file.

Note

You must define the map element for all custom fields like this:

<map type="<Field type>" cloudType="<Edm type from list of supported types>" />
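As a sketch of how such map elements could be generated and checked, the following Python helper rejects any cloudType that is not one of the EDM field types listed earlier in this walkthrough. The helper name build_map_element and the System.Guid to Edm.String pairing are hypothetical examples, not Sitecore defaults.

```python
from xml.sax.saxutils import quoteattr

# EDM field types listed in this walkthrough.
SUPPORTED_CLOUD_TYPES = {
    "Edm.String",
    "Collection(Edm.String)",
    "Edm.Boolean",
    "Edm.Int32",
    "Edm.Int64",
    "Edm.Double",
    "Edm.DateTimeOffset",
}


def build_map_element(field_type, cloud_type):
    """Return a <map .../> element as text (hypothetical helper)."""
    if cloud_type not in SUPPORTED_CLOUD_TYPES:
        raise ValueError(f"unsupported cloudType: {cloud_type}")
    return f"<map type={quoteattr(field_type)} cloudType={quoteattr(cloud_type)} />"


# Hypothetical pairing, for illustration only.
print(build_map_element("System.Guid", "Edm.String"))
```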

It is best practice to list all of the fields that the components use under the following node of the index configuration section:

  • sitecore\contentSearch\configuration\indexes\index\configuration\fieldMap

The supported attributes are:

  • boost – use this attribute to give one field more importance than another.

  • cloudAnalyzer – use this attribute to set up an analyzer for both search and indexing operations. Note: if you use the searchAnalyzer and indexAnalyzer attributes, you must specify them as a pair, and the pair replaces the single cloudAnalyzer attribute.

  • cloudFieldName – the field name as defined for Cloud. It can only contain letters, numbers, and underscores. Note: the first character must be a letter.

  • fieldName – the name of the field, as defined for Solr.

  • format – you must configure this attribute for DateTime type fields. The supported value is yyyy-MM-ddTHH:mm:ss.fffZ.

  • indexAnalyzer – use this attribute to set up an analyzer for indexing operations. Note: if you use this attribute, you must specify the searchAnalyzer and indexAnalyzer attributes as a pair, and the pair replaces the single cloudAnalyzer attribute.

  • searchAnalyzer – use this attribute to set up an analyzer for search operations. Note: if you use this attribute, you must specify the searchAnalyzer and indexAnalyzer attributes as a pair, and the pair replaces the single cloudAnalyzer attribute.

  • settingType – the settingType property must be: Sitecore.ContentSearch.Azure.CloudSearchFieldConfiguration, Sitecore.ContentSearch.Azure
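The naming rule for cloudFieldName (letters, numbers, and underscores only, starting with a letter) can be checked before you deploy a configuration. This Python sketch assumes that "letters" means ASCII letters; the helper name is_valid_cloud_field_name is hypothetical.

```python
import re

# Assumption: "letters" means ASCII letters A-Z and a-z.
_CLOUD_FIELD_NAME = re.compile(r"[A-Za-z][A-Za-z0-9_]*")


def is_valid_cloud_field_name(name):
    """Return True if name starts with a letter and uses only letters, digits, and underscores."""
    return _CLOUD_FIELD_NAME.fullmatch(name) is not None


for candidate in ("title_t", "_title", "1title", "my-field"):
    print(candidate, is_valid_cloud_field_name(candidate))
```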

Use the support reference for Azure Search to ensure that you configure query support and facet support correctly.

You can manage the number of connections that the Azure Search provider uses with the following setting:

<setting name="ContentSearch.Azure.ServicePoint.ConnectionLimit" value="100" />

In the following scenarios, you must adjust the setting value:

  • Decrease the value – if other applications use HTTP intensively.

  • Increase the value – if the application is under high load and creating new connections takes too long or fails. This happens when the connection limit is set too low.

The token analyzer divides text into tokens by using the Azure Search Analyze API. If a search query contains a non-alphanumeric character such as a hyphen ("-"), the analyzer splits the text into separate tokens at that character. For example, if you search in the Name field with the query test-index, the text in the query is tokenized into two separate tokens: test and index (regardless of whether the field contains any language analyzers).

Query results depend on which analyzer is configured. If a field does not have an analyzer configured, the analyzer specified in the ContentSearch.Azure.DefaultTokenAnalyzer setting is used to send requests.

Use the following settings to configure analyzers:

  • ContentSearch.Azure.UseTokenAnalyzer – controls whether to send requests to the Analyze API. Because these requests can affect the performance of your environment, you can disable this setting. If the setting is disabled, or the Analyze API returns an error, the text is split into tokens at the following symbols: " " (space) - # @ . , ` ~ ! $ % ^ & ( ) < > + /

  • ContentSearch.Azure.DefaultTokenAnalyzer – specifies the default analyzer that is used for requests to the Analyze API. By default, requests are sent using the standard Lucene analyzer, unless you specify otherwise.

  • ContentSearch.Azure.TokenCacheSlidingExpiration – controls how long items can stay in the cache. This is important because every Analyze API request affects the performance of your environment, so the results of Analyze API requests are stored in the cache.
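To illustrate the fallback behavior described for ContentSearch.Azure.UseTokenAnalyzer, the following Python sketch splits text at the listed symbols. It assumes that the symbols act as delimiters and that empty tokens are dropped; it is not Sitecore's implementation, and the name fallback_tokenize is hypothetical.

```python
import re

# Separator symbols listed for ContentSearch.Azure.UseTokenAnalyzer.
SEPARATORS = " -#@.,`~!$%^&()<>+/"
_SPLIT = re.compile("[" + re.escape(SEPARATORS) + "]+")


def fallback_tokenize(text):
    """Split text at any run of separator symbols and drop empty tokens."""
    return [token for token in _SPLIT.split(text) if token]


print(fallback_tokenize("test-index"))  # ['test', 'index']
```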

You use whitelisting to control which fields are included in the index schema.

To configure Azure Search whitelisting:

  1. In the \App_Config\Sitecore\ContentSearch.Azure\Sitecore.ContentSearch.Azure.DefaultIndexConfiguration.config configuration file, set the value of the indexAllFields setting to false.

  2. In the configuration file, add the minimal recommended fields to the include list in the configuration section. The minimal recommended list of fields includes:

    <include hint="list:AddIncludedField">
      <__Boost>{93D1B217-B8F4-462E-BABF-68298C9CE667}</__Boost>
      <__Bucketable>{C9283D9E-7C29-4419-9C28-5A5C8FF53E84}</__Bucketable>
      <__Created_By>{5DD74568-4D4B-44C1-B513-0AF5F4CDA34F}</__Created_By>
      <__Enable_Item_Fallback>{FD4E2050-186C-4375-8B99-E8A85DD7436E}</__Enable_Item_Fallback>
      <__Hidden>{39C4902E-9960-4469-AEEF-E878E9C8218F}</__Hidden>
      <__Icon>{06D5295C-ED2F-4A54-9BF2-26228D113318}</__Icon>
      <__Is_Bucket>{D312103C-B36C-4CA5-864A-C85F9ABDA503}</__Is_Bucket>
      <__Semantics>{A14F1B0C-4384-49EC-8790-28A440F3670C}</__Semantics>
      <Date_Range>{7146F1A4-45FB-4CEC-9855-C95E9E595827}</Date_Range>
      <Extension>{C06867FE-9A43-4C7D-B739-48780492D06F}</Extension>
      <Facets_Location>{96ABAC42-67D6-46E0-91A9-8F46FF9EAC25}</Facets_Location>
      <Facets_Template>{154DC6DA-89C4-4704-8CDA-95994D794BEA}</Facets_Template>
      <File_Size>{E344C026-8575-496D-8CDF-7741891D0786}</File_Size>
      <ID>{5A531AF0-C44C-4141-A0D3-09C5CDC3D654}</ID>
      <Image_Dimensions>{05EF282C-54DE-49B5-9EF3-0EB3008080C6}</Image_Dimensions>
      <Language>{BC06ED64-C4A1-4EE2-9835-541E1CC4CCC9}</Language>
      <Mime_Type>{6F47A0A5-9C94-4B48-ABEB-42D38DEF6054}</Mime_Type>
      <Parent_ID>{1F4412CC-609C-4D3C-AF8C-D5C849202916}</Parent_ID>
      <Search_Types_Location>{A34F2EFE-CF7A-4BF3-86DC-589B33C1B236}</Search_Types_Location>
      <Search_Types_Template>{473454F9-5184-4BDD-9A04-0F567641407A}</Search_Types_Template>
      <Sitecore_Items>{BBDD760F-6C5F-467A-83B4-095CC9CFC2DF}</Sitecore_Items>
      <Size>{6954B7C7-2487-423F-8600-436CB3B6DC0E}</Size>
      <Table_View>{68DA2D37-ABC0-4001-BF01-A3FC8D2F1BF9}</Table_View>
      <Tag>{FE6DB0A6-09BD-4FEB-8D82-0F1C14183E18}</Tag>
      <Tags>{56EF9816-35AD-4160-B5DC-ECA7FE7DCFC2}</Tags>
      <Text>{A60ACD61-A6DB-4182-8329-C957982CEC74}</Text>
      <Title>{75577384-3C97-45DA-A847-81B00500E250}</Title>
      <Search_Types_Text>{E600C190-3F61-4776-B2F5-03AD7AEB351C}</Search_Types_Text>
      <Updated_Date>{87A830FB-4E2F-4F76-896B-F20CFA2374DD}</Updated_Date>
      <Workflow_State>{49D86313-493D-4054-ACC9-D68AD6B09332}</Workflow_State>
    </include>
  3. Extend the configuration section with the fields you want to index.
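When you extend the include list, a malformed field ID is easy to miss. The Python sketch below builds an include element from a dictionary of element names and field IDs and validates each ID as a GUID. The helper name build_include_section, the My_Custom_Field element, and its ID are hypothetical placeholders; only the Title entry comes from the recommended list above.

```python
import uuid
import xml.etree.ElementTree as ET


def build_include_section(fields):
    """Build an <include> whitelist element from {element name: field ID} (hypothetical helper)."""
    include = ET.Element("include", {"hint": "list:AddIncludedField"})
    for name, field_id in fields.items():
        # uuid.UUID raises ValueError for malformed IDs and accepts the {braced} form.
        normalized = str(uuid.UUID(field_id)).upper()
        ET.SubElement(include, name).text = "{" + normalized + "}"
    return ET.tostring(include, encoding="unicode")


fields = {
    "Title": "{75577384-3C97-45DA-A847-81B00500E250}",  # from the recommended list
    "My_Custom_Field": "{00000000-0000-0000-0000-000000000001}",  # hypothetical field
}
print(build_include_section(fields))
```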