Sitecore Azure Search overview

Current version: 9.0
Caution

Azure Cognitive Search will be discontinued in the future and Sitecore will no longer provide support for this service in future releases.

The Sitecore Azure Search provider integrates the Sitecore Search engine with the Microsoft Azure Cognitive Search service, which is part of the Microsoft Azure computing platform. You can read more about the service on the Microsoft website. This topic applies to Sitecore Experience Platform 8.2 Update-1 and later. It describes the features and limitations of Azure Cognitive Search, as well as the Azure Cognitive Search features that Sitecore does not support.

The Microsoft Azure Cognitive Search service provides the following features:

  • Extreme scalability, simplicity, and stability.

  • A highly available infrastructure that has 99.95% uptime as a part of the Microsoft Azure service level agreement (SLA).

  • An easy way to scale up and scale down as needed.

The Sitecore Azure Search provider includes the following features:

  • Support for all Sitecore search-driven UIs, including user-typed queries and faceted searches.

  • Support for the majority of LINQ expressions, to enable rapid development of search-powered applications.

  • Native support for fundamental data types, such as numbers and dates, in faceting and range queries.

  • Flexible configuration and precise control over the schema of the indexes.

  • Support for running Sitecore in geo-replicated scenarios.

Note

Sitecore Azure Search behaves slightly differently from the Lucene and Solr search providers; this is important to consider if you are going to switch between search providers. Read more about Sitecore Azure Search limitations and behavioral differences in the Limitations of Azure Cognitive Search section of this topic.

Sitecore Azure Search is the default provider for Sitecore instances that are deployed using the Sitecore Azure SDK. It also supports on-premises and IaaS deployments. Follow the instructions in Walkthrough: Configure Azure Cognitive Search to configure Sitecore Azure Search.

Compared with Sitecore Search on Lucene and Solr, Sitecore Search on Azure Cognitive Search has several limitations. Refer to the following table for specifics:

Limitation

Description

Automatic tokenization by the Azure Cognitive Search service of document field values and queries when searching and faceting.

This means that:

  • Substring searches that are limited to a single term (for instance, the predicates .StartsWith(), .EndsWith(), and .Contains()) match parts of terms, and match terms located in any part of the field value. When multiple terms are passed, each term is searched separately, which can return more results than expected.

  • Regular expressions that span multiple terms (that is, contain spaces) return no results.

  • Multiple terms that are passed to .Wildcard() are interpreted as individual wildcards in a field-scoped query.

  • Facet values are calculated from the individual terms in faceted fields, not from whole field values. This differs from Lucene and Solr when a value contains multiple words.

Note

This limitation only applies to Sitecore versions 8.2.7 and 9.0.1 or earlier. For later versions, you can only change the behavior by applying a lowercase analyzer to specific fields, for example:

<fieldNames hint="raw:AddFieldByFieldName"> <field fieldName="_fullpath" … cloudAnalyzer="lowercase_keyword" />…
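The effect of tokenization on facet counts can be sketched in a few lines of Python. This is an illustrative model only, not Sitecore or Azure code: the whitespace tokenizer, the facet_counts helper, and the sample titles are all assumptions. The keyword mode assumes that a keyword-style analyzer keeps the whole field value as a single lowercased term.

```python
from collections import Counter


def tokenize(value, keyword=False):
    """Return the terms that would be indexed for one field value.

    keyword=False: lowercase and split on whitespace (a simplified model
    of a tokenizing analyzer).
    keyword=True: keep the whole value as one lowercased term (assumed
    behavior of a keyword-style analyzer).
    """
    if keyword:
        return [value.lower()]
    return value.lower().split()


def facet_counts(values, keyword=False):
    """Count the number of documents per indexed term."""
    counts = Counter()
    for value in values:
        # A document counts once per distinct term it contains.
        counts.update(set(tokenize(value, keyword)))
    return counts


# Hypothetical field values from three documents.
titles = ["Red Shoes", "Red Hat", "Blue Shoes"]

by_term = facet_counts(titles)                  # facets on individual terms
by_value = facet_counts(titles, keyword=True)   # facets on whole values

print(dict(by_term))
print(dict(by_value))
```

In the tokenized model the facet "red" covers two documents and no facet exists for "red shoes" as a whole, whereas the keyword model yields one facet per complete field value.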

Fields

An Azure Cognitive Search index can contain up to 1000 fields.

Every additional language that you add for Culture Support multiplies the number of required fields in the Azure Cognitive Search index. Therefore, if you use Azure Search Culture Support for large-scale, multi-language solutions, you will quickly reach the 1000-field limit.
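A rough estimate shows how quickly the limit can be reached. The sketch below assumes, purely for illustration, that each added language duplicates a fixed number of culture-specific fields. The helper names and the sample numbers are hypothetical, and the real field count depends on your index configuration.

```python
MAX_FIELDS = 1000  # field limit per index, as stated above


def estimated_fields(shared_fields, culture_fields, languages):
    """Estimate the total number of index fields under the assumed linear model."""
    return shared_fields + culture_fields * languages


def max_languages(shared_fields, culture_fields, limit=MAX_FIELDS):
    """Return the largest number of languages that stays within the limit."""
    if culture_fields <= 0:
        raise ValueError("culture_fields must be positive")
    return max(0, (limit - shared_fields) // culture_fields)


# Hypothetical index: 100 language-neutral fields, 60 fields per language.
print(estimated_fields(100, 60, 10))
print(max_languages(100, 60))
```

Under these assumed numbers, a sixteenth language would push the index past the limit, so it is worth auditing which fields really need per-culture variants.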

Fuzzy query semantics

Fuzzy query semantics are different in Azure Cognitive Search. For example:

  • .Like(pattern, similarity) interprets the similarity parameter as the Damerau-Levenshtein distance (a value between 0 and 2). This is different from the way Lucene implements the similarity parameter in Sitecore.

  • The similarity and slop parameters cannot be combined in the Azure Cognitive Search Lucene syntax. This means that multiple-word fuzzy queries, such as .Like(), are always interpreted as a phrase query with a slop.
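To make the edit-distance interpretation concrete, the following Python sketch computes a Damerau-Levenshtein distance between two terms. It implements the restricted (optimal string alignment) variant, which is an assumption: the text above does not state which variant the service uses. The function names osa_distance and fuzzy_match are hypothetical and are not part of any Sitecore or Azure API.

```python
def osa_distance(a, b):
    """Damerau-Levenshtein distance, optimal string alignment variant.

    Counts insertions, deletions, substitutions, and transpositions of
    two adjacent characters, each as one edit.
    """
    rows, cols = len(a) + 1, len(b) + 1
    d = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        d[i][0] = i  # delete all i characters of a
    for j in range(cols):
        d[0][j] = j  # insert all j characters of b
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(
                d[i - 1][j] + 1,         # deletion
                d[i][j - 1] + 1,         # insertion
                d[i - 1][j - 1] + cost,  # substitution or match
            )
            # Transposition of two adjacent characters.
            if i > 1 and j > 1 and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]:
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)
    return d[-1][-1]


def fuzzy_match(term, candidate, max_edits=2):
    """Return True if candidate is within max_edits (0 to 2) of term."""
    if not 0 <= max_edits <= 2:
        raise ValueError("max_edits must be between 0 and 2")
    return osa_distance(term, candidate) <= max_edits


print(osa_distance("sitecore", "sitecroe"))
print(fuzzy_match("kitten", "sitting"))
```

Swapping two adjacent letters counts as a single edit here, which is the main practical difference from a plain Levenshtein distance.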

Joining queries

Operators that join queries, such as .GroupJoin() and .SelfJoin(), are not supported and result in an error.

Media indexing

This feature is not supported.

Range queries

Range queries are always expressed as filters. As a result:

  • Combining range queries with Search using the logical operator OR (||) produces an error.

  • Range queries on string fields always operate on the whole field value without tokenization and are case-sensitive.

Same name fields

The Azure Cognitive Search service has a strong schema. This means, for example, that fields with the same name cannot have different types in different documents.

Unsupported Azure Cognitive Search features

The following Azure Cognitive Search features are not currently supported by the Sitecore provider:

  • Geospatial data types

  • Indexers

  • Suggestions

  • Highlighters
