Sitecore Azure Search overview

Abstract

An introduction to Sitecore Azure Search for Sitecore administrators to read before installing or using it.

The Sitecore Azure Search provider integrates the Sitecore Search engine with the Microsoft Azure Search service. The Microsoft Azure Search service is part of the Microsoft Azure computing platform; you can read more about it on the Microsoft Azure website.

This topic applies to Sitecore Experience Platform 8.2 Update-1 and later. It describes the features and limitations of Sitecore Azure Search and lists the Azure Search features that the provider does not support.

The Microsoft Azure Search service provides the following features:

  • Extreme scalability, simplicity, and stability.

  • A highly available infrastructure with 99.95% uptime as a part of the Microsoft Azure service level agreement (SLA).

  • An easy way to scale up and scale down as needed.

The Sitecore Azure Search provider includes the following features:

  • Support for all Sitecore search-driven UIs, including user-typed queries and faceted searches.

  • Support for the majority of LINQ expressions to enable rapid development of search-powered applications.

  • Native support for fundamental data types, such as numbers and dates, in faceting and range queries.

  • Flexible configuration and precise control over the schema of the indexes.

  • Support for running Sitecore in geo-replicated scenarios.

Note

Sitecore Azure Search behaves slightly differently from the Lucene and Solr search providers; this is important to consider if you are going to switch between search providers. Read more about Sitecore Azure Search limitations and behavioral differences in the Limitations of Azure Search section of this topic.

Sitecore Azure Search is the default provider for Sitecore instances that are deployed using the Sitecore Azure SDK. It supports on-premises and IaaS deployments. Follow the instructions in Configure Azure Search to configure Sitecore Azure Search.

Compared with Sitecore Search on Lucene and Solr, Sitecore Search on Azure Search has several limitations. Refer to the following table for specifics:

Limitation

Description

Automatic tokenization by the Azure Search service of document field values and queries when searching and faceting.

This means that:

  • Substring searches that are limited to a single term (for instance, the predicates .StartsWith(), .EndsWith(), and .Contains()) match parts of terms, and match terms located in any part of the field value. When multiple terms are passed, each term is searched separately, which can return more results than expected.

  • Regular expressions that span multiple terms (that is, contain spaces) return no results.

  • Multiple terms that are passed to .Wildcard() are interpreted as individual wildcards in a field-scoped query.

  • Unlike in Lucene and Solr, when a value in a faceted field contains multiple words, the facet values are calculated from the individual terms, not from the whole field value.

Note

This limitation only applies to Sitecore versions 8.2.7 and 9.0.1 or earlier. For later versions, you can only change the behavior by applying a lowercase analyzer to specific fields, for example:

<fieldNames hint="raw:AddFieldByFieldName"> <field fieldName="_fullpath" … cloudAnalyzer="lowercase_keyword" />…
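The effects of tokenization described above can be illustrated with a small toy model. This is not Sitecore or Azure Search code: the tokenize, facet_counts, and contains helpers and the sample values are hypothetical, and the regular-expression tokenizer is only an assumed stand-in for the analyzer that the service applies.

```python
import re
from collections import Counter


def tokenize(value):
    """Split a field value into lowercase terms (a stand-in for an analyzer)."""
    return re.findall(r"\w+", value.lower())


def facet_counts(values, tokenized):
    """Count documents per term (tokenized) or per whole field value."""
    counts = Counter()
    for value in values:
        keys = set(tokenize(value)) if tokenized else {value.lower()}
        counts.update(keys)
    return counts


def contains(values, query):
    """Return values where any query term is a substring of any field term."""
    query_terms = tokenize(query)
    return [
        value
        for value in values
        if any(q in term for q in query_terms for term in tokenize(value))
    ]


# Hypothetical field values from three documents.
categories = ["New York", "New Jersey", "York"]

# Term-based faceting: "new" and "york" each count twice.
per_term = facet_counts(categories, tokenized=True)

# Whole-value faceting: each value counts once.
per_value = facet_counts(categories, tokenized=False)

# Two query terms are searched separately, so all three values match.
matches = contains(categories, "jersey york")
```

In this model, the multi-term query matches every value because each term is checked on its own, and faceting on terms reports "new" and "york" rather than "New York".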

Fields

An Azure Search index can only contain up to 1000 fields.

Fuzzy query semantics

Fuzzy query semantics are different in Azure Search. For example:

  • .Like(pattern, similarity) interprets the similarity parameter as the Damerau-Levenshtein distance (a value between 0 and 2). This is different from the way Lucene implements the similarity parameter in Sitecore.

  • The similarity and slop parameters cannot be combined in the Azure Search Lucene syntax. This means that multiple-word fuzzy queries, such as those passed to .Like(), are always interpreted as a phrase query with a slop.
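To make the edit-distance interpretation of the similarity parameter concrete, the following sketch computes a Damerau-Levenshtein style distance between two terms. It is an illustration only: it uses the restricted (optimal string alignment) variant for brevity, this topic does not state which variant the service uses, and the osa_distance and within_fuzzy_limit names are hypothetical.

```python
def osa_distance(a, b):
    """Optimal string alignment distance between two strings.

    Allowed edits, each with cost 1: insert, delete, substitute,
    or transpose two adjacent characters.
    """
    rows, cols = len(a) + 1, len(b) + 1
    d = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        d[i][0] = i  # delete all characters of a[:i]
    for j in range(cols):
        d[0][j] = j  # insert all characters of b[:j]
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(
                d[i - 1][j] + 1,         # deletion
                d[i][j - 1] + 1,         # insertion
                d[i - 1][j - 1] + cost,  # substitution or match
            )
            # Transposition of two adjacent characters.
            if i > 1 and j > 1 and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]:
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)
    return d[rows - 1][cols - 1]


def within_fuzzy_limit(term, candidate, max_edits):
    """Check whether candidate is within max_edits (0 to 2) of term."""
    if not 0 <= max_edits <= 2:
        raise ValueError("max_edits must be between 0 and 2")
    return osa_distance(term, candidate) <= max_edits
```

For example, within_fuzzy_limit("sitecore", "sitecroe", 1) is true because swapping two adjacent letters counts as a single edit.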

Joining queries

Operators that join queries, for example, .GroupJoin() and .SelfJoin(), are not supported and result in an error.

Media indexing

This feature is not supported.

Range queries

Range queries are always expressed as filters. As a result:

  • Combining a range query with a search query using the logical OR operator (||) produces an error.

  • Range queries on string fields always operate on the whole field value without tokenization and are case-sensitive.
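The following sketch suggests why a range cannot be combined with a search using OR. In the Azure Search REST API, the search text and the OData filter are separate request parameters, and the filter restricts the search results, so the two are always combined as a logical AND. How the Sitecore provider builds its requests internally is an assumption here, and the build_search_request helper and the field names title_t and price_d are hypothetical.

```python
def build_search_request(search_text, field, lower, upper):
    """Build a request body with a search clause and a range filter.

    The range goes into the separate "filter" parameter as an OData
    expression; there is no slot that expresses "search OR range".
    """
    range_filter = f"{field} ge {lower} and {field} le {upper}"
    return {
        "queryType": "full",    # full Lucene query syntax for "search"
        "search": search_text,  # scored full-text clause
        "filter": range_filter, # applied in addition to the search clause
    }


# Hypothetical fields: title_t (text) and price_d (number).
request_body = build_search_request("title_t:azure", "price_d", 10, 20)
```

A query such as "title matches azure AND price between 10 and 20" fits this shape, whereas "title matches azure OR price between 10 and 20" has no equivalent.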

Same name fields

The Azure Search service has a strong schema. This means, for example, that fields cannot have the same name but different types in different documents.
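As an illustration of the strong-schema constraint, the following sketch scans a set of documents for fields that share a name but carry values of different types. It is a toy pre-check under stated assumptions, not part of Sitecore or Azure Search: the find_type_conflicts helper and the sample documents are hypothetical, and Python types stand in for index field types.

```python
def find_type_conflicts(documents):
    """Return fields whose values have more than one type across documents."""
    seen = {}  # field name -> set of type names observed for that field
    for doc in documents:
        for name, value in doc.items():
            seen.setdefault(name, set()).add(type(value).__name__)
    return {
        name: sorted(types)
        for name, types in seen.items()
        if len(types) > 1
    }


# Hypothetical documents: "price" is a number in one and a string in the other.
docs = [
    {"id": "1", "price": 10.5},
    {"id": "2", "price": "unknown"},
]
conflicts = find_type_conflicts(docs)
```

Here conflicts reports the price field, because a strongly typed index can hold only one type for each field name.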

The following Azure Search features are not currently supported by the Sitecore Azure Search provider:

  • Geospatial data types

  • Indexers

  • Suggestions

  • Highlighters