Configuring locale extractors

In Sitecore Search, when you configure a source to crawl and index localized content, you must add the locale to the metadata of each index document. To do this, configure a locale extractor.

Configure the following settings to define how the crawler extracts the locale information from a content item:

Settings

Description

Name

A meaningful name for the locale extractor.

Extractor Type

The type of locale extractor you want to use. You can use:

  • URL - use this locale extractor when you want to use a regular expression to extract the locale from the URL of each page.

  • Header - use this locale extractor when you want to extract the locale from a request or response header.

  • JS - use this locale extractor when you want to use a JavaScript function to extract the locale from the URL of each page.
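
To illustrate the URL option, the following sketch shows how a regular expression can pull a locale out of a page URL. The pattern, the helper name localeFromUrl, and the example.com URLs are assumptions made for illustration. They are not values that Sitecore Search requires, and the pattern assumes the locale is the first path segment.

```javascript
// Hypothetical pattern: a locale such as "en-us" or "es-ES" as the first path segment.
const LOCALE_PATTERN = /^https?:\/\/[^/]+\/([a-z]{2}-[a-z]{2})(?:\/|$)/i;

// Hypothetical helper: returns the captured locale, or null when the URL has none.
function localeFromUrl(url) {
  const match = LOCALE_PATTERN.exec(url);
  return match ? match[1] : null;
}

console.log(localeFromUrl("https://www.example.com/es-ES/products")); // es-ES
console.log(localeFromUrl("https://www.example.com/products"));       // null
```

In the extractor settings, only the regular expression itself would be entered. The surrounding function is here only to make the sketch runnable.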

URLs to Match

This is an optional setting.

JavaScript locale extractor

Add a JavaScript (JS) function that extracts the locale from each page.
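
As a minimal sketch of what such a function might look like, the example below checks the page URL against a list of known locales. It assumes the crawler calls a function named extract with request and response objects and that request.url holds the page URL. The locale list and the fallback value are hypothetical, so confirm the expected signature and return format in the product before relying on it.

```javascript
// Assumption: the crawler passes request and response objects,
// and request.url contains the URL of the page being crawled.
function extract(request, response) {
  // Hypothetical list of locales that the site publishes.
  const locales = ["en-us", "es-es", "ja-jp"];
  const url = request.url.toLowerCase();
  for (const locale of locales) {
    // Match the locale only when it appears as a complete path segment.
    if (url.includes("/" + locale + "/")) {
      return locale;
    }
  }
  // Hypothetical fallback for pages without a locale segment.
  return "en-us";
}
```

With these assumptions, a page at https://www.example.com/ja-JP/docs/ yields ja-jp, and a page with no locale segment falls back to en-us.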

Header locale extractor

Add the header key whose value you want to use as locale. If the advanced web crawler does not find this key in the request header, it looks for it in the response header.

For example, if you add Accept-Language as the header, the crawler looks for the key Accept-Language and uses the value as the locale for that document. If the request header is Accept-Language: es-ES, the index document has metadata that tags it as an es-ES (Spain, Spanish) document.
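
The lookup order described above can be sketched as follows. This is an illustration of the order only, not the crawler's own code: the helper name localeFromHeaders and the use of plain objects to hold headers are assumptions.

```javascript
// Hypothetical helper mirroring the lookup order described above:
// request headers first, then response headers.
function localeFromHeaders(key, requestHeaders, responseHeaders) {
  const find = (headers) => {
    // HTTP header names are case-insensitive, so compare in lowercase.
    const name = Object.keys(headers).find(
      (k) => k.toLowerCase() === key.toLowerCase()
    );
    return name === undefined ? undefined : headers[name];
  };
  const value = find(requestHeaders);
  // Fall back to the response headers only when the request has no such key.
  return value !== undefined ? value : find(responseHeaders);
}

console.log(localeFromHeaders("Accept-Language", { "Accept-Language": "es-ES" }, {})); // es-ES
```

If neither set of headers contains the key, this sketch returns undefined. How the crawler itself handles that case is not covered here.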
