Use locale extractors for localized content
You can index localized content with some Sitecore Search sources. To index localized content, you extract the content item's locale, like English (US) (en-US) or Japanese (Japan) (ja-JP), and generate a common ID for localized versions of the same content. Then, when you specify the locale context at runtime, you can show locale-specific content to your site visitors.
To index localized content, you'll need to define available locales and configure locale extractors.
Ensuring that localized versions of a content piece share the same ID
When you create a source for localized content, you have to ensure that index documents for localized versions of the same content item share the same ID. This means that you have to explicitly configure how to extract the id attribute when you have localized content.
For example, your company's About Us page is available in six locales, including English (US). If you don't configure how to extract the id attribute, Search generates six different IDs for the six About Us index documents. This creates a problem when you configure anything that uses the ID of an index document. For example, many pin rules are based on pinning a content item with a specific ID to a specific slot. If you want to pin About Us, and you use the ID of the English (US) version, only users in the English (US) locale will see that content item pinned. Users in the other five locales will not see a localized version of About Us pinned. To avoid this problem, always ensure that localized versions of the same content have the same ID.
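One way to picture a shared ID is to derive it from the part of the URL that does not change between locales. The following is a minimal sketch in plain JavaScript, not Search configuration syntax. It assumes the locale is the first path segment (for example, /en-us/about-us); the function name commonIdFromUrl, the example.com URLs, and the 'home' fallback are all hypothetical.

```javascript
// Hypothetical helper: derive a locale-independent ID from a localized URL.
// Assumption: the locale is the first path segment, e.g. /en-us/about-us.
function commonIdFromUrl(url) {
  const path = new URL(url).pathname;
  const segments = path.split('/').filter(Boolean);

  // Drop the leading segment if it looks like a locale code (xx-YY or xx_YY).
  if (segments.length > 0 && /^[a-z]{2}[-_][a-z]{2}$/i.test(segments[0])) {
    segments.shift();
  }

  // Hypothetical fallback for a site root that has no remaining path.
  return segments.join('/') || 'home';
}

// Both localized versions of About Us map to the same ID.
console.log(commonIdFromUrl('https://www.example.com/en-us/about-us'));
console.log(commonIdFromUrl('https://www.example.com/ja-jp/about-us'));
```

With an ID like this, a pin rule that targets the About Us ID would apply to every localized version, not only to the English (US) one.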
Locale extractors
In Sitecore Search, when you configure a source to crawl and index localized content, you must add the locale to the metadata of each index document. To do this, configure a locale extractor.
Configure the following settings to define how the crawler extracts the locale information from a content item:
Setting | Description |
---|---|
Name | A meaningful name for the locale extractor. |
Extractor Type | The type of locale extractor you want to use, for example a JavaScript locale extractor or a header locale extractor. |
URLs to Match | This is an optional setting. |
JavaScript locale extractor
Add a JS function to extract locales from each page.
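As an illustration, a function of this kind might check the page URL for a known locale segment. This is a sketch under stated assumptions: the function name extract, the request and response parameters, and the request.url property are assumed here and should be checked against the extractor editor in your Search instance. The locale list and the default value are hypothetical.

```javascript
// Assumed shape: the crawler calls this function with the page request and response.
// The name `extract` and the `request.url` property are assumptions for this sketch.
function extract(request, response) {
  // Hypothetical list of the locales defined for the domain.
  const locales = ['en-us', 'ja-jp', 'es-es'];
  const url = request.url.toLowerCase();

  // Return the first locale that appears as a path segment in the URL.
  for (const locale of locales) {
    if (url.indexOf('/' + locale + '/') >= 0) {
      return locale;
    }
  }

  // Hypothetical default when no locale segment is found.
  return 'en-us';
}
```

For a page at https://www.example.com/ja-JP/about-us, this sketch returns ja-jp. A page with no locale segment falls back to the default.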
Header locale extractor
Add the header key whose value you want to use as the locale. If the advanced web crawler does not find this key in the request header, it looks for it in the response header.
For example, if you add Accept-Language as the header, the crawler looks for the key Accept-Language and uses the value as the locale for that document. If the request header is Accept-Language: es-ES, the index document has metadata that tags it as an es-ES (Spain, Spanish) document.
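The lookup order described above can be modeled in a few lines. This is only a sketch of the behavior, not the crawler's implementation: the function name localeFromHeaders and the plain-object header maps are hypothetical. It matches header names case-insensitively because HTTP header names are case-insensitive; whether the crawler does the same is an assumption.

```javascript
// Hypothetical model of the lookup order: request headers first, then response headers.
function localeFromHeaders(headerKey, requestHeaders, responseHeaders) {
  // Find a header value by name, ignoring case.
  const find = (headers) => {
    const key = Object.keys(headers || {}).find(
      (k) => k.toLowerCase() === headerKey.toLowerCase()
    );
    return key === undefined ? undefined : headers[key];
  };

  const fromRequest = find(requestHeaders);
  return fromRequest !== undefined ? fromRequest : find(responseHeaders);
}

// The request header wins when both are present.
console.log(
  localeFromHeaders(
    'Accept-Language',
    { 'Accept-Language': 'es-ES' },
    { 'Accept-Language': 'en-US' }
  )
);
```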