Crawler authentication
This topic covers only the authentication configuration required in your crawler settings when your content requires authentication before it can be accessed. For information on how to authenticate with Sitecore Search when you integrate your website or app, see API authentication and authorization.
If your original content requires authentication before the crawler can access it, you can define authentication settings. You can add a key, access token, or password to the source configuration. In Search, this mechanism is available in Source Settings > Authentication.
Basic authentication settings
Use basic authentication when your original content requires a key or access token that you add to the request header.
Configure the following basic authentication settings:
Setting | Description |
---|---|
Authentication Type | Type of authentication you want to use. Select Basic. |
URL | The URL where your source content requires authentication. For example, enter www.acme.com/login. If you use a request trigger, the URL is usually the same as the request URL. |
BODY | Body of the request. Use this when you send a |
METHOD | HTTP method of the request. You can use GET, POST, PUT, or PATCH. |
TIMEOUT | Time, in milliseconds, the crawler waits to get a response from the URL. If the TIMEOUT expires before the crawler gets a response, the crawler does not crawl the URL. |
HEADERS | Authorization header that describes the user-agent used to authenticate when accessing your source content. Set as key and value. For example, enter the key as authorization and the value as the key or access token required for the source content. You can add multiple headers. |
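
To make the settings above concrete, the following is a minimal Python sketch of the kind of HTTP request a basic authentication configuration describes. It is an illustration only, not how the crawler is implemented. The `settings` dictionary, its key names, the `build_request` helper, the `https://` scheme, and the placeholder body and token are all assumptions made for this example.

```python
import urllib.request

# Hypothetical values mirroring the settings table; this is not the format
# Sitecore Search uses to store a source configuration.
settings = {
    "url": "https://www.acme.com/login",  # scheme added so the URL is parseable
    "method": "POST",
    "body": "example-request-body",       # placeholder body, sent as-is
    "timeout_ms": 5000,                   # TIMEOUT is expressed in milliseconds
    "headers": {"authorization": "<your key or access token>"},
}

# The four methods the METHOD setting accepts.
ALLOWED_METHODS = {"GET", "POST", "PUT", "PATCH"}


def build_request(cfg):
    """Build (but do not send) a request from the settings dictionary."""
    method = cfg["method"].upper()
    if method not in ALLOWED_METHODS:
        raise ValueError(f"METHOD must be one of {sorted(ALLOWED_METHODS)}")

    body = cfg.get("body")
    data = body.encode("utf-8") if body else None

    request = urllib.request.Request(
        cfg["url"], data=data, headers=dict(cfg["headers"]), method=method
    )
    # urllib expects seconds, so convert from milliseconds.
    timeout_seconds = cfg["timeout_ms"] / 1000
    return request, timeout_seconds


request, timeout_seconds = build_request(settings)
print(request.get_method(), request.full_url, timeout_seconds)
```

To send the request yourself and check that your key or token is accepted, you could pass both values to `urllib.request.urlopen(request, timeout=timeout_seconds)`. A timeout there raises an exception, which loosely corresponds to the crawler skipping the URL when TIMEOUT expires.
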
Browser authentication
Use browser authentication when your website requires a GUI-based username and password, rather than a key or access token in the request header. If visitors need to enter a username and password to access content, you'll need browser authentication.
Configure the following browser authentication settings:
Setting | Description |
---|---|
Authentication Type | Type of authentication you want to use. Select Browser. |
URL | The URL where your website requires authentication. Usually, this is the login page. For example, enter www.acme.com/login. If you use a request trigger, the URL is usually the same as the request trigger URL. |
USERNAME SELECTOR | CSS notation for the username selector field. For example, this can be the Username, USERNAME or EMAIL, or Enter email field on your content login page. To get the CSS notation value, run an inspect element on the username field in the browser. We recommend that you add more than one username selector to make sure that the crawler finds the right username field. For example, to use the RequestResponse |
USERNAME VALUE | Username your website expects, in plain text. |
PASSWORD SELECTOR | CSS notation for the password selector field. For example, this can be the Password or Enter password field on your content login page. To get the CSS notation value, run an inspect element on the password field in the browser. We recommend that you add more than one password selector to make sure that the crawler finds the correct password field. For example, to use the RequestResponse |
PASSWORD VALUE | Password your website expects, in plain text. |
SUBMIT SELECTOR | CSS notation for the submit selector field. For example, this can be the Login, Submit, or Sign in button on your website's login page. To get the CSS notation value, right-click the field and run an inspect element. We recommend that you add more than one submit selector to make sure that the crawler finds the right submit field. For example, to use the RequestResponse |
MIN WAIT | Minimum time, in milliseconds, that the crawler waits to get a response from the URL. |
MAX WAIT | Maximum time, in milliseconds, that the crawler waits to get a response from the URL. If the MAX WAIT expires before the crawler gets a response, the crawler does not crawl your content. |
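
The following Python sketch shows one way to organize and sanity-check browser authentication settings before you enter them, including more than one selector per field. Every name in it is hypothetical: the `BrowserAuthSettings` class, the `first_match` helper, the selectors, and the credentials are invented for this example. How the crawler itself chooses between multiple selectors is not described here, so the first-match strategy is an assumption.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence


@dataclass
class BrowserAuthSettings:
    """Hypothetical container for the settings in the table above."""
    url: str
    username_selectors: Sequence[str]
    username_value: str
    password_selectors: Sequence[str]
    password_value: str
    submit_selectors: Sequence[str]
    min_wait_ms: int
    max_wait_ms: int

    def validate(self) -> None:
        # Each field needs at least one CSS selector.
        for name in ("username_selectors", "password_selectors", "submit_selectors"):
            if not getattr(self, name):
                raise ValueError(f"{name} needs at least one CSS selector")
        # MIN WAIT cannot be negative or exceed MAX WAIT.
        if not 0 <= self.min_wait_ms <= self.max_wait_ms:
            raise ValueError("MIN WAIT must be between 0 and MAX WAIT")


def first_match(selectors: Sequence[str], exists: Callable[[str], bool]) -> Optional[str]:
    """Return the first selector that exists on the page, or None (assumed strategy)."""
    for selector in selectors:
        if exists(selector):
            return selector
    return None


settings = BrowserAuthSettings(
    url="https://www.acme.com/login",
    username_selectors=["#username", "input[name='email']"],
    username_value="crawler-user",
    password_selectors=["#password", "input[type='password']"],
    password_value="<password>",
    submit_selectors=["button[type='submit']", "#login"],
    min_wait_ms=1000,
    max_wait_ms=10000,
)
settings.validate()

# Stand-in for a real DOM lookup: pretend the login page contains only these elements.
present = {"input[name='email']", "#password", "button[type='submit']"}
chosen = first_match(settings.username_selectors, present.__contains__)
print(chosen)
```

In this sketch the page has no `#username` element, so the second username selector is the one that matches. This is the situation that listing several selectors is meant to cover.
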