Configure the robots.txt file

Abstract

How to configure the robots.txt file for your domain or subdomain.

The robots.txt file is located in the root folder of your website and controls which files on your website the search engines can index. The robots.txt file consists of rules that allow or block a particular crawler's access to a file path on the domain or subdomain where the robots.txt file is hosted.
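For example, a rule set that blocks one crawler from a specific folder while leaving the rest of the site open to all crawlers might look like the following (the /private/ folder name is illustrative):

User-agent: Googlebot
Disallow: /private/

User-agent: *
Disallow:

An empty Disallow directive means that nothing is blocked for the crawlers that the rule group applies to.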

Important

If you do not add any rules, the following code is written to the file:

User-agent: *
Disallow: /

This means that no crawlers can access the content.

To configure the robots.txt file:

  1. In the content tree, navigate to your site and click the Settings item.

  2. Scroll down to the Robots section and, in the Robots content field, enter the rules.

  3. Save and publish the changes.

Example

In the following example, the website is called http://www.mywebsite.com, and you want to instruct all search engines not to index any of the content in the ignorethesepages folder:

User-agent: *
Disallow: /ignorethesepages/

Note

You do not have to specify where the sitemap.xml is located. SXA adds this information to the robots.txt file automatically.
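Continuing the example above, the published robots.txt file might therefore look like the following (the sitemap URL is illustrative and depends on your site configuration):

User-agent: *
Disallow: /ignorethesepages/
Sitemap: http://www.mywebsite.com/sitemap.xml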