Configure a sitemap in Channels
A sitemap is a file that provides information about the pages of your website, and the relationships between them.
Sitemaps help search engines like Google crawl your site more efficiently by listing all the important URLs. By providing a clear structure of your website, sitemaps can improve your site's visibility in search results. This is particularly useful for large websites. In addition, sitemaps can include metadata about each URL, such as when it was last modified, how often it changes, and its importance relative to other URLs on the site.
Overall, a sitemap is a valuable tool for both search engines and users, helping to ensure that your website is easily discoverable and navigable.
By default, the sitemap is generated for the whole site and stored in a file. In the sitemap, each page is represented by a URL element, and these can be configured. You can also exclude specific pages from the sitemap.
Sitemap generation uses Content Management data to list publishable items and does not verify which ones are actually published. This means pages or language versions in a final workflow state may appear in the sitemap even if not published. Using a workflow can prevent unpublished pages from being included.
This topic describes how to configure the sitemap of a site using the sites Portfolio. You can also configure the sitemap using the Content Editor.
Configure the sitemap
To configure the sitemap for your site:
- In Channels, click the Options menu > Settings on the tile of the site you want to manage.
- To ensure that the sitemap link is generated properly for your production sites, you must configure the target hostname for the site host. On the Site hosts tab, click the name of the site host in the table, then in the Configuration section, enter a target hostname. This field specifies the base URL that is used in the sitemap entries. For example, if your site is accessible at www.example.com, enter www.example.com in this field.
- In the left-hand pane, click Sitemap configuration.
- Specify values for the following settings:
  - Alternate links - alternate links in a sitemap indicate different language versions of the same page. This helps search engines understand the relationship between these pages and serve the correct version to users based on their language preferences.
  - URLset attributes - help search engines crawl sites efficiently: lastmod shows when a page was last updated, changefreq suggests update frequency, and priority highlights page importance.
  - Refresh threshold - specifies the minimum time interval (in minutes) between sitemap regeneration operations. For example, if the refresh threshold is set to 60, the sitemap is regenerated at most once per hour.
  - Index - a sitemap index file is a special type of sitemap that lists multiple sitemaps.
Configure alternate links
When configuring a sitemap, you can add xhtml:link elements to the URL elements to specify the alternate language versions of the page. The hreflang attribute of each xhtml:link element identifies the version it points to:
- hreflang="x-default" indicates the default version of the page.
- hreflang="en" and hreflang="da" indicate the English and Danish versions of the page, respectively.
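As an illustration, a sitemap entry with alternate links might resemble the XML embedded in the sketch below. The example.com URLs are placeholders and `alternate_links` is a hypothetical helper written for this example, not part of the product. It uses Python's standard library to read the hreflang values back out of the sitemap.

```python
import xml.etree.ElementTree as ET

# Namespaces used by the sitemap protocol and by XHTML link elements.
NS = {
    "sm": "http://www.sitemaps.org/schemas/sitemap/0.9",
    "xhtml": "http://www.w3.org/1999/xhtml",
}

# Hypothetical sitemap: one page that exists in English and Danish.
SAMPLE_SITEMAP = """<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
        xmlns:xhtml="http://www.w3.org/1999/xhtml">
  <url>
    <loc>https://www.example.com/en/about</loc>
    <xhtml:link rel="alternate" hreflang="x-default" href="https://www.example.com/about"/>
    <xhtml:link rel="alternate" hreflang="en" href="https://www.example.com/en/about"/>
    <xhtml:link rel="alternate" hreflang="da" href="https://www.example.com/da/about"/>
  </url>
</urlset>
"""


def alternate_links(sitemap_xml):
    """Return {loc: {hreflang: href}} for every url element in the sitemap."""
    root = ET.fromstring(sitemap_xml.encode("utf-8"))
    result = {}
    for url in root.findall("sm:url", NS):
        loc = url.findtext("sm:loc", namespaces=NS)
        result[loc] = {
            link.get("hreflang"): link.get("href")
            for link in url.findall("xhtml:link", NS)
        }
    return result


if __name__ == "__main__":
    for loc, links in alternate_links(SAMPLE_SITEMAP).items():
        print(loc)
        for hreflang, href in links.items():
            print(f"  {hreflang}: {href}")
```

A check like this can help confirm that every language version you expect is listed once alternate links are enabled.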
To configure alternate links in the Sitemap configuration page:
- Click Generate alternate links to add xhtml:link elements to the URL elements in the sitemap.
- Click Include x-default to add the xhtml:link element with hreflang set to x-default to the URL element. The x-default value signals to the search algorithm that the page does not target any specific language or region.
Configure URLset attributes
Each URL is represented by a url element, which can have several attributes to provide additional information to search engines. The only required attribute is loc, which represents the URL of the page.
You can configure the other attributes in the Sitemap configuration page:
- Click lastmod to specify the date when the page was last modified. This helps search engines understand how frequently the content changes.
- Click changefreq to indicate how frequently the content at the URL is likely to change (for example, daily, weekly, monthly).
- Click priority to specify a number between 0 and 1 that represents the importance of a specific page relative to other URLs on your site.
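With all three optional attributes enabled, an entry could resemble the XML in the following sketch. The URL and values are invented for illustration, and `check_entry` is a hypothetical helper, not a product feature. It applies the value rules of the sitemaps.org protocol: loc is required, changefreq comes from a fixed set of values, and priority lies between 0.0 and 1.0.

```python
import xml.etree.ElementTree as ET

NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

# changefreq values allowed by the sitemaps.org protocol.
CHANGEFREQ_VALUES = {"always", "hourly", "daily", "weekly", "monthly", "yearly", "never"}

# Hypothetical entry; only loc is required.
SAMPLE_ENTRY = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>https://www.example.com/products</loc>
    <lastmod>2024-05-01</lastmod>
    <changefreq>weekly</changefreq>
    <priority>0.8</priority>
  </url>
</urlset>
"""


def check_entry(url):
    """Return a list of problems found in one url element (empty if none)."""
    problems = []
    if not url.findtext("sm:loc", default="", namespaces=NS).strip():
        problems.append("loc is required")
    changefreq = url.findtext("sm:changefreq", namespaces=NS)
    if changefreq is not None and changefreq.strip() not in CHANGEFREQ_VALUES:
        problems.append(f"unknown changefreq: {changefreq}")
    priority = url.findtext("sm:priority", namespaces=NS)
    if priority is not None:
        try:
            in_range = 0.0 <= float(priority) <= 1.0
        except ValueError:
            in_range = False
        if not in_range:
            problems.append(f"priority must be between 0.0 and 1.0: {priority}")
    return problems


if __name__ == "__main__":
    root = ET.fromstring(SAMPLE_ENTRY)
    for url in root.findall("sm:url", NS):
        print(url.findtext("sm:loc", namespaces=NS), check_entry(url) or "ok")
```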
Configure the sitemap generation refresh threshold
To avoid excessive sitemap regeneration during frequent publishing operations, the refresh threshold sets the minimum interval between rebuilds. This helps manage server load and prevents the sitemap from being regenerated too often.
To change the refresh threshold:
- In the Sitemap configuration page, click the arrows to specify a time in minutes.
When a sitemap is generated, it won't be regenerated until the time you specified has elapsed.
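The effect of the threshold can be sketched as a simple time comparison. This is an illustration of the behavior described above, not the product's implementation: `should_regenerate` and its parameters are hypothetical, and treating a rebuild as allowed exactly when the threshold is reached is an assumption.

```python
from datetime import datetime, timedelta


def should_regenerate(last_generated, now, threshold_minutes):
    """Return True if a rebuild is allowed at `now`.

    last_generated is None when no sitemap has been built yet.
    """
    if last_generated is None:
        return True
    # Assumption: a rebuild is allowed once the full threshold has elapsed.
    return now - last_generated >= timedelta(minutes=threshold_minutes)


if __name__ == "__main__":
    built_at = datetime(2024, 5, 1, 9, 0)
    # Publishing 20 minutes after the last build: still inside a 60-minute threshold.
    print(should_regenerate(built_at, datetime(2024, 5, 1, 9, 20), 60))   # False
    # Publishing 75 minutes after the last build: the threshold has elapsed.
    print(should_regenerate(built_at, datetime(2024, 5, 1, 10, 15), 60))  # True
```

With a threshold of 60, several publishing operations within the same hour therefore lead to at most one rebuild.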
Create an index for the sitemap
A sitemap index file is a special type of sitemap that lists multiple sitemaps. This is particularly useful for large websites that need to split their content into multiple sitemaps due to size or URL limits. The sitemap index file helps search engines find and crawl all the individual sitemaps efficiently.
In Sites, you can specify the maximum number of pages to be included in a sitemap. If the number of pages on the site exceeds the limit, then the URL entries are divided into multiple sitemaps, and a sitemap index is generated that links to these sitemaps. This limit is set to 50,000 pages by default.
To specify the maximum number of pages that you want included in a single sitemap:
- In the Sitemap configuration page, click the arrows to specify the maximum number of pages allowed in a single sitemap.
If you leave the field undefined, all URL entries are rendered into a single sitemap.
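The splitting behavior described in this section can be sketched as follows. The function names, the example.com URLs, and the sitemap file names are all hypothetical and chosen only for illustration; the product may name and structure its files differently.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
# Serialize the sitemap namespace as the default namespace (no prefix).
ET.register_namespace("", SITEMAP_NS)


def split_urls(urls, max_pages=None):
    """Split urls into chunks of at most max_pages; one chunk if max_pages is None."""
    urls = list(urls)
    if max_pages is None:
        return [urls]
    if max_pages < 1:
        raise ValueError("max_pages must be at least 1")
    return [urls[i:i + max_pages] for i in range(0, len(urls), max_pages)]


def build_index(sitemap_urls):
    """Build a sitemapindex document that links to each sitemap file."""
    index = ET.Element(f"{{{SITEMAP_NS}}}sitemapindex")
    for sitemap_url in sitemap_urls:
        sitemap = ET.SubElement(index, f"{{{SITEMAP_NS}}}sitemap")
        ET.SubElement(sitemap, f"{{{SITEMAP_NS}}}loc").text = sitemap_url
    return ET.tostring(index, encoding="unicode")


if __name__ == "__main__":
    pages = [f"https://www.example.com/page-{n}" for n in range(1, 8)]
    chunks = split_urls(pages, max_pages=3)  # 7 pages become chunks of 3, 3, and 1
    if len(chunks) > 1:
        # Hypothetical file names, one per chunk.
        files = [f"https://www.example.com/sitemap-{n}.xml" for n in range(len(chunks))]
        print(build_index(files))
```

An index is only needed when the pages do not fit into one sitemap, which is why the sketch builds it only when there is more than one chunk.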
Disable the building of sitemaps
By default, the sitemap is generated for the whole site and stored in a file. Some sites do not need a sitemap, for example, sites on staging environments or shared sites that are never meant to go live.
To disable building sitemaps:
- In Channels, click the Options menu > Settings on the tile of the site you want to manage.
- In the left-hand pane, click Sitemap configuration.
- Turn off the Generate sitemap switch.
Exclude a page from the sitemap
By default, pages that are excluded from publication by site publication restrictions or approval workflows are not included in the sitemap.
However, you might also want to exclude pages that have been approved and published, for example, the 404 error page. You can exclude a specific page using the Content Editor.