Glossary

Sitemap

Definition: A sitemap is an XML or HTML file that lists a website's important URLs, helping search engines discover and crawl pages efficiently.

A sitemap gives search engine crawlers a structured list of the important URLs on your website, so they do not have to rely on internal links alone to find them. The most common format is an XML sitemap (typically sitemap.xml). HTML sitemaps are aimed at human visitors and can help them navigate large sites.

XML Sitemap Structure

<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>https://example.com/page</loc>
    <lastmod>2025-01-15</lastmod>
    <changefreq>weekly</changefreq>
    <priority>0.8</priority>
  </url>
</urlset>
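
In the sitemaps.org protocol only <loc> is required inside each <url>; <lastmod>, <changefreq>, and <priority> are optional. As a minimal sketch of how such a file could be generated, the Python below builds a urlset with the standard library. The function name build_sitemap and the (loc, lastmod) input shape are assumptions made for illustration, not part of any particular tool.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"

def build_sitemap(entries):
    """Return sitemap XML (bytes) for a list of (loc, lastmod) pairs.

    Hypothetical helper: lastmod may be None, in which case the
    optional <lastmod> element is omitted.
    """
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for loc, lastmod in entries:
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = loc  # special characters are escaped on output
        if lastmod:
            ET.SubElement(url, "lastmod").text = lastmod
    return ET.tostring(urlset, encoding="UTF-8", xml_declaration=True)

xml_bytes = build_sitemap([
    ("https://example.com/page", "2025-01-15"),
    ("https://example.com/search?a=1&b=2", None),
])
print(xml_bytes.decode("utf-8"))
```

Building the file through an XML library rather than string concatenation means characters such as & in URLs are entity-escaped automatically, which the protocol requires.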

Sitemap Index Files

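A single sitemap file can list at most 50,000 URLs and must not exceed 50 MB uncompressed. Sites that go over either limit split their URLs across multiple sitemap files and reference them all from a sitemap index file, commonly served at a path such as /sitemap-index.xml.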
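
The splitting step can be sketched in Python as follows. This is an illustration under stated assumptions: the helper names chunk_urls and build_sitemap_index, the 120,000-URL site, and the sitemap-1.xml style file names are all hypothetical, and the sketch checks only the URL count, not the file-size limit.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
MAX_URLS_PER_SITEMAP = 50_000  # per-file URL limit

def chunk_urls(urls, size=MAX_URLS_PER_SITEMAP):
    """Split a list of URLs into consecutive chunks of at most `size`."""
    return [urls[i:i + size] for i in range(0, len(urls), size)]

def build_sitemap_index(sitemap_urls):
    """Return sitemap index XML (bytes) referencing each child sitemap."""
    index = ET.Element("sitemapindex", xmlns=SITEMAP_NS)
    for loc in sitemap_urls:
        sitemap = ET.SubElement(index, "sitemap")
        ET.SubElement(sitemap, "loc").text = loc
    return ET.tostring(index, encoding="UTF-8", xml_declaration=True)

# Hypothetical site with 120,000 URLs, which needs three child sitemaps.
urls = [f"https://example.com/page-{n}" for n in range(120_000)]
chunks = chunk_urls(urls)

# Assumed naming scheme for the child files; any stable URLs would do.
child_locs = [f"https://example.com/sitemap-{i + 1}.xml" for i in range(len(chunks))]
index_xml = build_sitemap_index(child_locs)
print(index_xml.decode("utf-8"))
```

Each chunk would then be written out as its own urlset file, and only the index URL needs to be submitted to search engines.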

Submitting to Search Engines

  • Google — Submit in Google Search Console under Sitemaps.
  • Bing — Submit in Bing Webmaster Tools.
  • robots.txt — Add Sitemap: https://example.com/sitemap.xml for auto-discovery.
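
To illustrate how the robots.txt route works from a crawler's point of view, the sketch below uses Python's standard urllib.robotparser to read Sitemap: lines out of a robots.txt body. The robots.txt content is a made-up example, and site_maps() requires Python 3.8 or later.

```python
from urllib.robotparser import RobotFileParser

# Made-up robots.txt content for illustration.
robots_txt = """\
User-agent: *
Allow: /
Sitemap: https://example.com/sitemap.xml
"""

parser = RobotFileParser()
parser.parse(robots_txt.splitlines())

# site_maps() returns the listed sitemap URLs, or None if there are none.
sitemaps = parser.site_maps()
print(sitemaps)
```

The Sitemap: directive takes a full absolute URL and is independent of any User-agent group, so it can appear anywhere in the file.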

What to Include

Only include canonical, indexable pages that return a 200 status. Exclude 404 pages, redirects, paginated pages (usually), and pages marked noindex.
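
These rules can be expressed as a simple filter. The sketch below is one possible reading of them in Python; the Page record and its fields are hypothetical stand-ins for whatever crawl data is available, and pagination is left out because whether to list paginated pages is a case-by-case decision.

```python
from dataclasses import dataclass

@dataclass
class Page:
    """Hypothetical crawl record for one URL."""
    url: str
    status: int      # HTTP status code returned for the URL
    canonical: str   # canonical URL the page declares
    noindex: bool    # True if the page carries a noindex directive

def sitemap_candidates(pages):
    """Return URLs that are live, self-canonical, and indexable."""
    return [
        p.url
        for p in pages
        if p.status == 200          # drops 404s and redirects (3xx)
        and p.canonical == p.url    # drops non-canonical duplicates
        and not p.noindex           # drops pages marked noindex
    ]

pages = [
    Page("https://example.com/", 200, "https://example.com/", False),
    Page("https://example.com/old", 301, "https://example.com/", False),
    Page("https://example.com/missing", 404, "https://example.com/missing", False),
    Page("https://example.com/?ref=ad", 200, "https://example.com/", False),
    Page("https://example.com/private", 200, "https://example.com/private", True),
]
print(sitemap_candidates(pages))
```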