An XML sitemap is a structured file in Extensible Markup Language (XML) format that lists the URLs of a website's pages to help search engines discover, crawl, and index content more efficiently. XML sitemaps follow the Sitemaps protocol, which was introduced by Google and later jointly adopted by Google, Yahoo, and Microsoft. A sitemap tells search engine crawlers which pages exist on a site and can optionally indicate when each page was last updated, how frequently it changes, and its priority relative to other pages on the same site. XML sitemaps are the machine-readable counterpart to HTML sitemaps, which are designed for human visitors.

What an XML Sitemap Contains

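An XML sitemap file contains a list of URL entries inside a root <urlset> element, with each entry wrapped in <url> tags. The only required field is <loc>, the full absolute URL of the page. Optional fields are <lastmod> (when the page was last modified, in W3C Datetime format, most simply YYYY-MM-DD), <changefreq> (how often the page changes: always, hourly, daily, weekly, monthly, yearly, or never), and <priority> (relative importance from 0.0 to 1.0, with 0.5 as the default). Search engines treat the optional fields as hints rather than directives; Google, for example, has stated that it ignores <changefreq> and <priority> and uses <lastmod> only when it is consistently accurate. A single XML sitemap can contain up to 50,000 URLs and must be no larger than 50MB uncompressed. Larger sites use a sitemap index file that points to multiple individual sitemaps. XML sitemaps can be submitted to Google via Google Search Console and can also be referenced in robots.txt with a Sitemap: line.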
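
The structure described above can be sketched in a few lines of Python using only the standard library. This is a minimal illustration, not a complete generator: the function names (build_sitemap, build_sitemap_index, chunk) and the example.com URLs are assumptions made for the example, and the code enforces only the 50,000-URL limit, not the 50MB size limit.

```python
from datetime import date
from xml.etree import ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
MAX_URLS_PER_SITEMAP = 50_000  # per-file URL limit from the Sitemaps protocol


def build_sitemap(entries):
    """Return sitemap XML (bytes) for an iterable of (url, last_modified) pairs.

    last_modified is a datetime.date, or None to omit the optional <lastmod>.
    """
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for loc, last_modified in entries:
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = loc  # the only required field
        if last_modified is not None:
            # date.isoformat() yields YYYY-MM-DD, a valid W3C Datetime value
            ET.SubElement(url, "lastmod").text = last_modified.isoformat()
    return ET.tostring(urlset, encoding="utf-8", xml_declaration=True)


def build_sitemap_index(sitemap_urls):
    """Return a sitemap index (bytes) pointing at individual sitemap files."""
    index = ET.Element("sitemapindex", xmlns=SITEMAP_NS)
    for sitemap_url in sitemap_urls:
        sitemap = ET.SubElement(index, "sitemap")
        ET.SubElement(sitemap, "loc").text = sitemap_url
    return ET.tostring(index, encoding="utf-8", xml_declaration=True)


def chunk(entries, size=MAX_URLS_PER_SITEMAP):
    """Split entries into lists of at most `size` items, one per sitemap file."""
    entries = list(entries)
    return [entries[i:i + size] for i in range(0, len(entries), size)]


if __name__ == "__main__":
    # Hypothetical pages, used only to show the shape of the output.
    pages = [
        ("https://www.example.com/", date(2024, 1, 15)),
        ("https://www.example.com/about", None),
    ]
    print(build_sitemap(pages).decode("utf-8"))
```

ElementTree escapes characters such as & in URLs automatically, which the protocol requires. A site with more than 50,000 URLs would call chunk first, write one sitemap file per chunk, and list those files with build_sitemap_index.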

Best practice: Only include canonical, indexable pages in your XML sitemap. Avoid including URLs with noindex tags, URLs that redirect, duplicate pages, or low-quality pages. Including such URLs sends mixed signals to search engines and reduces the effectiveness of the sitemap.
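
One way to apply this rule is to filter crawl data before generating the sitemap. The sketch below assumes a hypothetical Page record with fields for the HTTP status code, the noindex flag, and the declared canonical URL; these names are invented for illustration, and in practice the data would come from a site crawl or CMS export. Page quality cannot be judged mechanically this way, so only the other three conditions are checked.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Page:
    """Hypothetical crawl record; field names are assumptions for this sketch."""
    url: str
    status_code: int                      # HTTP status returned for the URL
    noindex: bool                         # True if the page carries a noindex directive
    canonical_url: Optional[str] = None   # declared canonical, if any


def is_sitemap_eligible(page: Page) -> bool:
    """Return True only for pages that are indexable and self-canonical."""
    if page.status_code != 200:
        return False  # excludes redirects (3xx) and errors (4xx, 5xx)
    if page.noindex:
        return False  # the page itself asks not to be indexed
    if page.canonical_url is not None and page.canonical_url != page.url:
        return False  # the page points to a different canonical URL
    return True


def sitemap_urls(pages):
    """Return eligible URLs in their original order, with duplicates removed."""
    seen = set()
    urls = []
    for page in pages:
        if is_sitemap_eligible(page) and page.url not in seen:
            seen.add(page.url)
            urls.append(page.url)
    return urls
```

The resulting list contains only URLs that return a 200 status, are not marked noindex, and are their own canonical, which matches the first three exclusions named above.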

Why It Matters for SEO

XML sitemaps are a core technical SEO tool with clear practical benefits:

  • Helps search engines discover new or updated pages faster, particularly for large sites
  • Helps crawlers find deep pages that are not well connected by internal links
  • Particularly useful for new sites with few external links, although a sitemap is never required and does not guarantee that every page will be indexed
  • Google Search Console reports, for each submitted sitemap, which URLs were indexed, which were discovered but not yet crawled, and which were excluded
  • Specialized sitemaps (image, video, news) help rich media content appear in relevant Google search features