The Complete XML Sitemap Guide: Creation, Validation, and Best Practices
Everything you need to know about XML sitemaps. Covers format, creation methods, sitemap types, validation, submission to search engines, and best practices.
What Is a Sitemap?
A sitemap is a file that lists the pages on your website so search engines can find, crawl, and index them more efficiently. Think of it as a table of contents for your site, written in a language that Google, Bing, and other crawlers understand natively.
There are two distinct types of sitemaps, and they serve very different audiences.
XML Sitemaps
An XML sitemap is a structured file designed for search engines. It lives at a URL like https://example.com/sitemap.xml and contains a machine-readable list of every page you want indexed. Each entry can include metadata like the last modification date, how often the page changes, and its relative importance on your site.
This is the type that matters for SEO. When people say "sitemap" in the context of search engine optimization, they almost always mean an XML sitemap.
HTML Sitemaps
An HTML sitemap is a regular web page, built for humans, that links to every major section or page on your site. It helps visitors navigate large sites. You have probably seen them in website footers, usually labelled "Site Map."
HTML sitemaps can indirectly help SEO by improving internal linking, but they are not a substitute for an XML sitemap. For a deeper comparison, see our HTML vs XML sitemap breakdown.
Why the Distinction Matters
Search engines rely on XML sitemaps to discover URLs they might otherwise miss. Large sites, new sites, sites with orphan pages, or pages behind complex JavaScript rendering all benefit from having an XML sitemap. An HTML sitemap alone will not cut it.
For the rest of this guide, "sitemap" refers to an XML sitemap unless stated otherwise. If you are just getting started, our what is a sitemap article covers the basics.
How Search Engines Use Sitemaps
Search engines do not blindly index every URL in your sitemap. A sitemap is a hint, not a directive [1]. Here is what actually happens when a crawler encounters your sitemap.
Discovery
Crawlers find your sitemap in one of three ways:
- robots.txt directive. A Sitemap: line in your robots.txt file points crawlers to your sitemap URL. This is the most common and recommended method.
- Search console submission. You manually submit the sitemap URL through Google Search Console or Bing Webmaster Tools.
- Direct crawling. Some crawlers check common paths like /sitemap.xml or /sitemap_index.xml automatically.
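As a rough illustration of the robots.txt route, the sketch below reads Sitemap: lines out of a robots.txt body using Python's standard-library parser. The sample robots.txt content and the helper name sitemaps_from_robots are assumptions made for this example, and real crawlers may apply their own parsing rules.

```python
from urllib.robotparser import RobotFileParser


def sitemaps_from_robots(robots_txt: str) -> list[str]:
    """Return the sitemap URLs declared in a robots.txt body."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    # site_maps() (Python 3.8+) returns None when no Sitemap: line exists
    return parser.site_maps() or []


# Hypothetical robots.txt content for illustration
ROBOTS_TXT = """\
User-agent: *
Allow: /

Sitemap: https://example.com/sitemap.xml
Sitemap: https://example.com/sitemap-blog.xml
"""

for sitemap_url in sitemaps_from_robots(ROBOTS_TXT):
    print(sitemap_url)
```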
Crawl Scheduling
Once a search engine knows about your sitemap, it uses the metadata (especially lastmod dates) to decide which pages to crawl first and how often to revisit them. Pages with recent lastmod values get prioritized. Pages that have not changed may be skipped for a while.
Indexing Decisions
Having a URL in your sitemap does not guarantee indexing. Google still evaluates page quality, duplicate content, crawl budget, and whether the page is blocked by robots.txt or noindex tags. But a sitemap ensures Google at least knows the URL exists, which is half the battle for large or complex sites.
Crawl Budget Considerations
Every site gets a limited crawl budget, the number of pages a search engine will crawl in a given time window. For small sites, this is rarely a concern. For sites with hundreds of thousands or millions of pages, crawl budget management becomes critical.
A well-maintained sitemap helps search engines allocate crawl budget efficiently. By listing only the pages you actually want indexed, you signal which URLs deserve attention. Without a sitemap, crawlers spend time discovering and evaluating pages you might not care about (faceted navigation pages, internal search results, expired promotions).
This is especially relevant for e-commerce sites where product pages are constantly being added, updated, and removed. Your sitemap becomes the single source of truth for which URLs are worth crawling right now.
What Sitemaps Cannot Do
A sitemap will not:
- Force a page to rank higher
- Override a noindex tag or robots.txt disallow rule
- Fix thin or duplicate content issues
- Compensate for poor site architecture
- Speed up indexing of low-quality pages
Sitemaps work best as part of a broader SEO strategy. They are a discovery mechanism, not a ranking factor. For more on how sitemaps and SEO intersect, read our sitemap for SEO guide.
XML Sitemap Format and Structure
The XML sitemap protocol is defined at sitemaps.org [2] and supported by Google, Bing, Yahoo, and other major search engines. The format is straightforward.
Basic Structure
Every XML sitemap starts with an XML declaration and a <urlset> root element that references the sitemap namespace:
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>https://example.com/</loc>
    <lastmod>2026-04-10</lastmod>
    <changefreq>weekly</changefreq>
    <priority>1.0</priority>
  </url>
  <url>
    <loc>https://example.com/about</loc>
    <lastmod>2026-03-15</lastmod>
    <changefreq>monthly</changefreq>
    <priority>0.8</priority>
  </url>
</urlset>
Each <url> element represents one page. Let's break down the tags.
Required Tags
<loc> is the only strictly required tag. It contains the full, absolute URL of the page.
<loc>https://example.com/products/widget</loc>
Rules for <loc>:
- Must be a fully qualified URL (including the protocol)
- Must match the canonical version of the URL
- Must be under 2,048 characters
- Must be properly encoded (spaces become %20, ampersands become &amp;)
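These rules are easy to check mechanically before a URL goes into the file. The following is a minimal sketch, assuming a hypothetical helper named loc_problems. It covers only the structural rules (absolute URL, length, unencoded whitespace) and does not attempt to verify canonical status, which depends on your own site.

```python
from urllib.parse import urlsplit

MAX_LOC_LENGTH = 2048  # <loc> values must be shorter than this


def loc_problems(url: str) -> list[str]:
    """Return a list of rule violations for a candidate <loc> value."""
    problems = []
    parts = urlsplit(url)
    if parts.scheme not in ("http", "https") or not parts.netloc:
        problems.append("not a fully qualified http(s) URL")
    if len(url) >= MAX_LOC_LENGTH:
        problems.append("2,048 characters or longer")
    if any(ch.isspace() for ch in url):
        problems.append("contains unencoded whitespace")
    return problems


for candidate in (
    "https://example.com/products/widget",
    "/products/widget",
    "https://example.com/my page",
):
    print(candidate, "->", loc_problems(candidate) or "ok")
```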
Optional Tags
<lastmod> indicates when the page was last modified. Use W3C Datetime format:
<lastmod>2026-04-10T14:30:00+00:00</lastmod>
<!-- or simply -->
<lastmod>2026-04-10</lastmod>
Google has stated that lastmod is the most valuable optional tag, provided you keep it accurate [3]. If you set lastmod to the current date every time you regenerate your sitemap (without actual content changes), Google will learn to ignore it.
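One way to keep lastmod honest is to derive it from a stored modification timestamp rather than the time the sitemap was generated. The sketch below formats such a timestamp in the two W3C Datetime forms shown above. The function name format_lastmod and the treatment of naive timestamps as UTC are assumptions for illustration.

```python
from datetime import datetime, timezone


def format_lastmod(updated_at: datetime, date_only: bool = False) -> str:
    """Format a content-modification timestamp as W3C Datetime."""
    if updated_at.tzinfo is None:
        # Assumption: naive timestamps in the data store are UTC
        updated_at = updated_at.replace(tzinfo=timezone.utc)
    if date_only:
        return updated_at.date().isoformat()
    # Drop microseconds so the value stays compact
    return updated_at.replace(microsecond=0).isoformat()


# The timestamp would normally come from your CMS or database
updated = datetime(2026, 4, 10, 14, 30, tzinfo=timezone.utc)
print(format_lastmod(updated))                  # 2026-04-10T14:30:00+00:00
print(format_lastmod(updated, date_only=True))  # 2026-04-10
```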
<changefreq> suggests how often the page content changes. Valid values are: always, hourly, daily, weekly, monthly, yearly, never.
<changefreq>weekly</changefreq>
Google has publicly stated it ignores changefreq [3]. Bing also gives it very low weight. You can include it for completeness, but do not rely on it.
<priority> is a value from 0.0 to 1.0 indicating the page's importance relative to other pages on the same site. The default is 0.5.
<priority>0.8</priority>
Like changefreq, Google ignores priority [3]. It was a good idea in theory, but site owners inevitably set everything to 1.0, making the signal useless.
For a detailed breakdown, see our guide on sitemap priority and changefreq.
Focus your energy on accurate lastmod dates. That is the one optional tag search engines actually pay attention to.
Character Encoding
Your sitemap must be UTF-8 encoded. Certain characters need entity escaping in XML:
| Character | Escape |
|-----------|--------|
| & | &amp; |
| ' | &apos; |
| " | &quot; |
| > | &gt; |
| < | &lt; |
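If you assemble sitemap XML as strings rather than through an XML library, the escaping can be delegated to Python's standard library. This is a small sketch, with xml_escape as a hypothetical helper name. Most XML libraries escape text nodes automatically, so it is only needed for hand-built output.

```python
from xml.sax.saxutils import escape

# escape() handles &, < and > by default; quotes are added explicitly
EXTRA_ENTITIES = {"'": "&apos;", '"': "&quot;"}


def xml_escape(value: str) -> str:
    """Entity-escape a value for use inside a sitemap element."""
    return escape(value, EXTRA_ENTITIES)


print(xml_escape("https://example.com/search?a=1&b=2"))
# https://example.com/search?a=1&amp;b=2
```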
For more real-world examples, check our XML sitemap examples collection.
Sitemap Types
The sitemap protocol supports several specialized extensions beyond standard page URLs.
Standard Sitemap
The format covered above. Lists regular HTML pages with loc, lastmod, changefreq, and priority. This is what most sites need.
Image Sitemap
Image sitemaps help search engines discover images, particularly useful when images are loaded via JavaScript or CSS and might not be found through normal crawling.
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
        xmlns:image="http://www.google.com/schemas/sitemap-image/1.1">
  <url>
    <loc>https://example.com/products/widget</loc>
    <image:image>
      <image:loc>https://example.com/images/widget-front.jpg</image:loc>
      <image:title>Widget front view</image:title>
    </image:image>
    <image:image>
      <image:loc>https://example.com/images/widget-side.jpg</image:loc>
      <image:title>Widget side view</image:title>
    </image:image>
  </url>
</urlset>
You can include up to 1,000 images per page entry [1].
Video Sitemap
Video sitemaps provide metadata about video content hosted on your pages. This helps Google display rich video results in search.
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
        xmlns:video="http://www.google.com/schemas/sitemap-video/1.1">
  <url>
    <loc>https://example.com/videos/tutorial</loc>
    <video:video>
      <video:thumbnail_loc>https://example.com/thumbs/tutorial.jpg</video:thumbnail_loc>
      <video:title>Widget Setup Tutorial</video:title>
      <video:description>Learn how to set up your widget in 5 minutes.</video:description>
      <video:content_loc>https://example.com/video/tutorial.mp4</video:content_loc>
      <video:duration>300</video:duration>
    </video:video>
  </url>
</urlset>
News Sitemap
If your site is registered with Google News, a news sitemap helps articles appear in news results. News sitemaps should only include articles published in the last 48 hours.
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
        xmlns:news="http://www.google.com/schemas/sitemap-news/0.9">
  <url>
    <loc>https://example.com/news/breaking-story</loc>
    <news:news>
      <news:publication>
        <news:name>Example News</news:name>
        <news:language>en</news:language>
      </news:publication>
      <news:publication_date>2026-04-12T08:00:00+00:00</news:publication_date>
      <news:title>Breaking: Major Discovery Announced</news:title>
    </news:news>
  </url>
</urlset>
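The 48-hour rule is usually enforced when the news sitemap is generated. A minimal sketch of that filter follows. The article dictionaries, their "published" key, and the recent_articles helper are hypothetical stand-ins for whatever your publishing system exposes.

```python
from datetime import datetime, timedelta, timezone

NEWS_WINDOW = timedelta(hours=48)


def recent_articles(articles, now=None):
    """Keep only articles published within the last 48 hours."""
    now = now or datetime.now(timezone.utc)
    cutoff = now - NEWS_WINDOW
    return [article for article in articles if article["published"] >= cutoff]


# Hypothetical article records; timestamps must be timezone-aware
now = datetime(2026, 4, 12, 12, 0, tzinfo=timezone.utc)
articles = [
    {"url": "https://example.com/news/breaking-story",
     "published": datetime(2026, 4, 12, 8, 0, tzinfo=timezone.utc)},
    {"url": "https://example.com/news/last-week",
     "published": datetime(2026, 4, 5, 8, 0, tzinfo=timezone.utc)},
]

for article in recent_articles(articles, now=now):
    print(article["url"])
```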
Sitemap Index
When your site has more than 50,000 URLs (or your sitemap file exceeds 50MB), you split it into multiple sitemaps and reference them from a sitemap index file. We cover this in detail in the next section.
For a deeper look at image and video sitemaps specifically, see our video and image sitemaps resource.
Sitemap Index Files
Most small-to-medium sites will never need a sitemap index. But once your sitemap grows past the protocol limits, splitting becomes mandatory.
Size Limits
The sitemaps.org protocol defines two hard limits per sitemap file [2]:
- 50,000 URLs maximum per sitemap file
- 50MB maximum uncompressed file size per sitemap file
If you hit either limit, you need to split your URLs across multiple sitemap files and tie them together with a sitemap index.
Sitemap Index Format
A sitemap index file lists individual sitemap files instead of page URLs:
<?xml version="1.0" encoding="UTF-8"?>
<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <sitemap>
    <loc>https://example.com/sitemap-pages.xml</loc>
    <lastmod>2026-04-10</lastmod>
  </sitemap>
  <sitemap>
    <loc>https://example.com/sitemap-products.xml</loc>
    <lastmod>2026-04-12</lastmod>
  </sitemap>
  <sitemap>
    <loc>https://example.com/sitemap-blog.xml</loc>
    <lastmod>2026-04-11</lastmod>
  </sitemap>
</sitemapindex>
Practical Splitting Strategies
You do not have to wait until you hit 50,000 URLs to start splitting. Many sites split sitemaps by content type from the start:
- sitemap-pages.xml for static pages
- sitemap-products.xml for product pages
- sitemap-blog.xml for blog posts
- sitemap-categories.xml for category and tag pages
This makes debugging easier. If product pages are not getting indexed, you can check sitemap-products.xml in isolation.
Rules and Gotchas
- A sitemap index can reference up to 50,000 individual sitemaps [2].
- You cannot nest sitemap indexes (a sitemap index cannot point to another sitemap index).
- All sitemaps referenced in the index must be on the same host as the index file, or on a host that the index file's host is authorized to reference via Search Console.
- You only need to submit or reference the sitemap index URL. Search engines will find the individual sitemaps from there.
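To make the splitting mechanics concrete, here is a sketch that chunks a URL list into files of at most 50,000 entries and builds a matching sitemap index with the standard library. The file naming scheme (sitemap-products-N.xml), the generated product URLs, and the helper names are assumptions for illustration. It only checks the URL count, so a production version would also check the 50MB size limit.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
MAX_URLS_PER_FILE = 50_000


def chunk_urls(urls, size=MAX_URLS_PER_FILE):
    """Yield consecutive slices of at most `size` URLs."""
    for start in range(0, len(urls), size):
        yield urls[start:start + size]


def build_urlset(urls) -> bytes:
    """Serialize one sitemap file containing only <loc> entries."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in urls:
        ET.SubElement(ET.SubElement(urlset, "url"), "loc").text = url
    return ET.tostring(urlset, encoding="utf-8", xml_declaration=True)


def build_index(sitemap_urls) -> bytes:
    """Serialize a sitemap index that points at the given sitemap files."""
    index = ET.Element("sitemapindex", xmlns=SITEMAP_NS)
    for sitemap_url in sitemap_urls:
        ET.SubElement(ET.SubElement(index, "sitemap"), "loc").text = sitemap_url
    return ET.tostring(index, encoding="utf-8", xml_declaration=True)


# Hypothetical catalogue of 120,000 product URLs
urls = [f"https://example.com/products/{i}" for i in range(120_000)]
files = {
    f"sitemap-products-{n}.xml": build_urlset(chunk)
    for n, chunk in enumerate(chunk_urls(urls), start=1)
}
index_xml = build_index(f"https://example.com/{name}" for name in files)
print(len(files), "sitemap files referenced from the index")
```

In a real deployment each entry in files would be written to disk or object storage, and only the index URL would be submitted.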
Our sitemap index explained article covers edge cases and advanced patterns.
How to Create a Sitemap
There are four main approaches to creating an XML sitemap. The right one depends on your site's size, platform, and how often content changes.
Manual Creation
For very small, static sites (under 20 pages), writing XML by hand is perfectly viable. Copy the basic structure from the format section above, list your URLs, and save it as sitemap.xml in your site's root directory.
This approach does not scale. The moment you add or remove pages regularly, manual maintenance becomes a liability.
CMS Plugins
If you use a content management system, a plugin is usually the easiest path.
WordPress: Yoast SEO and Rank Math both generate XML sitemaps automatically. WordPress itself has included a built-in sitemap at /wp-sitemap.xml since version 5.5, though most SEO plugins override it with a more feature-rich version. See our WordPress sitemap guide for setup details.
Shopify: Shopify auto-generates a sitemap at /sitemap.xml for every store. It includes products, collections, blogs, and pages. You cannot directly edit it, but it updates automatically when you publish or unpublish content. Our Shopify sitemap guide explains the specifics and workarounds.
Other CMS platforms: Squarespace, Wix, Webflow, and most modern CMS platforms generate sitemaps automatically. Check your platform's SEO settings.
Online Sitemap Generators
Sitemap generators crawl your site and produce a sitemap file you can download and upload to your server. This works well for static sites or as a one-time setup.
The tradeoff: generated sitemaps are snapshots. They go stale the moment you add new content. For sites that change frequently, an automated solution is better.
To understand what these tools do under the hood, read what is a sitemap generator.
Programmatic Generation
For custom-built sites, generating your sitemap from code gives you the most control.
Next.js example using the App Router:
// app/sitemap.ts
import type { MetadataRoute } from 'next';

export default async function sitemap(): Promise<MetadataRoute.Sitemap> {
  const posts = await getAllPosts(); // your data source
  const postEntries = posts.map((post) => ({
    url: `https://example.com/blog/${post.slug}`,
    lastModified: new Date(post.updatedAt),
    changeFrequency: 'weekly' as const,
    priority: 0.7,
  }));

  return [
    {
      url: 'https://example.com',
      // Prefer the homepage's real modification date if you track one
      lastModified: new Date(),
      changeFrequency: 'daily',
      priority: 1.0,
    },
    ...postEntries,
  ];
}
Python example using lxml:
from lxml import etree

NAMESPACE = "http://www.sitemaps.org/schemas/sitemap/0.9"
nsmap = {None: NAMESPACE}  # None declares the default (unprefixed) namespace

urlset = etree.Element("urlset", nsmap=nsmap)

pages = [
    {"loc": "https://example.com/", "lastmod": "2026-04-10", "priority": "1.0"},
    {"loc": "https://example.com/about", "lastmod": "2026-03-15", "priority": "0.8"},
]

# One <url> element per page, with one child element per metadata field
for page in pages:
    url_el = etree.SubElement(urlset, "url")
    for tag, value in page.items():
        child = etree.SubElement(url_el, tag)
        child.text = value

tree = etree.ElementTree(urlset)
tree.write("sitemap.xml", xml_declaration=True, encoding="UTF-8", pretty_print=True)
Node.js example:
const { SitemapStream } = require('sitemap');
const { createWriteStream } = require('fs');

const links = [
  { url: '/', lastmod: '2026-04-10', changefreq: 'daily', priority: 1.0 },
  { url: '/about', lastmod: '2026-03-15', changefreq: 'monthly', priority: 0.8 },
  { url: '/blog', lastmod: '2026-04-12', changefreq: 'daily', priority: 0.9 },
];

// Relative URLs are resolved against the hostname
const stream = new SitemapStream({ hostname: 'https://example.com' });
const writeStream = createWriteStream('./public/sitemap.xml');
stream.pipe(writeStream);
links.forEach((link) => stream.write(link));
stream.end();
For dynamic frameworks, check our dynamic sitemaps guide. For a broader overview of creation methods, see how to create a sitemap and how to create an XML sitemap.
Submitting Your Sitemap to Search Engines
Creating a sitemap is only half the job. You need to tell search engines where to find it.
Google Search Console
- Sign in to Google Search Console.
- Select your property.
- Navigate to Sitemaps in the left sidebar (under "Indexing").
- Enter your sitemap URL (e.g., https://example.com/sitemap.xml).
- Click Submit.
Google will report the status: success, errors, or warnings. It will also show how many URLs were discovered and how many are indexed over time.
After submission, Google re-fetches your sitemap periodically. You do not need to resubmit every time you update it.
Bing Webmaster Tools
- Sign in to Bing Webmaster Tools.
- Select your site.
- Go to Sitemaps under "Configure My Site."
- Enter your sitemap URL and click Submit.
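Bing historically accepted anonymous sitemap submission via a simple HTTP request [4]:
GET https://www.bing.com/ping?sitemap=https://example.com/sitemap.xml
Bing announced the retirement of this anonymous ping endpoint in 2022, so treat it as legacy. For Bing, rely on Webmaster Tools submission and the robots.txt Sitemap directive instead.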
The robots.txt Sitemap Directive
This is the simplest and most universal method. Add a Sitemap line to your robots.txt file:
User-agent: *
Allow: /
Sitemap: https://example.com/sitemap.xml
Every major search engine reads this directive. It works for search engines you have not explicitly submitted to, and it persists without any account or dashboard setup.
You can list multiple sitemaps:
Sitemap: https://example.com/sitemap-pages.xml
Sitemap: https://example.com/sitemap-blog.xml
For the full story on this approach, check robots.txt and sitemap and the detailed guide on how to add a sitemap to robots.txt over at RobotsTxtTest.
Use both methods: submit through Search Console for monitoring and error reporting, and add the robots.txt directive for broad discovery.
Our step-by-step walkthrough covers the entire process: how to submit a sitemap to Google.
Sitemap Validation and Testing
A malformed sitemap can quietly prevent search engines from reading your URLs. Validation catches these problems before they cost you traffic.
What to Validate
- Well-formed XML. Missing closing tags, incorrect nesting, or invalid characters will break the entire file.
- Correct namespace. The <urlset> must reference http://www.sitemaps.org/schemas/sitemap/0.9.
- Valid URLs. Every <loc> should return a 200 status code. No 404s, no redirects, no 500 errors.
- Protocol compliance. URLs should match your canonical protocol (HTTPS if that is what you use).
- File size. Under 50MB uncompressed, under 50,000 URLs.
- Consistent lastmod format. Use W3C Datetime consistently (either YYYY-MM-DD or the full ISO 8601 format).
Validation Methods
Google Search Console is the most authoritative validator. After submitting your sitemap, it reports errors directly. Common reports include "URL is not on this property," "URL blocked by robots.txt," and parsing errors.
Online validators check XML syntax and protocol compliance without needing Search Console access. These are useful during development or for quick checks.
Command-line validation using xmllint works well in CI/CD pipelines:
xmllint --noout --schema sitemap.xsd sitemap.xml
Automated monitoring catches regressions. Set up a scheduled check that fetches your sitemap, validates the XML, and verifies that key URLs return 200 status codes.
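A scheduled check along these lines can be sketched with the standard library alone. The validate_sitemap function below is a hypothetical helper covering well-formedness, namespace, the size and URL-count limits, and lastmod format. The regular expression accepts only the two lastmod forms used in this guide, not every variant W3C Datetime allows, and HTTP status checks are left out here.

```python
import re
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
MAX_BYTES = 50 * 1024 * 1024
MAX_URLS = 50_000
# YYYY-MM-DD, optionally followed by a time and a UTC offset (or Z)
LASTMOD_RE = re.compile(
    r"^\d{4}-\d{2}-\d{2}(T\d{2}:\d{2}(:\d{2}(\.\d+)?)?(Z|[+-]\d{2}:\d{2}))?$"
)


def validate_sitemap(xml_bytes: bytes) -> list[str]:
    """Return a list of problems found in a standard (urlset) sitemap."""
    try:
        root = ET.fromstring(xml_bytes)
    except ET.ParseError as err:
        return [f"not well-formed XML: {err}"]
    if root.tag != f"{{{SITEMAP_NS}}}urlset":
        return [f"unexpected root element or namespace: {root.tag}"]

    errors = []
    if len(xml_bytes) > MAX_BYTES:
        errors.append("file exceeds 50MB uncompressed")
    urls = root.findall(f"{{{SITEMAP_NS}}}url")
    if len(urls) > MAX_URLS:
        errors.append("more than 50,000 URLs")
    for url in urls:
        loc = url.findtext(f"{{{SITEMAP_NS}}}loc", default="").strip()
        if not loc:
            errors.append("<url> entry without a <loc>")
        lastmod = url.findtext(f"{{{SITEMAP_NS}}}lastmod")
        if lastmod is not None and not LASTMOD_RE.match(lastmod.strip()):
            errors.append(f"unexpected lastmod format for {loc}: {lastmod.strip()}")
    return errors


SAMPLE = b"""<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/</loc><lastmod>2026-04-10</lastmod></url>
</urlset>"""

print(validate_sitemap(SAMPLE) or "no problems found")
```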
For a full comparison of validation tools, see sitemap checker tools compared and our sitemap validator guide.
Common Sitemap Errors and Fixes
These are the issues that come up again and again. Most are easy to fix once you know what to look for.
URLs Returning Non-200 Status Codes
Your sitemap should only contain URLs that return a 200 response. Including URLs that 301 redirect, 404, or 500 wastes crawl budget and signals poor site maintenance.
Fix: Audit your sitemap URLs regularly. Remove any that redirect or error out. If a page has moved permanently, update the <loc> to the new URL rather than pointing to the old one. Be aware that redirect chains in sitemaps are a particularly bad signal.
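A regular audit of this kind might look like the sketch below, which requests each URL without following redirects so that 301s show up as such. The helper names are assumptions. Some servers reject HEAD requests, and network failures (which raise urllib.error.URLError) are not handled here, so treat it as a starting point rather than a finished tool.

```python
import urllib.error
import urllib.request


class _NoRedirect(urllib.request.HTTPRedirectHandler):
    """Stop urllib from following redirects so 3xx codes stay visible."""

    def redirect_request(self, req, fp, code, msg, headers, newurl):
        return None  # the 3xx response is then raised as an HTTPError


def http_status(url: str, timeout: float = 10.0) -> int:
    """Return the HTTP status code for a HEAD request to `url`."""
    opener = urllib.request.build_opener(_NoRedirect)
    request = urllib.request.Request(url, method="HEAD")
    try:
        with opener.open(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        return err.code


def non_200_urls(urls, get_status=http_status) -> dict:
    """Map every URL that does not answer 200 to its status code."""
    results = {}
    for url in urls:
        status = get_status(url)
        if status != 200:
            results[url] = status
    return results


# Offline demonstration with canned statuses instead of real requests
canned = {
    "https://example.com/": 200,
    "https://example.com/old-page": 301,
    "https://example.com/gone": 404,
}
print(non_200_urls(canned, get_status=canned.get))
```

Passing the status lookup as a parameter keeps the audit logic testable without network access. In production you would call non_200_urls(urls) and let it use http_status.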
Protocol Mismatch (HTTP vs HTTPS)
If your site runs on HTTPS but your sitemap lists HTTP URLs (or vice versa), search engines may treat them as different pages or ignore them.
Fix: Every URL in your sitemap must use the same protocol as your canonical URLs. If you have migrated to HTTPS, update every <loc> entry.
Including Noindex or Blocked URLs
Listing a URL in your sitemap while simultaneously blocking it with noindex or robots.txt sends mixed signals. Google will flag this as a conflict.
Fix: If you do not want a page indexed, remove it from the sitemap. Do not rely on the sitemap to override noindex/disallow directives (it cannot).
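The robots.txt half of this conflict can be caught before publishing. The sketch below uses Python's standard-library robots.txt parser to list sitemap URLs that a given crawler would be disallowed from fetching. The sample rules, URLs, and the blocked_by_robots helper are assumptions. Note that this parser does simple prefix matching and does not implement the wildcard extensions some search engines support, and it says nothing about noindex tags, which require fetching the page itself.

```python
from urllib.robotparser import RobotFileParser


def blocked_by_robots(robots_txt: str, urls, user_agent: str = "Googlebot") -> list[str]:
    """Return the URLs that robots.txt disallows for `user_agent`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [url for url in urls if not parser.can_fetch(user_agent, url)]


# Hypothetical rules and sitemap URLs for illustration
ROBOTS_RULES = """\
User-agent: *
Disallow: /search
"""
sitemap_urls = [
    "https://example.com/about",
    "https://example.com/search?q=widgets",
]

for url in blocked_by_robots(ROBOTS_RULES, sitemap_urls):
    print("listed in sitemap but blocked:", url)
```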
Stale lastmod Dates
Setting every lastmod to today's date, or never updating lastmod after content changes, teaches search engines to ignore the field entirely for your site.
Fix: Only update lastmod when the page content actually changes. Automate this from your CMS or database timestamps.
Incorrect XML Encoding
Unescaped ampersands, angle brackets, or other special characters in URLs break XML parsing. The entire sitemap becomes unreadable.
Fix: Entity-encode all special characters. & becomes &amp;, so a query string like ?a=1&b=2 becomes ?a=1&amp;b=2.
File Too Large
Exceeding the 50MB or 50,000 URL limits means the sitemap will be partially or fully ignored.
Fix: Split into multiple files and use a sitemap index.
Wrong Namespace
Using an incorrect or missing namespace URL in the <urlset> element causes parsing failures.
Fix: Always use xmlns="http://www.sitemaps.org/schemas/sitemap/0.9" for the main namespace.
For a complete troubleshooting reference, see sitemap errors and fixes. To check your own sitemap for these issues, read how to check a sitemap.
Platform-Specific Guides
WordPress (Yoast SEO)
Yoast SEO generates an XML sitemap index at https://yoursite.com/sitemap_index.xml. It automatically splits sitemaps by post type (posts, pages, categories) and respects noindex settings.
Key settings:
- Go to Yoast SEO > General > Features and ensure "XML sitemaps" is toggled on.
- To exclude specific post types or taxonomies, go to Yoast SEO > Search Appearance and set unwanted types to "No" for "Show in search results." They will be excluded from the sitemap automatically.
- To exclude individual pages, edit the page and set it to noindex in the Yoast meta box.
Yoast handles lastmod automatically based on post modification dates. It does not include changefreq or priority because Google ignores them.
For full WordPress coverage, see our WordPress sitemap guide.
Shopify
Shopify generates your sitemap automatically at https://yourstore.com/sitemap.xml. This is actually a sitemap index pointing to individual sitemaps for products, collections, blogs, and pages.
Limitations to be aware of:
- You cannot manually edit the sitemap.
- Shopify does not include metaobject pages or custom landing pages built with some page builders.
- Unpublished products are automatically excluded.
- The sitemap updates when content is published or unpublished, but there can be a slight delay.
If you need to include URLs that Shopify misses, some apps and workarounds exist. Our Shopify sitemap guide details those options.
Next.js
Next.js (App Router, version 13+) supports sitemap generation natively.
Static sitemap: Create app/sitemap.xml/route.ts or simply export a sitemap() function from app/sitemap.ts (as shown in the code example above).
Dynamic sitemap with generateSitemaps: For sites with many pages, Next.js supports splitting into multiple sitemaps:
// app/sitemap.ts
export async function generateSitemaps() {
  const totalProducts = await getProductCount();
  const sitemapCount = Math.ceil(totalProducts / 50000);
  return Array.from({ length: sitemapCount }, (_, i) => ({ id: i }));
}

export default async function sitemap({ id }: { id: number }) {
  const start = id * 50000;
  const products = await getProducts({ start, limit: 50000 });
  return products.map((product) => ({
    url: `https://example.com/products/${product.slug}`,
    lastModified: product.updatedAt,
  }));
}
Next.js handles the sitemap index generation automatically when you use generateSitemaps. Our dynamic sitemaps guide goes deeper on framework-specific patterns.
Sitemap Best Practices and SEO
These practices come from years of collective experience and direct guidance from search engine documentation.
Only Include Canonical, Indexable URLs
Every URL in your sitemap should be the canonical version of that page and should return a 200 status code. Do not include:
- URLs with query parameters that create duplicate content
- Non-canonical URL variants
- Paginated pages (unless each page has unique content worth indexing)
- URLs blocked by robots.txt or marked noindex
Keep lastmod Accurate
This bears repeating. Accurate lastmod dates are the single most impactful thing you can do with your sitemap after listing the correct URLs. Tie lastmod to actual content changes in your database or CMS.
Use HTTPS URLs
If your site supports HTTPS (it should), use HTTPS URLs exclusively in your sitemap. Do not mix protocols.
Place Your Sitemap at the Root
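Convention is https://example.com/sitemap.xml. The protocol allows sitemaps at any path, but under the sitemaps.org rules a sitemap may only list URLs at or below its own directory, so a root-level file is the simplest way to cover the whole site [2]. The root is also the first place people and some crawlers look. Reference it from robots.txt regardless of where it lives.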
Compress Large Sitemaps
You can gzip your sitemap files. Search engines handle .xml.gz files without issue. This reduces bandwidth and transfer time, especially for large sitemaps.
Sitemap: https://example.com/sitemap.xml.gz
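Compression is a one-liner in most languages. A minimal Python sketch follows, with gzip_sitemap as a hypothetical helper and a generated sample sitemap standing in for real content. Keep in mind that the 50MB limit applies to the uncompressed size.

```python
import gzip


def gzip_sitemap(xml_bytes: bytes) -> bytes:
    """Compress sitemap XML for serving as sitemap.xml.gz."""
    return gzip.compress(xml_bytes)


# Generated sample content for illustration
entries = "".join(
    f"<url><loc>https://example.com/page-{i}</loc></url>" for i in range(1000)
)
xml_bytes = (
    '<?xml version="1.0" encoding="UTF-8"?>'
    '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">'
    f"{entries}</urlset>"
).encode("utf-8")

compressed = gzip_sitemap(xml_bytes)
print(len(xml_bytes), "bytes uncompressed,", len(compressed), "bytes gzipped")
```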
Handle Multilingual Sites with Hreflang
For sites serving content in multiple languages or targeting multiple regions, include hreflang annotations in your sitemap:
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
        xmlns:xhtml="http://www.w3.org/1999/xhtml">
  <url>
    <loc>https://example.com/page</loc>
    <xhtml:link rel="alternate" hreflang="en" href="https://example.com/page" />
    <xhtml:link rel="alternate" hreflang="es" href="https://example.com/es/page" />
    <xhtml:link rel="alternate" hreflang="fr" href="https://example.com/fr/page" />
  </url>
</urlset>
This is an alternative to placing hreflang tags in your HTML <head>. For large multilingual sites, the sitemap approach is often more manageable. See the detailed guide on hreflang in XML sitemaps at HreflangGenerator, and our own hreflang sitemaps guide.
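Because every language version needs its own <url> entry that lists all alternates (itself included), hreflang sitemaps are tedious to write by hand and are usually generated. The sketch below shows one way to do that with the standard library. The build_hreflang_sitemap helper and the input shape (one dictionary of language code to URL per page) are assumptions for illustration.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
XHTML_NS = "http://www.w3.org/1999/xhtml"

# Serialize with the same prefixes as the example above (process-wide setting)
ET.register_namespace("", SITEMAP_NS)
ET.register_namespace("xhtml", XHTML_NS)


def build_hreflang_sitemap(groups) -> bytes:
    """Build a sitemap from dicts mapping language code -> URL of one page."""
    urlset = ET.Element(f"{{{SITEMAP_NS}}}urlset")
    for alternates in groups:
        # One <url> entry per language version of the page
        for page_url in alternates.values():
            url_el = ET.SubElement(urlset, f"{{{SITEMAP_NS}}}url")
            ET.SubElement(url_el, f"{{{SITEMAP_NS}}}loc").text = page_url
            # Each entry lists every alternate, including itself
            for lang, href in alternates.items():
                ET.SubElement(
                    url_el,
                    f"{{{XHTML_NS}}}link",
                    rel="alternate",
                    hreflang=lang,
                    href=href,
                )
    return ET.tostring(urlset, encoding="utf-8", xml_declaration=True)


page_versions = {
    "en": "https://example.com/page",
    "es": "https://example.com/es/page",
    "fr": "https://example.com/fr/page",
}
hreflang_xml = build_hreflang_sitemap([page_versions])
print(hreflang_xml.decode("utf-8"))
```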
Update Sitemaps When Content Changes
A sitemap that is generated once and never updated provides diminishing value over time. Automate your sitemap generation so new pages appear and removed pages disappear without manual intervention. Our guide on how to update a sitemap walks through the options.
Do Not Include Non-Canonical URL Variants
If you have both https://example.com/page and https://example.com/page/ (trailing slash variant), pick one as canonical and only include that version in your sitemap. The same applies to www vs. non-www variants, HTTP vs. HTTPS, and URLs with tracking parameters. Duplicate URLs in your sitemap dilute crawl budget and confuse indexing signals.
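Collapsing these variants is easiest to do in the generator itself. The following sketch normalizes scheme, host, trailing slash, and common tracking parameters, then removes duplicates. The canonical choices (HTTPS, non-www host example.com, no trailing slash) and the list of tracking parameters are assumptions for this example, and the function assumes every input URL belongs to the same site. Use whatever your site's canonical tags actually declare.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumed list of parameters that never change page content
TRACKING_PARAMS = {
    "utm_source", "utm_medium", "utm_campaign", "utm_term", "utm_content",
    "gclid", "fbclid",
}


def canonicalize(url: str, host: str = "example.com", trailing_slash: bool = False) -> str:
    """Rewrite a URL into the site's single canonical form."""
    parts = urlsplit(url)
    path = parts.path or "/"
    if path != "/":
        path = path.rstrip("/") + ("/" if trailing_slash else "")
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key not in TRACKING_PARAMS
    ]
    # Force HTTPS and the canonical host; drop any fragment
    return urlunsplit(("https", host, path, urlencode(kept), ""))


def dedupe(urls, **options) -> list[str]:
    """Canonicalize URLs and drop duplicates, keeping first-seen order."""
    return list(dict.fromkeys(canonicalize(url, **options) for url in urls))


variants = [
    "http://www.example.com/page/",
    "https://example.com/page?utm_source=newsletter",
    "https://example.com/page",
]
print(dedupe(variants))  # ['https://example.com/page']
```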
Keep Your Sitemap Lean
A common mistake is to include every URL the CMS generates, including tag pages, author archives, date-based archives, and paginated listing pages. Ask yourself: does this URL provide unique value to someone arriving from a search result? If not, leave it out. A sitemap with 500 high-quality URLs outperforms one with 50,000 URLs padded with thin archive pages.
Monitor Sitemap Health
Check your sitemap in Search Console periodically. Look for:
- Increasing error counts
- Submitted vs. indexed URL gaps
- Crawl date freshness
- Pages discovered but not yet indexed
Declining index coverage relative to submitted URLs can signal quality issues with your content or technical problems with your site. A sudden spike in errors after a deployment usually means a code change broke your sitemap generation.
For the full list of best practices, read our dedicated sitemap best practices article and the comprehensive sitemap SEO guide.
A sitemap is a living document. Treat it like code: automate generation, validate regularly, and monitor it in Search Console.
When You Don't Need a Sitemap
Not every site needs an XML sitemap. Google's own documentation acknowledges this [1]. Here are the cases where a sitemap is low priority or unnecessary.
Small Sites (Under 500 Pages)
If your site has fewer than 500 pages and they are all well-linked through your navigation, search engines will find every page through normal crawling. A sitemap adds minor value but is not critical.
Fully Linked Sites
If every page on your site is reachable within a few clicks from the homepage through internal links, crawlers can discover them without help. This is the ideal site architecture regardless of whether you use a sitemap.
No Search Traffic Goals
If your site is an internal tool, behind authentication, or simply does not need organic search traffic, skip the sitemap entirely.
Single-Page Applications (SPAs)
Ironically, SPAs and JavaScript-heavy sites are among the sites that benefit most from sitemaps. If your content is rendered client-side, crawlers may not execute JavaScript thoroughly enough to discover all your internal links. A sitemap bypasses the rendering problem entirely by giving crawlers a flat list of every URL.
When You Still Should
Even if your site is small, consider a sitemap if:
- Your site is brand new and has few external backlinks. New sites have no crawl history, so search engines have no reason to visit frequently. A sitemap jumpstarts the discovery process.
- You use JavaScript rendering that might prevent crawlers from discovering links.
- You have orphan pages (pages not linked from other pages on your site). These are invisible to crawlers without a sitemap.
- Your site has media content (images, videos) that you want indexed.
- You publish time-sensitive content and want fast discovery.
- You recently migrated domains or restructured URLs and need search engines to learn the new layout quickly.
The general rule: if you are unsure, create one. The overhead is minimal and there is no downside to having a sitemap. A small, accurate sitemap takes minutes to set up and can save you weeks of waiting for search engines to find new pages on their own.
Finding Existing Sitemaps
Before building a new sitemap, check if your site already has one. Common locations:
- https://yoursite.com/sitemap.xml
- https://yoursite.com/sitemap_index.xml
- https://yoursite.com/wp-sitemap.xml (WordPress default)
- Check your robots.txt for a Sitemap: directive
You can also look at what competitors are doing with their sitemaps for structural inspiration. Our guide on how to find the sitemap of a website covers all the methods.
Wrapping Up
An XML sitemap is one of the simplest, highest-leverage technical SEO tools available. The protocol is well-defined, the format is straightforward, and every major search engine supports it.
Here is what matters most:
- List only canonical, indexable URLs that return 200 status codes.
- Keep lastmod accurate so search engines trust your signals.
- Submit through Search Console and reference from robots.txt.
- Validate regularly to catch errors before they impact crawling.
- Automate generation so your sitemap stays current without manual work.
Whether you are managing a 10-page portfolio site or a million-page e-commerce store, the principles are the same. Start simple, automate early, and monitor consistently.
References
[1] Google Search Central. "Build and submit a sitemap." https://developers.google.com/search/docs/crawling-indexing/sitemaps/build-sitemap
[2] sitemaps.org. "Sitemaps XML Format." https://www.sitemaps.org/protocol.html
[3] Google Search Central. "Learn about sitemaps." https://developers.google.com/search/docs/crawling-indexing/sitemaps/overview
[4] Bing Webmaster Tools. "Bing Webmaster Guidelines." https://www.bing.com/webmasters/help/sitemaps-3b5cf6ed