Sitemap Strategy for E-Commerce Sites
Why e-commerce sites need a deliberate sitemap strategy. Covers product pages, category URLs, faceted navigation, out-of-stock handling, sitemap indexes, and lastmod accuracy.
E-commerce sites are some of the hardest to get right when it comes to sitemaps. A 20-page brochure site can get away with a basic XML file that never changes. An online store with 5,000 products, 200 categories, and endless filter combinations? That needs a plan.
The core problem is volume. Large catalogs produce thousands of URLs, and many of those URLs are duplicates or near-duplicates created by faceted navigation. Without a clear sitemap strategy, you end up pointing search engines at a mess of filter pages while your actual product and category pages get lost in the noise.
If you are new to sitemaps entirely, start with our complete XML sitemap guide and come back here for the e-commerce specifics.
Why E-Commerce Sites Need Sitemaps
Most small sites can rely on internal linking alone for discovery. Search engine crawlers follow links from your homepage, find your pages, and index them. That works when you have a flat structure with a few dozen pages.
E-commerce sites break that model in several ways.
Deep product pages. A product buried three or four clicks from the homepage might not get crawled regularly. If it is only reachable through a category listing that itself paginates across 50 pages, the crawler may never get there.
New products added frequently. If you add products daily or weekly, those new pages need to be discovered quickly. Waiting for a crawler to stumble across them through internal links can take days or weeks.
Crawl budget pressure. Google allocates a limited crawl budget to each site. If the crawler wastes time on filter URLs and paginated duplicates, your actual product pages get less attention. A sitemap tells the crawler exactly which pages matter.
Seasonal and promotional pages. Sale landing pages, holiday collections, and limited-time offers need fast indexing. A sitemap with accurate lastmod dates signals that these pages are new or recently updated.
What to Include in Your E-Commerce Sitemap
The goal is simple: include every URL you want indexed, and exclude everything else.
Product Pages
Every active product with a unique URL belongs in your sitemap. Out-of-stock items are a separate decision, covered under Handling Out-of-Stock Products. These are your money pages. Each entry should include:
- The canonical URL (not a variant with tracking parameters)
- An accurate lastmod date reflecting when the product was last meaningfully updated (price change, description edit, stock status change)
<url>
  <loc>https://store.example.com/products/blue-running-shoes</loc>
  <lastmod>2026-04-10T14:30:00+00:00</lastmod>
</url>
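As a rough illustration, an entry like the one above could be produced with Python's standard library. This is a minimal sketch, not a prescribed implementation: the function name build_url_entry is hypothetical, and the URL and timestamp are simply the values from the sample entry.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone


def build_url_entry(loc: str, last_modified: datetime) -> ET.Element:
    """Return a <url> element holding <loc> and <lastmod> children."""
    url = ET.Element("url")
    ET.SubElement(url, "loc").text = loc
    # isoformat() on a timezone-aware datetime yields the W3C form used above.
    ET.SubElement(url, "lastmod").text = last_modified.isoformat(timespec="seconds")
    return url


entry = build_url_entry(
    "https://store.example.com/products/blue-running-shoes",
    datetime(2026, 4, 10, 14, 30, tzinfo=timezone.utc),
)
xml_text = ET.tostring(entry, encoding="unicode")
print(xml_text)
```

ElementTree escapes characters such as & in the URL automatically, which is one reason to prefer it over string concatenation.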
Category Pages
Your top-level and second-level category pages should be in the sitemap. These are the pages that group products and often rank for broader search terms like "running shoes" or "kitchen knives."
Include the canonical version of each category page. If your category pages paginate (page 1, page 2, page 3), include page 1 only unless you have a specific reason to include deeper pages.
Brand and Collection Pages
If your store has dedicated brand pages or curated collections, include them. These often rank well for branded searches.
Informational Pages
Product guides, buying guides, sizing charts, and FAQ pages are valuable content. Include them in your sitemap alongside your product pages.
What to Exclude
This is where most e-commerce sitemaps go wrong. Including too much is worse than including too little, because it dilutes the signal you are sending to search engines.
Faceted Navigation URLs
Faceted navigation is the filtering system on category pages: color, size, price range, brand, rating. Each filter combination typically generates a unique URL like:
/shoes?color=blue&size=10&brand=nike&sort=price-asc
A category with 5 colors, 8 sizes, 10 brands, and 4 sort options can produce thousands of filter combinations. None of these belong in your sitemap. They are duplicate or near-duplicate content, and including them wastes crawl budget.
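To make the arithmetic concrete, here is a small sketch that counts the URL variants for that category. It assumes each facet is single-select and optional (either unset or set to exactly one value); multi-select filters would push the number far higher.

```python
from math import prod

# Facet sizes from the example above.
facet_sizes = {"color": 5, "size": 8, "brand": 10, "sort": 4}

# Each facet is either unset or set to one of its values, hence the +1.
url_variants = prod(n + 1 for n in facet_sizes.values())

# Subtract the clean, unfiltered category URL, which is the one you do want indexed.
filtered_variants = url_variants - 1
print(filtered_variants)  # 6 * 9 * 11 * 5 - 1 = 2969
```

That is close to 3,000 filter URLs competing with a single category page, for one category.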
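Instead, keep filter URLs away from crawlers. Either disallow the parameter patterns in your robots.txt file, or leave the URLs crawlable and make sure their canonical tags point back to the clean category URL. Pick one approach per parameter: a crawler cannot read the canonical tag on a URL that robots.txt stops it from fetching.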
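On the sitemap side, the generator itself can drop filter URLs before they are written. The sketch below assumes the facet parameter names from the example URL (color, size, brand, sort); FACET_PARAMS and is_faceted are hypothetical names, and a real store would substitute its own parameter list.

```python
from urllib.parse import parse_qs, urlsplit

# Hypothetical facet parameter names, taken from the example URL above.
FACET_PARAMS = {"color", "size", "brand", "sort"}


def is_faceted(url: str) -> bool:
    """True when the URL carries at least one facet parameter."""
    query = parse_qs(urlsplit(url).query, keep_blank_values=True)
    return any(name in FACET_PARAMS for name in query)


candidates = [
    "https://store.example.com/shoes",
    "https://store.example.com/shoes?color=blue&size=10&brand=nike&sort=price-asc",
]
sitemap_urls = [url for url in candidates if not is_faceted(url)]
print(sitemap_urls)
```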
Internal Search Result Pages
URLs generated by your site search (like /search?q=blue+shoes) should never be in your sitemap. Search engines do not want to index your internal search results.
Cart, Checkout, and Account Pages
These are functional pages, not content pages. Exclude them from your sitemap and block them in robots.txt.
Duplicate Product URLs
If the same product is accessible at multiple URLs (through different categories or tracking parameters), include only the canonical URL. Do not include both /shoes/blue-running-shoes and /sale/blue-running-shoes if they show the same product.
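One way to enforce this is to normalize every candidate URL before it reaches the sitemap. The following is a sketch under stated assumptions: the tracking parameter names (utm_*, gclid, fbclid) are common examples rather than a complete list, and the canonical_for mapping is a hypothetical lookup that your platform would have to supply.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

TRACKING_PREFIXES = ("utm_",)          # assumed tracking parameter prefix
TRACKING_PARAMS = {"gclid", "fbclid"}  # assumed tracking parameter names


def strip_tracking(url: str) -> str:
    """Drop tracking parameters and any fragment from a URL."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if not key.startswith(TRACKING_PREFIXES) and key not in TRACKING_PARAMS
    ]
    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(kept), ""))


def dedupe_canonical(urls: list[str], canonical_for: dict[str, str]) -> list[str]:
    """Map each URL to its canonical form and keep the first occurrence only."""
    seen: set[str] = set()
    result: list[str] = []
    for url in urls:
        clean = strip_tracking(url)
        canonical = canonical_for.get(clean, clean)
        if canonical not in seen:
            seen.add(canonical)
            result.append(canonical)
    return result


urls = [
    "https://store.example.com/shoes/blue-running-shoes?utm_source=newsletter",
    "https://store.example.com/sale/blue-running-shoes",
    "https://store.example.com/shoes/blue-running-shoes",
]
canonical_for = {
    "https://store.example.com/sale/blue-running-shoes":
        "https://store.example.com/shoes/blue-running-shoes",
}
unique_urls = dedupe_canonical(urls, canonical_for)
print(unique_urls)
```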
Handling Out-of-Stock Products
This is one of the trickier decisions in e-commerce sitemap management. There are two reasonable approaches.
Keep them in the sitemap if the product page still has useful content, shows related alternatives, or will be back in stock soon. Many e-commerce SEO experts recommend keeping out-of-stock product pages indexed because they may have earned backlinks and search authority. The page should clearly indicate the product is unavailable and suggest alternatives.
Remove them from the sitemap if the product is permanently discontinued and the page returns a 404 or redirects. There is no point sending crawlers to a dead end.
The worst approach is leaving out-of-stock URLs in the sitemap while returning soft 404s or empty pages. That wastes crawl budget and provides a bad user experience.
A practical solution: keep a lastmod timestamp that updates when stock status changes. When a product goes out of stock, update the lastmod so search engines recrawl the page and see the updated status. When it comes back in stock, update lastmod again.
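That rule can be sketched as a small state update: touch lastmod only when availability actually flips. The Product class and set_stock_status function are hypothetical stand-ins for whatever your platform uses to record stock changes.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass
class Product:
    url: str
    in_stock: bool
    lastmod: datetime


def set_stock_status(product: Product, in_stock: bool, now: datetime) -> Product:
    """Update lastmod only when the stock status really changes."""
    if product.in_stock != in_stock:
        product.in_stock = in_stock
        product.lastmod = now
    return product


created = datetime(2026, 4, 1, 9, 0, tzinfo=timezone.utc)
sold_out = datetime(2026, 4, 10, 14, 30, tzinfo=timezone.utc)
later = datetime(2026, 4, 11, 8, 0, tzinfo=timezone.utc)

shoes = Product("https://store.example.com/products/blue-running-shoes", True, created)
set_stock_status(shoes, False, sold_out)  # goes out of stock: lastmod moves
set_stock_status(shoes, False, later)     # still out of stock: lastmod unchanged
print(shoes.lastmod.isoformat())
```

Guarding on an actual change matters because a nightly inventory sync that rewrites the same status would otherwise bump lastmod every day.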
Sitemap Indexes for Large Catalogs
The sitemap protocol limits each sitemap file to 50,000 URLs and 50 MB uncompressed. A store with 100,000 products exceeds that limit on product URLs alone, before you add categories and informational pages.
The solution is a sitemap index -- a file that points to multiple individual sitemap files. Each child sitemap covers a segment of your catalog.
<?xml version="1.0" encoding="UTF-8"?>
<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <sitemap>
    <loc>https://store.example.com/sitemaps/products-1.xml</loc>
    <lastmod>2026-04-14T08:00:00+00:00</lastmod>
  </sitemap>
  <sitemap>
    <loc>https://store.example.com/sitemaps/products-2.xml</loc>
    <lastmod>2026-04-14T08:00:00+00:00</lastmod>
  </sitemap>
  <sitemap>
    <loc>https://store.example.com/sitemaps/categories.xml</loc>
    <lastmod>2026-04-12T12:00:00+00:00</lastmod>
  </sitemap>
  <sitemap>
    <loc>https://store.example.com/sitemaps/pages.xml</loc>
    <lastmod>2026-03-20T10:00:00+00:00</lastmod>
  </sitemap>
</sitemapindex>
A good segmentation strategy for e-commerce:
- products-1.xml, products-2.xml, etc. -- Split alphabetically, by category, or by ID range
- categories.xml -- All category and subcategory pages
- pages.xml -- Informational pages, guides, blog posts
- brands.xml -- Brand landing pages (if applicable)
This structure makes it easy to update individual sitemaps when their content changes without regenerating the entire set.
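A minimal sketch of the splitting step, using only the standard library, might look like this. It splits purely by position in the list, which is the simplest option; the item-N product URLs are invented for the demonstration, and the sketch checks the 50,000 URL limit but not the 50 MB size limit.

```python
import xml.etree.ElementTree as ET
from itertools import islice
from typing import Iterable, Iterator

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
MAX_URLS_PER_FILE = 50_000  # protocol limit on URLs per sitemap file

# Serialize the sitemap namespace as the default (unprefixed) namespace.
ET.register_namespace("", SITEMAP_NS)


def chunked(urls: Iterable[str], size: int = MAX_URLS_PER_FILE) -> Iterator[list[str]]:
    """Yield successive batches of at most `size` URLs."""
    iterator = iter(urls)
    while batch := list(islice(iterator, size)):
        yield batch


def build_index(children: list[tuple[str, str]]) -> str:
    """Build a sitemap index from (loc, lastmod) pairs."""
    root = ET.Element(f"{{{SITEMAP_NS}}}sitemapindex")
    for loc, lastmod in children:
        sitemap = ET.SubElement(root, f"{{{SITEMAP_NS}}}sitemap")
        ET.SubElement(sitemap, f"{{{SITEMAP_NS}}}loc").text = loc
        ET.SubElement(sitemap, f"{{{SITEMAP_NS}}}lastmod").text = lastmod
    return ET.tostring(root, encoding="unicode")


# Illustrative catalog of 120,000 made-up product URLs.
product_urls = [f"https://store.example.com/products/item-{i}" for i in range(120_000)]
batches = list(chunked(product_urls))
children = [
    (f"https://store.example.com/sitemaps/products-{n}.xml", "2026-04-14T08:00:00+00:00")
    for n, _ in enumerate(batches, start=1)
]
index_xml = build_index(children)
print([len(batch) for batch in batches])
```

In practice each child's lastmod would be the most recent lastmod among the URLs it contains, so an unchanged segment keeps its old date.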
Keeping lastmod Accurate
The lastmod tag is one of the few sitemap fields that Google actually uses (it ignores priority and changefreq), but only if the value is consistently accurate. If every URL in your sitemap carries today's date, Google will learn to ignore your lastmod values entirely.
For e-commerce, track these changes as meaningful updates worth reflecting in lastmod:
- Price changes
- Description or title edits
- Stock status changes (in stock to out of stock, or vice versa)
- New product images
- Review count milestones (first review, significant new reviews)
Do not update lastmod for trivial changes like a site-wide footer update or a CSS tweak. Those are not meaningful content changes.
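One way to separate meaningful changes from trivial ones is to fingerprint only the fields that matter and move lastmod when the fingerprint changes. This is a sketch, not a prescribed method: the field names in MEANINGFUL_FIELDS and the record dictionary are assumptions about how a store might model its products.

```python
import hashlib
import json
from datetime import datetime, timezone

# Hypothetical field names covering the meaningful changes listed above.
MEANINGFUL_FIELDS = ("title", "description", "price", "in_stock", "image_urls", "review_count")


def content_fingerprint(product: dict) -> str:
    """Hash only the fields whose changes should move lastmod."""
    relevant = {field: product.get(field) for field in MEANINGFUL_FIELDS}
    payload = json.dumps(relevant, sort_keys=True, default=str)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def refresh_lastmod(record: dict, product: dict, now: datetime) -> dict:
    """Set record['lastmod'] to `now` only if the fingerprint changed."""
    fingerprint = content_fingerprint(product)
    if record.get("fingerprint") != fingerprint:
        record["fingerprint"] = fingerprint
        record["lastmod"] = now
    return record


t1 = datetime(2026, 4, 10, 14, 30, tzinfo=timezone.utc)
t2 = datetime(2026, 4, 11, 9, 0, tzinfo=timezone.utc)
t3 = datetime(2026, 4, 12, 9, 0, tzinfo=timezone.utc)

product = {"title": "Blue Running Shoes", "price": "89.00", "in_stock": True, "footer_version": 1}
record = refresh_lastmod({}, product, t1)

product["footer_version"] = 2          # site-wide footer tweak: not fingerprinted
refresh_lastmod(record, product, t2)
lastmod_after_footer = record["lastmod"]

product["price"] = "79.00"             # price change: fingerprint differs
refresh_lastmod(record, product, t3)
print(lastmod_after_footer.isoformat(), record["lastmod"].isoformat())
```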
Read more about how search engines interpret these signals in our guide to sitemap priority and changefreq.
Automate lastmod with your database
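Most e-commerce platforms store an updated_at timestamp on product records. Use that directly as your lastmod value. It is the easiest way to keep your sitemap accurate without building a separate tracking system. Check first that routine writes, such as inventory syncs, do not bump the column; if they do, lastmod will drift away from real content changes.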
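Converting a database timestamp into a lastmod string is mostly a formatting step. The sketch below assumes that timestamps without timezone information are stored in UTC, which is common but worth verifying for your database; lastmod_from_updated_at is a hypothetical helper name.

```python
from datetime import datetime, timedelta, timezone


def lastmod_from_updated_at(updated_at: datetime) -> str:
    """Format an updated_at value as a W3C datetime in UTC."""
    # Assumption: naive timestamps from the database are stored in UTC.
    if updated_at.tzinfo is None:
        updated_at = updated_at.replace(tzinfo=timezone.utc)
    return updated_at.astimezone(timezone.utc).isoformat(timespec="seconds")


naive = datetime(2026, 4, 10, 14, 30)
aware = datetime(2026, 4, 10, 16, 30, tzinfo=timezone(timedelta(hours=2)))
print(lastmod_from_updated_at(naive))
print(lastmod_from_updated_at(aware))
```

Both calls describe the same instant, so they produce the same string, which keeps lastmod stable regardless of the timezone a record was written in.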
Platform-Specific Notes
Shopify
Shopify generates a sitemap automatically at /sitemap.xml. It includes products, collections, pages, and blog posts. You cannot directly edit the sitemap file, but you can control what appears in it by managing which pages are set to be indexed. For more details, see our Shopify sitemap guide.
WooCommerce
Use a plugin like Yoast SEO or Rank Math to generate your sitemap. Both handle product pages, categories, and tags automatically. Configure the plugin to exclude tag archives and filtered URLs.
Custom Platforms
If you are running a custom e-commerce platform, generate your sitemap dynamically from your product database. Our dynamic sitemaps guide covers implementation in Next.js, Django, Rails, and Laravel.
A Practical E-Commerce Sitemap Checklist
Before you submit your sitemap to search engines, run through this list:
- Every active product page is included with its canonical URL
- Category pages (top-level and key subcategories) are included
- Faceted navigation URLs are excluded
- Internal search result pages are excluded
- Cart, checkout, and account pages are excluded
- Out-of-stock products are handled intentionally (kept or removed, not ignored)
- lastmod dates reflect actual content changes, not the time the sitemap was generated
- Total URLs per sitemap file stay under 50,000
- A sitemap index is used if you have multiple sitemap files
- The sitemap is referenced in your robots.txt file
- The sitemap has been submitted through Google Search Console
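Several of these checks can be automated. The following sketch audits a single sitemap file for the URL limit, parameterized URLs, and functional pages. The path prefixes in EXCLUDED_PREFIXES are assumptions (only /search appears in this article), and audit_sitemap is a hypothetical helper, not part of any tool.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlsplit

NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}
MAX_URLS_PER_FILE = 50_000
# Assumed paths for search, cart, checkout, and account pages.
EXCLUDED_PREFIXES = ("/search", "/cart", "/checkout", "/account")


def audit_sitemap(xml_text: str) -> list[str]:
    """Return a list of human-readable problems found in one sitemap file."""
    root = ET.fromstring(xml_text)
    locs = [el.text.strip() for el in root.findall("sm:url/sm:loc", NS) if el.text]
    problems = []
    if len(locs) > MAX_URLS_PER_FILE:
        problems.append(f"too many URLs: {len(locs)}")
    for loc in locs:
        parts = urlsplit(loc)
        if parts.query:
            problems.append(f"parameterized URL: {loc}")
        if any(parts.path == p or parts.path.startswith(p + "/") for p in EXCLUDED_PREFIXES):
            problems.append(f"functional page: {loc}")
    return problems


sample = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://store.example.com/products/blue-running-shoes</loc></url>
  <url><loc>https://store.example.com/shoes?color=blue</loc></url>
  <url><loc>https://store.example.com/cart</loc></url>
</urlset>"""
problems = audit_sitemap(sample)
print(problems)
```

A check like this fits naturally into a deploy pipeline, so a misconfigured filter never reaches the published sitemap.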
For a broader checklist that covers all site types, see our sitemap best practices.