
Sitemaps for Next.js Sites: The Pattern That Keeps Google Indexed

Sitemaps aren’t optional. Here’s the pattern that ships with every FH client build.

John Cravey with Elevi · Founder · 5 min read

A sitemap tells Google every URL on your site that should be indexed. Without one, Google still finds most of your pages eventually — via internal links and external backlinks. With one, Google finds them in days instead of weeks, and the indexed-page count stays close to your published-page count instead of drifting. Every FH client site ships with an auto-generated sitemap. Here’s the pattern.

What a sitemap is (and isn’t)

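It’s an XML file listing every URL on your site, with optional metadata: last modified, change frequency, priority. Crawlers (Google, Bing, anyone) read it to discover URLs they haven’t crawled yet and to re-prioritize URLs that recently changed. Google’s own documentation says it ignores the change frequency and priority values, so for Google the field that matters is last modified.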

It is not a ranking signal. URLs in a sitemap don’t rank better than URLs not in a sitemap. The sitemap only affects discovery and crawl prioritization. The ranking work is on-page SEO, content, and links.
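
To make the format concrete, here is a minimal sketch of how a list of entries turns into sitemap XML. It is not how Next.js does it internally; `SitemapEntry`, `escapeXml`, and `toSitemapXml` are names made up for this illustration.

```typescript
// Local stand-in for a sitemap entry (hypothetical type for this sketch).
type SitemapEntry = {
  url: string;
  lastModified?: Date;
};

// Entity-encode the characters that are not allowed raw inside XML text.
function escapeXml(value: string): string {
  return value
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&apos;");
}

// Serialize entries into a <urlset> document per the sitemaps.org protocol.
function toSitemapXml(entries: SitemapEntry[]): string {
  const urls = entries.map((e) => {
    const lastmod = e.lastModified
      ? `<lastmod>${e.lastModified.toISOString()}</lastmod>`
      : "";
    return `  <url><loc>${escapeXml(e.url)}</loc>${lastmod}</url>`;
  });
  return [
    '<?xml version="1.0" encoding="UTF-8"?>',
    '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">',
    ...urls,
    "</urlset>",
  ].join("\n");
}
```

Each page becomes one `<url>` element with a required `<loc>` and an optional `<lastmod>`. That is the whole format.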

Next.js App Router: app/sitemap.ts

Next.js generates a sitemap from a TypeScript file at `app/sitemap.ts`. Export a default function that returns an array of URL objects. Next handles the XML serialization, the routing (`/sitemap.xml`), and the caching.

// app/sitemap.ts
import type { MetadataRoute } from "next";
import { POSTS } from "@/lib/blog/posts";
import { LOCATIONS } from "@/lib/locations";
import { SERVICES } from "@/app/components/home/data";
import { slugify } from "@/lib/slug";

const BASE = "https://frontendhorizon.com";

export default function sitemap(): MetadataRoute.Sitemap {
  const now = new Date();
  const staticRoutes = [
    "",
    "/solutions",
    "/who-we-serve",
    "/portfolio",
    "/blog",
    "/contact",
  ].map((path) => ({
    url: `${BASE}${path}`,
    lastModified: now,
    changeFrequency: "monthly" as const,
    priority: 0.8,
  }));

  const posts = POSTS.map((p) => ({
    url: `${BASE}/blog/${p.slug}`,
    lastModified: new Date(p.updatedAt ?? p.publishedAt),
    changeFrequency: "monthly" as const,
    priority: 0.6,
  }));

  const locations = LOCATIONS.map((l) => ({
    url: `${BASE}/locations/${l.slug}`,
    lastModified: now,
    changeFrequency: "monthly" as const,
    priority: 0.7,
  }));

  const solutions = SERVICES.map((s) => ({
    url: `${BASE}/solutions/${slugify(s.name)}`,
    lastModified: now,
    changeFrequency: "monthly" as const,
    priority: 0.7,
  }));

  return [...staticRoutes, ...solutions, ...locations, ...posts];
}

Submitting to Search Console

GSC → Sitemaps → enter the sitemap URL → Submit. Google usually fetches it within an hour or so and starts crawling. Status: ‘Success’ means it parsed cleanly. ‘Couldn’t fetch’ usually means a 404 or 5xx; check that the URL is live and returns a 200.

What lastModified actually does

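Google uses lastModified to prioritize re-crawls. A URL with a recent lastModified gets crawled sooner. Don’t lie about it: Google has signals to detect when you’re claiming a recent modification on a page that hasn’t actually changed, and it stops trusting your sitemap. Stamping every URL with the build time, as the `now` value does for the static, location, and solution routes in the example above, is a mild form of this; use a real content date wherever you track one.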
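
One way to keep the date honest is to derive it from the content record and leave it out when no real date exists. This is a rough sketch; the `DatedPage` fields and the `toDatedEntry` helper are assumptions for illustration, not part of any FH codebase or of Next.js.

```typescript
// Hypothetical content record; the field names are assumptions for this sketch.
type DatedPage = {
  slug: string;
  publishedAt?: string; // ISO date string
  updatedAt?: string; // ISO date string, set only on a real content edit
};

type DatedEntry = { url: string; lastModified?: Date };

function toDatedEntry(base: string, page: DatedPage): DatedEntry {
  const stamp = page.updatedAt ?? page.publishedAt;
  // No real date on record: omit lastModified rather than claim "now".
  return stamp
    ? { url: `${base}/${page.slug}`, lastModified: new Date(stamp) }
    : { url: `${base}/${page.slug}` };
}
```

An entry with no lastModified is still valid; the crawler falls back on its own signals for when to re-crawl.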

Sitemap size limits

Each sitemap file can hold up to 50,000 URLs or 50MB uncompressed. Above that, use a sitemap index — a master sitemap that lists multiple sub-sitemaps. Most SMB sites are nowhere near these limits. We hit them once on a retail client with 80,000 product pages; the fix was a paginated sitemap.

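// app/sitemap.ts (paginated)
import type { MetadataRoute } from "next";
import { PRODUCTS } from "@/lib/products"; // assumed path to the product data

const PER_SITEMAP = 5000; // well under the 50,000-URL limit per file

export function generateSitemaps() {
  // One sitemap per chunk of 5,000 URLs, sized from the data instead of hard-coded
  const count = Math.ceil(PRODUCTS.length / PER_SITEMAP);
  return Array.from({ length: count }, (_, i) => ({ id: i }));
}

// Check the generateSitemaps docs for your Next.js version: how `id` is passed may differ
export default function sitemap({ id }: { id: number }): MetadataRoute.Sitemap {
  const start = id * PER_SITEMAP;
  const end = start + PER_SITEMAP;
  return PRODUCTS.slice(start, end).map((p) => ({
    url: `https://example.com/products/${p.slug}`,
    lastModified: new Date(p.updatedAt),
  }));
}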
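
The paginated files still need a sitemap index to tie them together. The sketch below builds one as a string, assuming the chunks are served at `/sitemap/{id}.xml` and that your Next.js version does not emit an index on its own; verify both before relying on it. `sitemapCount` and `toSitemapIndexXml` are hypothetical helper names.

```typescript
// How many sitemap files are needed for a given number of URLs.
function sitemapCount(totalUrls: number, perSitemap: number): number {
  return Math.ceil(totalUrls / perSitemap);
}

// Build a <sitemapindex> document pointing at each paginated sitemap.
// The /sitemap/{id}.xml path is an assumption; confirm what your build serves.
function toSitemapIndexXml(base: string, count: number): string {
  const items = Array.from(
    { length: count },
    (_, id) => `  <sitemap><loc>${base}/sitemap/${id}.xml</loc></sitemap>`,
  );
  return [
    '<?xml version="1.0" encoding="UTF-8"?>',
    '<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">',
    ...items,
    "</sitemapindex>",
  ].join("\n");
}
```

For 80,000 URLs at 5,000 per file, `sitemapCount(80000, 5000)` gives 16 files, and the index lists 16 `<sitemap>` entries. You could serve the string from a route handler and submit that one URL to GSC.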

Sitemap and robots.txt

Reference the sitemap from robots.txt so other crawlers find it. Next.js has `app/robots.ts` for this:

// app/robots.ts
import type { MetadataRoute } from "next";

export default function robots(): MetadataRoute.Robots {
  return {
    rules: { userAgent: "*", allow: "/", disallow: ["/admin/", "/api/"] },
    sitemap: "https://frontendhorizon.com/sitemap.xml",
  };
}

Common mistakes that hurt indexing

  • Including non-canonical URLs (e.g., URLs with query strings, paginated URLs). Only include canonical URLs in the sitemap.
  • Including pages with `noindex` directives. Google will see the contradiction and ignore the sitemap entry.
  • Including pages that return 404s. The sitemap’s job is to tell Google about pages that exist.
  • Including pages blocked in robots.txt. Pick one or the other — block in robots OR include in sitemap, not both.
  • Forgetting to update the sitemap after a site rebuild. We’ve seen this three times; the rebuild changes URL patterns and the old sitemap goes stale.
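
Most of the checks in the list above can be expressed as a filter that runs before entries reach the sitemap. This is a sketch only: the `PageRecord` shape and its field names are assumptions, and how you learn a page’s status or robots rules depends on your stack.

```typescript
// Hypothetical page record; every field name here is an assumption.
type PageRecord = {
  url: string;
  canonicalUrl: string;
  noindex: boolean;
  status: number; // HTTP status the page currently returns
  blockedByRobots: boolean;
};

// True only for pages that are canonical, indexable, live, and crawlable.
function isSitemapWorthy(p: PageRecord): boolean {
  const hasQuery = new URL(p.url).search !== "";
  return (
    p.url === p.canonicalUrl &&
    !hasQuery &&
    !p.noindex &&
    p.status === 200 &&
    !p.blockedByRobots
  );
}

function sitemapUrls(pages: PageRecord[]): string[] {
  return pages.filter(isSitemapWorthy).map((p) => p.url);
}
```

Running the filter over the same data the site renders from keeps excluded pages out of the sitemap without a manual list.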

Multiple sitemaps for different content types

Some teams split sitemaps by content type: `sitemap-blog.xml`, `sitemap-products.xml`, `sitemap-locations.xml`. This is fine; it makes the GSC Page indexing report (formerly Coverage) easier to read because you can filter it by submitted sitemap and see indexed counts per content type. We do it for sites with more than 5,000 URLs.

Dynamic generation and ISR

Next.js generates the sitemap at build time by default. If you publish blog posts via static generation, the sitemap rebuilds on every deploy. If you publish content out-of-band (a CMS, a database write), the sitemap needs ISR to pick up new entries without a redeploy.

// Force ISR on the sitemap
export const revalidate = 3600;  // re-generate hourly
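
For content that lives outside the repo, the sitemap function has to fetch it when it regenerates. A minimal sketch of that part, assuming a CMS client you supply yourself; `CmsPost`, `fetchPosts`, and `buildCmsSitemap` are hypothetical names.

```typescript
type CmsEntry = { url: string; lastModified?: Date };
type CmsPost = { slug: string; updatedAt: string };

// fetchPosts is a hypothetical CMS client passed in by the caller.
async function buildCmsSitemap(
  base: string,
  fetchPosts: () => Promise<CmsPost[]>,
): Promise<CmsEntry[]> {
  const posts = await fetchPosts();
  return posts.map((p) => ({
    url: `${base}/blog/${p.slug}`,
    lastModified: new Date(p.updatedAt),
  }));
}
```

In `app/sitemap.ts` the default export can be an async function that returns the result of a call like this, next to the `revalidate` export. New posts then show up within the revalidation window instead of waiting for a deploy.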

News sitemaps and video sitemaps

If you publish news content, Google supports a special news sitemap format with extra metadata (publication date, title). If you publish video, a video sitemap helps Google index thumbnails and durations. Most SMB sites need neither. We’ve added a news sitemap exactly once for a hyperlocal news client.

Verifying it’s working

  1. Fetch your sitemap URL in a browser — should return XML.
  2. GSC → Sitemaps → status should be ‘Success’ and ‘Discovered pages’ should match your expected page count.
  3. GSC → Pages → Indexed should grow toward your sitemap URL count over the next week or two.
  4. If the indexed count stays well below the sitemap count, look at the ‘Why pages aren’t indexed’ reasons in the same Pages report.
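
Step 1 can be automated as a smoke test that compares the sitemap against the URLs you expect to be published. A rough sketch, assuming you already have the sitemap XML as a string (for example from `fetch`); `extractLocs` and `diffSitemap` are made-up helper names.

```typescript
// Pull every <loc> value out of a sitemap document.
// A regex is enough for a smoke test; it is not a general XML parser
// and leaves entities such as &amp; encoded.
function extractLocs(xml: string): string[] {
  const pattern = /<loc>\s*([^<]+?)\s*<\/loc>/g;
  const locs: string[] = [];
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(xml)) !== null) {
    locs.push(match[1]);
  }
  return locs;
}

// Report URLs missing from the sitemap and URLs that should not be there.
function diffSitemap(xml: string, expected: string[]) {
  const found = new Set(extractLocs(xml));
  const want = new Set(expected);
  return {
    missing: expected.filter((u) => !found.has(u)),
    unexpected: Array.from(found).filter((u) => !want.has(u)),
  };
}
```

An empty `missing` list and an empty `unexpected` list mean the sitemap matches the published set, which is the same condition the ‘Discovered’ count in GSC checks after the fact.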

How this lands across FH client work

Every FH client site ships with an auto-generated `app/sitemap.ts` that pulls from the same typed data sources the site renders from. No drift between published pages and sitemap. No manual maintenance. Submitted to GSC on day one of every launch. If your site’s sitemap is hand-maintained or missing entirely, book a consultation — we’ll add the auto-generation pattern in a half-day engagement.

Written by
John Cravey
Founder

Founder of Frontend Horizon. Writes most of the long-form work on the FH blog.
