
Schema Markup for Mid-Market Teams: Governing Structured Data Across a Large Site

Structured data is easy to add and hard to keep correct across a large site. The value is in the governance, not the snippet.

John Cravey with Elevi · Founder · 13 min read

Schema markup is structured data that tells search engines and AI answer engines what a page means, beyond what the raw HTML says. Done well, it earns rich results and gets your pages cited in answers. That much is the same at every company size. What changes at mid-market scale is everything around the snippet. You are not hand-writing three blocks of JSON-LD on a homepage. You are governing structured data across thousands of templated pages, several teams, and a martech stack that already exists. The hard part is not the schema. It is keeping the schema correct, consistent, and owned as the site grows and people change. This is how a 100 to 999 person marketing team runs structured data as a program rather than a one-time task.

Why the small-site advice does not survive contact with your site

The standard guidance is sound and worth reading: three schema types cover most of the value, and you can see the full pattern in the schema markup guide for smaller sites. Organization for the brand, LocalBusiness for physical locations, FAQPage for pages with question and answer sections. At a five-page site you paste those in once and you are done. At your scale, the same three types have to render correctly across product templates, location templates, article templates, and campaign landing pages, generated from a CMS or a database that other teams also write to. A hand-pasted block does not scale to ten thousand pages. A wrong hand-pasted block, copied into a template, ships the same error ten thousand times.

So the mid-market question is not "which three types." It is "where does the truth live, who owns it, how do we render it consistently, and how do we catch it when it drifts." Answer those four and the snippet writes itself. Skip them and you get exactly what we find in most mid-market audits: schema that was correct on the day it launched, quietly wrong eighteen months later, and nobody knows who owns it.

The three types that carry the value, at scale

The core types do not change. What changes is that each one now has a source of truth behind it and a template that renders it, not a person who types it.

Organization: one record, rendered everywhere

One Organization entity identifies the company: legal name, logo, canonical URL, social profiles, contact points. It feeds the Knowledge Panel and ties every page on the domain to a verified entity. At your scale the trap is not writing it; it is duplication. Different teams add their own Organization block to their own templates, the details drift, and search engines see three slightly different companies. Define it once, store it in one place, render it from the root layout, and forbid any other team from redefining it. This is a governance rule, not a code rule.
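A minimal sketch of "define it once" in Python, assuming the record lives in one module that the root layout imports. The names `ORGANIZATION`, `organization_ref`, and `find_redefinitions` are hypothetical, and the values are the placeholders from the example graph in this article, not real data.

```python
# Hypothetical single source of truth for the company entity.
# In practice this would be loaded from one governed config or CMS record.
ORGANIZATION = {
    "@type": "Organization",
    "@id": "https://company.com/#org",
    "name": "Company, Inc.",
    "url": "https://company.com",
    "logo": "https://company.com/logo.png",
}


def organization_node() -> dict:
    """Return a copy of the one Organization record, for the root layout only."""
    return dict(ORGANIZATION)


def organization_ref() -> dict:
    """Return an @id-only reference that other templates use instead of redefining."""
    return {"@id": ORGANIZATION["@id"]}


def find_redefinitions(nodes: list[dict]) -> list[dict]:
    """Return every Organization node that differs from the governed record."""
    return [
        node
        for node in nodes
        if node.get("@type") == "Organization" and node != ORGANIZATION
    ]
```

Other templates call `organization_ref()` and never build their own block. A check like `find_redefinitions` can run over the nodes a page emits to surface a team that quietly added a second definition.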

LocalBusiness or the vertical subtype: driven off the location dataset

If you run physical locations or defined service areas, each one needs address, hours, phone, and geo coordinates. At a hundred locations you do not hand-write a hundred blocks. You render one template against your location dataset, the same dataset that already powers your store locator. Use the specific subtype where it fits, and reference the official types from the Schema.org vocabulary rather than guessing at field names. The value here is entirely in the pipe: change a store's hours in the system of record, and the schema, the locator, and the page copy all update together. No drift, because there is only one source.
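One way to picture the pipe, as a sketch rather than a finished template: a single function that turns one row of the location dataset into a LocalBusiness node. The `Location` fields are an assumed shape for your dataset, and the sample address in the usage note is made up. The property names (`address`, `geo`, `openingHours`, `parentOrganization`) are taken from the Schema.org vocabulary.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Location:
    """Assumed shape of one row in the location dataset (hypothetical fields)."""
    slug: str
    name: str
    street: str
    city: str
    region: str
    postal_code: str
    telephone: str
    latitude: float
    longitude: float
    opening_hours: tuple[str, ...]  # e.g. ("Mo-Fr 09:00-17:00",)


def local_business_node(loc: Location, base_url: str = "https://company.com") -> dict:
    """Render one LocalBusiness node from one dataset row."""
    return {
        "@type": "LocalBusiness",
        "@id": f"{base_url}/locations/{loc.slug}#loc",
        # Point at the single Organization record instead of redefining it.
        "parentOrganization": {"@id": f"{base_url}/#org"},
        "name": loc.name,
        "telephone": loc.telephone,
        "address": {
            "@type": "PostalAddress",
            "streetAddress": loc.street,
            "addressLocality": loc.city,
            "addressRegion": loc.region,
            "postalCode": loc.postal_code,
        },
        "geo": {
            "@type": "GeoCoordinates",
            "latitude": loc.latitude,
            "longitude": loc.longitude,
        },
        "openingHours": list(loc.opening_hours),
    }
```

Rendering a hundred locations is then `[local_business_node(loc) for loc in locations]`, where `locations` is whatever your store locator already reads. For a quick trial, a made-up row such as `Location("dallas", "Company, Inc. Dallas", "100 Example St", "Dallas", "TX", "75201", "+1-214-555-0100", 32.78, -96.80, ("Mo-Fr 09:00-17:00",))` is enough.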

FAQPage: templated, but only where the questions are really on the page

Service pages, product pages, and support pages usually carry question and answer sections. Wrapping them in FAQPage schema makes the questions and answers machine-readable, which keeps them eligible for rich results where a search engine still shows them (Google has limited FAQ rich results to a narrow set of authoritative sites since 2023) and, increasingly, for citation in AI answers. The mid-market discipline is a hard rule that the schema is generated from the same content the user sees, never authored separately. If a content editor removes a question from the visible page but the schema still lists it, you have shipped a mismatch that suppresses the rich result and, in severe cases, invites a manual action. Generate FAQ schema from the rendered FAQ component, so the two can never disagree.
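The "same content" rule can be sketched as one function that returns both the visible markup and the schema from a single list of questions. This is an illustration under assumptions, not a specific CMS integration: `render_faq` and the `<dl>` markup are hypothetical stand-ins for your FAQ component.

```python
from html import escape
from typing import Optional


def render_faq_html(items: list[tuple[str, str]]) -> str:
    """Render the visible question and answer list."""
    parts = [f"<dt>{escape(q)}</dt><dd>{escape(a)}</dd>" for q, a in items]
    return "<dl>" + "".join(parts) + "</dl>"


def faq_page_node(items: list[tuple[str, str]], page_url: str) -> Optional[dict]:
    """Build the FAQPage node from the same items; None when nothing is shown."""
    if not items:
        return None
    return {
        "@type": "FAQPage",
        "@id": f"{page_url}#faq",
        "mainEntity": [
            {
                "@type": "Question",
                "name": question,
                "acceptedAnswer": {"@type": "Answer", "text": answer},
            }
            for question, answer in items
        ],
    }


def render_faq(items: list[tuple[str, str]], page_url: str) -> tuple[str, Optional[dict]]:
    """Return (visible HTML, schema node) built from one list in one call."""
    return render_faq_html(items), faq_page_node(items, page_url)
```

Because the editor only ever changes `items`, removing a question removes it from the page and the schema in the same deploy. When the list is empty, no FAQPage node is emitted at all.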

{
  "@context": "https://schema.org",
  "@graph": [
    {
      "@type": "Organization",
      "@id": "https://company.com/#org",
      "name": "Company, Inc.",
      "url": "https://company.com",
      "logo": "https://company.com/logo.png"
    },
    {
      "@type": "LocalBusiness",
      "@id": "https://company.com/locations/dallas#loc",
      "parentOrganization": { "@id": "https://company.com/#org" },
      "name": "Company, Inc. Dallas",
      "telephone": "+1-214-555-0100"
    },
    {
      "@type": "FAQPage",
      "@id": "https://company.com/services/x#faq",
      "mainEntity": [
        {
          "@type": "Question",
          "name": "Do you serve enterprise accounts?",
          "acceptedAnswer": { "@type": "Answer", "text": "Yes." }
        }
      ]
    }
  ]
}
One connected graph rendered from the source of truth, not three orphan blocks pasted per team.
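A connected graph only holds together if every `@id` reference points at a node that exists. The sketch below is one assumed way to check that before a page ships. `dangling_references` is a hypothetical helper, and it treats any nested object whose only key is `@id` as a reference.

```python
def dangling_references(graph_doc: dict) -> list[str]:
    """Return @id values that are referenced but never defined in @graph."""
    nodes = graph_doc.get("@graph", [])
    defined = {node["@id"] for node in nodes if "@id" in node}
    missing: list[str] = []

    def walk(value, is_top_level: bool) -> None:
        if isinstance(value, dict):
            # A nested dict with "@id" as its only key is a pointer to another node.
            if not is_top_level and set(value) == {"@id"}:
                if value["@id"] not in defined:
                    missing.append(value["@id"])
                return
            for child in value.values():
                walk(child, False)
        elif isinstance(value, list):
            for item in value:
                walk(item, False)

    for node in nodes:
        walk(node, True)
    return missing
```

Run against the graph above, the check returns an empty list. Delete the Organization node and it reports `https://company.com/#org`, which is the orphan-block failure caught before production.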

Ownership: name it before you write a line of JSON-LD

The single most common failure at your scale is not a bad snippet. It is unclear ownership. Schema sits between marketing, who cares about the rich result, engineering, who renders it, and often a data or platform team, who owns the CMS and the location dataset. When nobody owns it end to end, it rots. Assign ownership explicitly, in writing, before the first block ships.

  • Marketing or SEO owns the intent: which types to implement, which pages, and what the schema should claim. They hold the priority list and the business case.
  • Engineering owns the rendering: the templates, the JSON-LD injection, and the build so it validates every deploy. They do not decide which types to add; they implement what SEO scopes.
  • The data or platform team owns the source of truth: the location dataset, the product catalog, the CMS fields the schema reads from. Schema is only as correct as the data behind it.
  • One named person owns the whole loop: they run the monthly Search Console check, triage errors to the right team, and keep the house standard current. Without this role, schema is everyone's job and therefore no one's.

The house standard: what "correct" means, written down

You cannot govern what you have not defined. A one-page internal standard is what lets any engineer, editor, or new hire ship structured data to the same bar without asking. It is the same discipline the agencies playbook applies across a client book, turned inward on your own large site. The standard should specify:

  1. Which schema types are approved, and which are explicitly out of scope, so teams do not add speculative types that add noise and validation burden.
  2. Where each type is rendered: root layout for Organization, location template for LocalBusiness, page component for FAQPage. One canonical place per type.
  3. The source of truth each type reads from, named by system, so nobody hand-authors data that already lives somewhere.
  4. The validation gate every schema change must pass before it ships, so a broken block cannot reach production.
  5. The review cadence and the owner of each check, so drift is caught on a schedule, not by accident.
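Parts of the standard can also live next to the code as data, so the validation gate can read them. The sketch below assumes the three core types are the approved list. The `rendered_in` and `source` labels are illustrative wording, and the check only looks at top-level nodes in `@graph`, not nested types such as Question or Answer.

```python
# Hypothetical machine-readable slice of the house standard.
HOUSE_STANDARD = {
    "Organization": {"rendered_in": "root layout", "source": "company record"},
    "LocalBusiness": {"rendered_in": "location template", "source": "location dataset"},
    "FAQPage": {"rendered_in": "FAQ page component", "source": "rendered FAQ content"},
}


def unapproved_types(graph_doc: dict) -> list[str]:
    """Return top-level @type values that the house standard does not approve."""
    found: set[str] = set()
    for node in graph_doc.get("@graph", []):
        declared = node.get("@type")
        # @type may be a single string or a list of strings.
        types = declared if isinstance(declared, list) else [declared]
        found.update(t for t in types if t is not None)
    return sorted(found - set(HOUSE_STANDARD))
```

A build step can fail when `unapproved_types` returns anything, which sends a speculative type through the approval step instead of straight into a template.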

This document is boring and it is the single most valuable artifact in the whole program. It turns schema from tribal knowledge in one person's head into a repeatable standard the organization owns. When that person leaves, the standard stays.

Integrating with the stack you already have

You are not starting from a blank site. You have a CMS, an analytics stack, probably a tag manager, and one or more Search Console properties. The mistake mid-market teams make is bolting schema on as a separate tool or, worse, injecting it through the tag manager where it is invisible to engineering and to the build. Structured data belongs in the server-rendered HTML, generated by the same templates that render the page, reading from the same data the page reads. That way it is versioned in the codebase, reviewable in a pull request, and validated in the build.

  • Render schema server-side, in the page template, not client-side through a tag manager. Injected-after-load schema is fragile and some engines do not wait for it.
  • Read from systems of record you already run: the product catalog, the location dataset, the CMS author fields. Do not create a parallel data store for schema.
  • Version it with the code. Schema changes go through the same review and deploy path as any other template change, so they are auditable.
  • Keep the tag manager for what it is good at, which is marketing tags, not for structured data that needs to be in the initial HTML.
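As a rough illustration of server-side rendering, the sketch below writes the JSON-LD into the initial HTML using only the Python standard library. `jsonld_script_tag` and `render_page` are hypothetical names, and a real site would do this inside its own template engine rather than with string concatenation.

```python
import json
from html import escape


def jsonld_script_tag(data: dict) -> str:
    """Serialize structured data into a script tag for the initial HTML."""
    payload = json.dumps(data, ensure_ascii=False, separators=(",", ":"))
    # Escape "<" so a value containing "</script>" cannot close the tag early.
    # "\u003c" is still valid JSON and parses back to "<".
    payload = payload.replace("<", "\\u003c")
    return f'<script type="application/ld+json">{payload}</script>'


def render_page(title: str, body_html: str, schema: dict) -> str:
    """Return a full page with the schema already in the head (no client-side injection)."""
    return (
        "<!doctype html><html><head>"
        f"<title>{escape(title)}</title>"
        f"{jsonld_script_tag(schema)}"
        "</head><body>"
        f"{body_html}"
        "</body></html>"
    )
```

Because the tag comes out of the same function call that renders the page, it shows up in the pull request diff and in the HTML a crawler receives on first load.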

This is also where risk and compliance quietly matter. Schema makes machine-readable claims about your company: hours, prices, credentials, service areas. In regulated verticals, a wrong claim in structured data is a wrong claim, full stop. Reading it from the governed system of record, rather than from a marketer's copy-paste, is what keeps the schema honest. That is a compliance argument, and it is one leadership understands.

Governing with Search Console at scale

Google Search Console has an Enhancements section that reports valid items, warnings, and errors for each schema type you implement. On a large site this is your control panel. The Google guidance in the structured data documentation defines what each type needs; Search Console tells you where you are failing it. The governance move is to make this a scheduled, owned review, not a thing someone glances at when traffic drops.

  1. Monthly, the named owner pulls the Enhancements report for each implemented type and records the valid, warning, and error counts.
  2. Errors triage to the owning team by class: a data error goes to the platform team, a template error goes to engineering, a content mismatch goes to the editorial owner.
  3. A rising error count on one type is an early signal that a template changed or a data feed drifted. Catch it at ten pages, not at ten thousand.
  4. Validate individual changes against the Rich Results Test before they merge, so the monthly report trends toward zero rather than being a cleanup queue.
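The monthly review can be kept as plain data. The sketch below assumes the owner records the error count per type each month by hand or from an export; it does not call any Search Console API. The triage map mirrors the routing described above, and the class labels (`data`, `template`, `content_mismatch`) are made-up keys.

```python
# Hypothetical routing table: error class -> owning team.
TRIAGE = {
    "data": "platform team",
    "template": "engineering",
    "content_mismatch": "editorial owner",
}


def route(error_class: str) -> str:
    """Return the team that owns a given class of schema error."""
    return TRIAGE[error_class]


def rising_types(history: dict[str, list[int]]) -> list[str]:
    """Return schema types whose latest monthly error count exceeds the previous one.

    history maps a schema type to its error counts, oldest month first.
    """
    rising = []
    for schema_type, counts in history.items():
        if len(counts) >= 2 and counts[-1] > counts[-2]:
            rising.append(schema_type)
    return sorted(rising)
```

With a history of `{"FAQPage": [0, 0, 10], "LocalBusiness": [5, 3, 3]}`, only FAQPage is flagged. That is the ten-page signal you want before it becomes ten thousand.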

What to skip, and why skipping is a governance decision

At small scale you skip the exotic schema types to save time. At your scale you skip them to protect the validation surface. Every type you implement is a type you now have to keep correct across every template forever. Adding Article schema to ten thousand posts, Event schema for events you host twice a year, or self-serving Review markup that Google disallows buys almost no rich results and adds real ongoing validation burden. Restraint is the mid-market discipline.

  • Article schema on every post: search engines handle this fine from meta tags, and it rarely earns a visible rich result, so it is validation cost with little return.
  • Product and Offer schema unless you genuinely sell products with prices and can keep the price data accurate. Wrong prices in schema are worse than no schema.
  • Review schema on first-party reviews: Google disallows self-serving review markup, and shipping it invites a manual action.
  • Speculative types a single team wants: route them through the house standard's approval step, so the validation surface only grows when the value is real.

Defending the program to leadership

Structured data is invisible in a boardroom until you make it legible. The way to defend the program is to tie it to outcomes leadership already tracks, not to snippet counts. Report three things and repeat them every quarter.

  • Rich result coverage: what share of eligible pages actually show a rich result in search, trended over time. This is the visible surface area the program buys.
  • Click-through lift on pages that gained rich results versus those that did not, so the schema work maps to traffic, not just to valid items.
  • Error rate: the count of schema errors in Search Console, trended toward zero. A falling error rate is proof the governance is working, and it is the number that shows the program is under control.
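The three numbers reduce to simple ratios. The sketch below shows one assumed way to compute them. The inputs are counts you would pull from your own reporting, and the definition of error share (errors divided by all reported items) is one option among several, not a standard.

```python
from typing import Optional


def share(part: int, whole: int) -> Optional[float]:
    """Return part / whole, or None when there is nothing to divide by."""
    return part / whole if whole else None


def coverage(pages_with_rich_result: int, eligible_pages: int) -> Optional[float]:
    """Share of eligible pages that actually show a rich result."""
    return share(pages_with_rich_result, eligible_pages)


def ctr(clicks: int, impressions: int) -> Optional[float]:
    """Click-through rate for a group of pages."""
    return share(clicks, impressions)


def ctr_lift(ctr_with: float, ctr_without: float) -> Optional[float]:
    """Relative CTR difference between pages with and without rich results."""
    if not ctr_without:
        return None
    return (ctr_with - ctr_without) / ctr_without


def error_share(errors: int, valid: int, warnings: int) -> Optional[float]:
    """Errors as a share of all reported items for a schema type."""
    return share(errors, errors + valid + warnings)
```

For example, 250 rich results on 1,000 eligible pages is a coverage of 0.25, and a CTR of 6% against 4% is a relative lift of about 0.5, or 50%. A lift figure like this is a comparison, not proof of cause, so report it alongside how the two page groups were chosen.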

That last number matters more than it looks. Leadership does not fund tactics; it funds control. A schema program that can show a low, falling error rate across a large site is demonstrating operational maturity, and that is an easier budget conversation than "we added FAQ markup." The same case-for-control framing applies whether you run this in-house or partner it out, and it is the through-line across every version of this playbook: the micro businesses version optimizes for the cheapest high-value snippet, the SMEs version builds a repeatable internal rollout, and yours governs it at scale.

Vendor management and the build-versus-partner call

At some point someone will pitch you a schema tool or an agency retainer. The mid-market lens on that decision is the same lens you apply to any vendor: does it own a part of the loop you genuinely cannot staff, and does it integrate with your systems of record rather than creating a parallel one. A tool that injects schema through a tag manager and stores its own copy of your product data is a liability, not a solution, because it becomes a second source of truth that will drift from the first. A partner who renders schema from your existing data, versions it in your codebase, and hands you the house standard and the governance cadence is worth paying for.

That is where a platform partner fits. Frontend Horizon builds structured data as a governed, template-driven program that reads from your systems of record and ships the ownership map and the monthly cadence alongside the code, so the program survives staff changes. If you would rather own the whole stack, everything above is the playbook. Either way, the strategic call, which pages matter and what they should claim, stays with your team, because that is the part no tool decides for you. See how we run this across professional services and where it sits in the full solution set.

A 90-day rollout for a large site

You cannot boil the ocean on a ten-thousand-page site. Sequence it so each phase produces a governed, defensible result.

  1. Days 1 to 30: govern before you build. Write the house standard, assign the RACI, and inventory what schema already exists and where it is wrong. Pick the source of truth for each type. Nothing ships yet; you are defining correct.
  2. Days 31 to 60: implement the three core types from templates that read the source of truth, add the build-time validation gate, and stand up the monthly Search Console review with a named owner.
  3. Days 61 to 90: extend to the remaining approved types through the standard, retire any orphan or duplicate blocks other teams added, and produce the first quarterly leadership report on coverage, lift, and error rate.

The order is deliberate. Governance first, then implementation, then reporting. If you implement before you govern, you ship the same errors at scale and spend the next quarter cleaning them up. Get the ownership and the standard right and the snippets are the easy part.

Schema markup at mid-market scale is a governance problem wearing a technical costume. The three types have not changed since the small-site version. What changed is that you have to keep them correct across thousands of pages, several teams, and a stack that predates you, and prove to leadership that it is under control. Define the source of truth, name the owners, write the standard, validate every deploy, and review on a cadence. Do that and structured data becomes a durable asset instead of a snippet that was right once. Run the estimator to see the governed deliverables and the ownership map we ship, or talk to us about running the program with your team.

Written by
John Cravey
Founder

Founder of Frontend Horizon. Writes most of the long-form work on the FH blog.
