
Common Schema Markup Mistakes That Hurt AI Visibility


Schema markup errors are surprisingly easy to make and surprisingly costly to ignore. Most site owners who implement structured data do so once, pat themselves on the back, and never revisit it. Meanwhile, AI search engines like ChatGPT, Perplexity, and Gemini are crawling their pages, finding incomplete or contradictory markup, and quietly choosing someone else's content to cite instead.

This guide runs through the most common schema markup mistakes that genuinely hurt AI visibility: not abstract technical sins, but the specific issues that show up again and again when auditing real sites.

Implementing Schema Once and Never Updating It

This is probably the single most widespread problem. A developer adds JSON-LD to a site during a build, it passes a validation check, and nobody touches it again for two years. By then, the business has changed its prices, added new services, rebranded, or shifted its target audience. The schema still describes the old version of the business.

AI models train on crawled data, and they also pull live information when generating answers. If your structured data says your business is a "freelance graphic design studio" but your site now clearly positions you as a "brand identity agency for SaaS companies," there is a mismatch. AI systems pick up on that inconsistency and tend to deprioritise ambiguous sources.

The fix is simple: treat schema like any other content. When you update your services page, update the schema. When you change your pricing model, update the schema. Set a calendar reminder to audit your structured data at least quarterly. It takes less time than you think.

Using the Wrong Schema Type for the Page

Applying a generic Organization schema to every page on your site is a very common shortcut. It is better than nothing, but it leaves a lot of signal on the table. AI search engines are trying to understand what a specific page is about. A generic organisation-level schema does not tell them whether they are looking at a product listing, a how-to guide, a service page, or a FAQ.

Here are some mismatches that come up repeatedly:

  • Using Article on a product page instead of Product
  • Using WebPage on a service page instead of Service or ProfessionalService
  • Using BlogPosting on a resource that is more accurately a HowTo or FAQPage
  • Using LocalBusiness on a page that primarily serves a national or international audience

Getting this right matters more for AI visibility than it ever did for traditional SEO. Google's featured snippets had some tolerance for ambiguity. AI models generating conversational answers are far more reliant on the semantic signals in your structured data to categorise and retrieve your content correctly.

If you run a service business, using ProfessionalService schema correctly can meaningfully improve how often AI engines surface you in relevant queries.
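As a rough illustration, page-specific markup for a service business might be generated along these lines. This is a minimal sketch: the business name, URL, description, and service area are hypothetical placeholders, and the properties shown are a small subset of what ProfessionalService supports.

```python
import json


def professional_service_jsonld(name, url, description, area_served):
    """Build a ProfessionalService JSON-LD object as a plain dict."""
    return {
        "@context": "https://schema.org",
        "@type": "ProfessionalService",
        "name": name,
        "url": url,
        "description": description,
        "areaServed": area_served,
    }


# Hypothetical example values, not taken from a real site.
markup = professional_service_jsonld(
    name="Example Brand Studio",
    url="https://example.com/services/brand-identity/",
    description="Brand identity agency for SaaS companies.",
    area_served="GB",
)

# Emit the block as it would appear in the page source.
print('<script type="application/ld+json">')
print(json.dumps(markup, indent=2))
print("</script>")
```

Generating the block from the same fields that feed the visible page, rather than hardcoding it, makes it harder for the two to drift apart.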

Missing Required and Recommended Properties

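Schema.org itself does not mark any property as required. The distinction between required and recommended properties comes from the search engines that consume the markup, most visibly Google's structured data documentation, which lists both for each rich result type. A lot of implementations include the bare minimum required fields and stop there. That means the markup validates, but it is thin.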

For Product schema, required fields include name, but recommended fields like offers, aggregateRating, brand, and sku are what AI shopping engines actually use to populate answers. If those fields are missing, your product is far less likely to appear in AI-generated shopping recommendations.

The same logic applies across schema types. FAQPage without well-written acceptedAnswer text. HowTo without clear step descriptions. Event without startDate and location. These are not just missed opportunities for rich results in Google. They are missed opportunities to give AI systems the structured context they need to cite you confidently.

A practical approach: for each schema type you implement, look at the full property list on schema.org alongside the required and recommended properties in Google's structured data documentation, and fill in everything that is genuinely applicable to your content. Do not pad it with invented or inaccurate data, but do not leave obviously relevant fields blank either.
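A check for thin markup can be sketched in a few lines. The checklist below is an assumption built only from the fields named in this post, not an official list, and the sample product is hypothetical.

```python
# Checklist assembled from the properties mentioned in this post (an assumption,
# not an official specification). Extend it for the types you actually use.
RECOMMENDED = {
    "Product": ["offers", "aggregateRating", "brand", "sku"],
    "Event": ["startDate", "location"],
}


def missing_properties(item, recommended=RECOMMENDED):
    """Return the checklist properties that are absent or empty on a JSON-LD item."""
    expected = recommended.get(item.get("@type"), [])
    return [prop for prop in expected if item.get(prop) in (None, "", [], {})]


# Hypothetical product with a name and brand but little else.
product = {
    "@type": "Product",
    "name": "Example Mug",
    "brand": {"@type": "Brand", "name": "Example Co"},
    "sku": "",
}

print(missing_properties(product))  # ['offers', 'aggregateRating', 'sku']
```

An empty string counts as missing here, since a blank field gives a crawler nothing to work with.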

Contradictions Between Schema and On-Page Content

This one can genuinely tank your AI visibility. It happens when the structured data says one thing and the visible page content says something different. Common examples include:

  • Schema lists a price of £49, but the page shows £59 after a recent update
  • Schema lists an author called "Admin" but the page byline shows a real person's name
  • Schema describes the business as being in London, but the contact page gives a Manchester address
  • Schema claims 4.8-star reviews, but the on-page review widget shows 3.9

AI systems cross-reference structured data against visible content. When they find inconsistencies, they flag the source as less reliable. This is essentially the same logic that underpins E-E-A-T, the idea that trustworthy sources are internally consistent and accurate. Contradictory schema is a clear signal of unreliability, even if the error was purely accidental.

The solution is to audit schema and on-page content together, not separately. If you use a plugin or automation to generate schema dynamically from your CMS fields, make sure those fields are actually being kept up to date. Static JSON-LD blocks that were copied and hardcoded years ago are the biggest culprits here.
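Auditing the two together can be as simple as comparing values side by side. The sketch below assumes you have already extracted the relevant values from the JSON-LD and from the visible page by some other means; the field names and sample values are hypothetical.

```python
def find_contradictions(schema_values, page_values):
    """Return (field, schema value, page value) for every field that disagrees."""
    issues = []
    for field, schema_value in schema_values.items():
        if field not in page_values:
            continue  # nothing visible to compare against
        page_value = page_values[field]
        if str(schema_value).strip() != str(page_value).strip():
            issues.append((field, schema_value, page_value))
    return issues


# Hypothetical values: what the JSON-LD claims versus what a visitor sees.
from_schema = {"price": "49.00", "author": "Admin", "addressLocality": "London"}
from_page = {"price": "59.00", "author": "Jane Doe", "addressLocality": "London"}

for field, schema_value, page_value in find_contradictions(from_schema, from_page):
    print(f"{field}: schema says {schema_value!r}, page shows {page_value!r}")
```

The comparison is deliberately strict string equality after trimming whitespace. A real audit would probably normalise currency and number formats first.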

Blocking AI Crawlers in robots.txt

You cannot benefit from AI visibility if AI crawlers cannot read your pages in the first place. A surprisingly high number of sites have robots.txt rules, often inherited from old configurations or added by developers trying to block scrapers, that inadvertently block GPTBot, ClaudeBot, or other AI crawlers.

This makes your schema irrelevant because the crawlers never see it. It is worth checking your robots.txt file explicitly for any rules that might be blocking AI bots, whether through wildcard disallows or specific user agent entries. Many sites block AI crawlers without realising it, and the owners only find out when they notice they are never cited despite having solid content.

If you are not sure which AI crawlers are actually visiting your site, checking your server logs is the most reliable method. You can identify bots by their user agent strings and then verify whether your robots.txt is allowing or denying them.
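One way to check a robots.txt file is Python's standard-library parser. This is a minimal sketch: the robots.txt content and URL are made up for illustration, and individual crawlers may interpret edge cases differently from this parser.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt that blocks GPTBot outright but only hides /admin/
# from everyone else.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Disallow: /admin/
"""


def crawler_access(robots_txt, user_agents, url):
    """Map each user agent to whether the rules allow it to fetch the URL."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {agent: parser.can_fetch(agent, url) for agent in user_agents}


access = crawler_access(
    ROBOTS_TXT, ["GPTBot", "ClaudeBot"], "https://example.com/services/"
)
print(access)  # {'GPTBot': False, 'ClaudeBot': True}
```

To test a live site instead of a pasted file, the same class offers set_url() and read() to fetch the file over the network.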

Duplicate Schema Blocks on the Same Page

This happens most often on WordPress sites where a theme adds its own schema, a plugin adds another layer, and then someone manually adds a third block in the page editor. The result is multiple conflicting JSON-LD blocks on the same page, sometimes describing the same entity with different data.

Duplicate schema does not just confuse Google. It confuses AI crawlers too. When a crawler finds two Product blocks on the same page with different prices or two Organization blocks with different names, it cannot determine which one is authoritative. The safest response is to treat the page as an unreliable source.

To fix this, audit each page's source code for multiple <script type="application/ld+json"> blocks. Consolidate where possible, and make sure your CMS, theme, and any plugins are not generating overlapping schema. On Shopify, this often means disabling the theme's built-in schema output if you are managing structured data through a separate app or custom Liquid.
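That source-code audit can be partly automated. The following sketch uses only the standard library to pull every JSON-LD block out of a page and count how often each top-level type appears. The sample HTML is hypothetical, and the sketch assumes each block is valid JSON and does not look inside @graph containers.

```python
import json
from collections import Counter
from html.parser import HTMLParser


class JsonLdExtractor(HTMLParser):
    """Collect the raw text of every application/ld+json script block."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._buffer = []
        self.blocks = []

    def handle_starttag(self, tag, attrs):
        script_type = (dict(attrs).get("type") or "").lower()
        if tag == "script" and script_type == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self.blocks.append("".join(self._buffer))
            self._in_jsonld = False


def duplicate_types(html):
    """Return {type: count} for top-level types declared more than once."""
    extractor = JsonLdExtractor()
    extractor.feed(html)
    counts = Counter()
    for raw in extractor.blocks:
        data = json.loads(raw)
        for item in data if isinstance(data, list) else [data]:
            if not isinstance(item, dict):
                continue
            types = item.get("@type", [])
            counts.update([types] if isinstance(types, str) else types)
    return {name: n for name, n in counts.items() if n > 1}


# Hypothetical page where a theme and a plugin both emit Product markup.
PAGE = """
<script type="application/ld+json">{"@type": "Product", "name": "Mug", "offers": {"price": "49.00"}}</script>
<script type="application/ld+json">{"@type": "Product", "name": "Mug", "offers": {"price": "59.00"}}</script>
<script type="application/ld+json">{"@type": "Organization", "name": "Example Co"}</script>
"""

print(duplicate_types(PAGE))  # {'Product': 2}
```

A repeated type is not always an error (a category page may legitimately list several products), so treat the output as a list of pages to inspect rather than a verdict.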

Ignoring Schema for Reviews and Ratings

Customer reviews are one of the strongest trust signals available to AI systems. If you have genuine reviews and you are not surfacing them through aggregateRating or Review schema, you are hiding one of your best assets from the systems that would use it to recommend you.

AI search engines increasingly use social proof signals to decide which businesses to recommend. A query like "best accountant for freelancers in Bristol" is not just answered by who has the best on-page content. It is influenced by structured review data, citation patterns, and trust signals across the web. Getting your reviews into schema is a relatively quick win. Reviews influence AI recommendations more than most site owners realise.
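As a sketch of what getting reviews into schema can look like, the function below derives an AggregateRating object from a list of individual star ratings. The ratings are hypothetical, and the numbers in your markup should always come from the same genuine review data your page displays.

```python
import json


def aggregate_rating_jsonld(ratings, best_rating=5):
    """Summarise individual star ratings as a schema.org AggregateRating dict."""
    if not ratings:
        raise ValueError("cannot build an aggregate rating from zero reviews")
    return {
        "@type": "AggregateRating",
        "ratingValue": round(sum(ratings) / len(ratings), 1),
        "reviewCount": len(ratings),
        "bestRating": best_rating,
    }


# Hypothetical star ratings pulled from a review platform.
rating = aggregate_rating_jsonld([5, 4, 5, 4])
print(json.dumps(rating, indent=2))
```

The resulting object is meant to be nested under the aggregateRating property of the business, product, or service it describes, not published on its own.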

Skipping FAQ Schema on Content That Answers Questions

AI search engines love FAQ-style content. It maps almost perfectly to the question-and-answer format that conversational AI uses to generate responses. Yet a huge proportion of pages that clearly answer specific questions do not use FAQPage schema to signal this to crawlers.

If a page on your site answers a question, even as a section within a longer article, adding FAQPage schema to that content (or QAPage, where the page is a single question with user-submitted answers) gives AI systems a structured signal that this is a reliable, ready-to-cite answer. Without the schema, the crawler has to infer this from the prose alone, which is less reliable.

This is especially worth doing on service pages, pricing pages, and any blog posts that address specific user questions. The FAQ section of this very post, for example, should have FAQPage schema applied to it.
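For illustration, FAQPage markup can be built directly from question-and-answer pairs. This is a minimal sketch using one question from the FAQ further down this post; the function name is hypothetical.

```python
import json


def faq_page_jsonld(pairs):
    """Build FAQPage JSON-LD from an iterable of (question, answer) pairs."""
    return {
        "@context": "https://schema.org",
        "@type": "FAQPage",
        "mainEntity": [
            {
                "@type": "Question",
                "name": question,
                "acceptedAnswer": {"@type": "Answer", "text": answer},
            }
            for question, answer in pairs
        ],
    }


faq = faq_page_jsonld(
    [
        (
            "How often should I review my schema markup?",
            "A quarterly review is a sensible minimum for most businesses.",
        ),
    ]
)
print(json.dumps(faq, indent=2))
```

The answer text in the markup should match what is visible on the page, for the same reason that prices and addresses should.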

Not Validating After Making Changes

Schema that was valid when first published can break after a CMS update, a theme change, or a developer editing a template. Broken JSON-LD, whether a missing closing bracket, an invalid property name, or an incorrect URL format, renders the schema useless. Worse, if the script tag itself is left unclosed, the damage can spill over into how the rest of the page renders.

Validation should be part of your publishing workflow, not a one-off task. Google's Rich Results Test and Schema.org's validator are both free and take about thirty seconds to run. If you are managing schema at scale, consider setting up automated testing as part of your deployment pipeline. At FlinnSchema, structured data validation is built into the implementation process specifically because silent breakages are so common and so easy to miss.
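A pipeline check does not need to be elaborate to catch silent breakages. The sketch below only confirms that a block parses as JSON and declares @context and @type; it is not a substitute for the validators named above, and it does not handle @graph containers.

```python
import json


def check_jsonld(raw):
    """Return a list of problems found in one raw JSON-LD block (empty if none)."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        return [f"invalid JSON at line {exc.lineno}, column {exc.colno}: {exc.msg}"]

    problems = []
    for index, item in enumerate(data if isinstance(data, list) else [data]):
        if not isinstance(item, dict):
            problems.append(f"item {index}: expected an object")
            continue
        for key in ("@context", "@type"):
            if key not in item:
                problems.append(f"item {index}: missing {key}")
    return problems


# Hypothetical block with the closing bracket lost during a template edit.
broken = '{"@context": "https://schema.org", "@type": "Product", "name": "Mug"'
for problem in check_jsonld(broken):
    print(problem)
```

In a deployment pipeline, a non-empty result would typically fail the build so the broken block never reaches production.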

Frequently Asked Questions

Does bad schema markup actively hurt my rankings, or does it just fail to help?

It can do both. Schema that contradicts your on-page content is an active trust signal problem. Schema that is simply missing or thin is more of a missed opportunity. For AI visibility specifically, contradictions and inaccuracies are penalising rather than neutral, because AI systems are actively assessing source reliability when deciding what to cite.

How do I find out if my schema has errors right now?

Start with Google's Rich Results Test at search.google.com/test/rich-results. Paste in any URL and it will show you which schema types were found, which passed, and which have errors or warnings. For a broader view across your whole site, Google Search Console has an "Enhancements" section that flags schema issues at scale. You can also request a free AI visibility audit to get a full picture of how your structured data is performing.

Is it worth adding schema to every page, or just the important ones?

Prioritise pages that answer specific questions, describe products or services, or carry review data. A homepage, a key service page, and your top-performing blog posts are where schema has the most impact. Generic pages like a privacy policy or a terms page are lower priority. Quality and accuracy matter more than coverage, so it is better to have precise schema on twenty pages than vague schema on two hundred.

How often should I review my schema markup?

A quarterly review is a sensible minimum for most businesses. Any time you update pricing, services, personnel, or business details, you should update the relevant schema at the same time. If you run a product catalogue, you should ideally have dynamic schema that updates automatically when product data changes, rather than relying on static blocks. There is more detail on review frequency in this guide to schema update schedules.
