Why FAQ schema behaves differently for AI than for Google
Most people test their FAQ schema using Google's Rich Results Test and stop there. They see a green tick, assume everything is working, and move on. The problem is that passing Google's validator tells you almost nothing about whether AI search engines like ChatGPT, Perplexity, or Gemini are actually reading and using your structured data.
Google uses FAQ schema to display accordions in search results. AI search engines use it differently. They are looking for clearly structured, machine-readable question-and-answer pairs that help them understand what your page is about and whether it can be cited as a reliable source. The bar is different. The signals they weight are different. And critically, the testing approach needs to be different too.
This post walks you through a practical testing process, from the basics of syntax validation all the way through to behavioural testing with live AI tools.
Start with syntax validation, but do not stop there
The first check is always syntax. Broken JSON-LD will not be read by anyone, human or machine. There are three tools worth using at this stage.
Google's Rich Results Test
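Go to Google's Rich Results Test and paste in your URL or raw code. It will confirm whether your FAQPage schema is valid, whether the questions and answers are being detected, and whether there are any parsing errors. Bear in mind that since August 2023 Google has shown FAQ rich results only for well-known, authoritative government and health websites, so a valid result here no longer means an accordion will appear in search. The test is still a useful parse check: if Google cannot parse your markup, most AI crawlers are unlikely to fare better.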
Schema.org Validator
The Schema.org Validator is less well known but more thorough for structural accuracy. It checks whether your markup conforms to the official schema.org vocabulary, flags incorrect property names, and highlights missing recommended properties. For FAQ schema, this means confirming you have a proper FAQPage type with mainEntity containing Question items, each with an acceptedAnswer that holds a clean Answer object and a well-formed text property.
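As a rough local complement to the validators, the structure described above can be checked with a few lines of Python. This is a minimal sketch, not a replacement for the Schema.org Validator: the sample markup, its question text, and the function name check_faq_structure are all illustrative assumptions, and it assumes each @type is a plain string rather than a list.

```python
import json

# Hypothetical sample markup; the question and answer text are placeholders.
SAMPLE_JSON_LD = """
{
  "@context": "https://schema.org",
  "@type": "FAQPage",
  "mainEntity": [
    {
      "@type": "Question",
      "name": "How do I test FAQ schema?",
      "acceptedAnswer": {
        "@type": "Answer",
        "text": "Start with syntax validation, then check crawler access."
      }
    }
  ]
}
"""


def check_faq_structure(raw_json_ld: str) -> list[str]:
    """Return a list of structural problems; an empty list means none were found."""
    try:
        data = json.loads(raw_json_ld)
    except json.JSONDecodeError as exc:
        return [f"invalid JSON: {exc}"]
    if not isinstance(data, dict):
        return ["top-level value is not a JSON object"]

    problems = []
    if data.get("@type") != "FAQPage":
        problems.append("top-level @type is not FAQPage")

    entities = data.get("mainEntity")
    # mainEntity may be a single object or a list; normalise to a list.
    if isinstance(entities, dict):
        entities = [entities]
    if not isinstance(entities, list) or not entities:
        problems.append("mainEntity is missing or empty")
        return problems

    for index, item in enumerate(entities):
        if not isinstance(item, dict) or item.get("@type") != "Question":
            problems.append(f"item {index}: @type is not Question")
            continue
        if not str(item.get("name", "")).strip():
            problems.append(f"item {index}: Question has no name")
        answer = item.get("acceptedAnswer")
        if not isinstance(answer, dict) or answer.get("@type") != "Answer":
            problems.append(f"item {index}: acceptedAnswer is not an Answer")
            continue
        if not str(answer.get("text", "")).strip():
            problems.append(f"item {index}: Answer has empty text")
    return problems
```

Calling check_faq_structure(SAMPLE_JSON_LD) returns an empty list for the sample. Any strings it returns for your own markup point at the property to fix before moving on.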
Manually inspect the rendered HTML
This step catches a common issue that validators miss: your schema might exist in the rendered page but not in the raw HTML your server sends. Google's Rich Results Test renders JavaScript, so it can pass markup that a non-rendering crawler never sees. Some AI crawlers execute JavaScript only to a limited degree, and some do not execute it at all. If your FAQ schema is injected dynamically via a client-side script, there is a real risk it is invisible to certain crawlers.
In Chrome, use View Page Source (Ctrl+U, or Cmd+Option+U on macOS), or open DevTools, go to the Network tab, and inspect the response body of the initial document request. Do not rely on the Elements panel, which shows the DOM after scripts have run. Look for your <script type="application/ld+json"> block. If it is missing from the raw source and only appears after JavaScript execution, you have a rendering problem worth fixing. Server-side rendering of your JSON-LD is strongly preferable.
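The same raw-source check can be scripted. The sketch below uses Python's standard html.parser module to pull JSON-LD blocks out of an HTML string without executing any JavaScript, which roughly approximates what a non-rendering crawler receives. The class and function names are illustrative assumptions.

```python
from html.parser import HTMLParser


class JsonLdExtractor(HTMLParser):
    """Collect the contents of <script type="application/ld+json"> blocks."""

    def __init__(self) -> None:
        super().__init__()
        self.blocks: list[str] = []
        self._in_json_ld = False
        self._buffer: list[str] = []

    def handle_starttag(self, tag, attrs):
        if tag == "script":
            # An attribute without a value is reported as None, hence the "or".
            script_type = (dict(attrs).get("type") or "").strip().lower()
            if script_type == "application/ld+json":
                self._in_json_ld = True
                self._buffer = []

    def handle_data(self, data):
        if self._in_json_ld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_json_ld:
            self.blocks.append("".join(self._buffer).strip())
            self._in_json_ld = False


def json_ld_blocks_in_raw_html(html: str) -> list[str]:
    """Return every JSON-LD block found in the HTML exactly as served."""
    parser = JsonLdExtractor()
    parser.feed(html)
    parser.close()
    return parser.blocks
```

Feed it the body of a plain HTTP fetch of your page, for example from urllib.request.urlopen. An empty list for a page that passes the Rich Results Test suggests the markup is being injected client-side.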
Check whether AI crawlers are actually visiting your pages
Before worrying about whether your schema is being read correctly, confirm that AI crawlers are reaching your pages at all. There is no point optimising for an audience that is being blocked at the door.
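Pull your server access logs and filter for known AI crawler user agents. The ones to watch for include GPTBot (OpenAI), ClaudeBot (Anthropic), and PerplexityBot. Google-Extended (Gemini) and Applebot-Extended work differently: they are robots.txt control tokens rather than separate crawlers, so they will not appear in your logs, and the fetching itself is done by Google's and Apple's regular crawlers. If none of the AI crawler user agents appear in your logs over a rolling 30-day window, something is likely blocking them, whether that is your robots.txt file, a firewall rule, or a Cloudflare bot management setting.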
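A log filter along these lines can be sketched in Python. It assumes each log line includes the User-Agent string, as the common combined log format does, and that the file has already been trimmed to the window you care about. It matches only tokens that show up in request headers. Google-Extended and Applebot-Extended are left out because they are robots.txt control tokens, not request user agents. The function name is an illustrative assumption.

```python
from collections import Counter

# Tokens expected to appear inside the User-Agent header of crawler requests.
AI_CRAWLER_TOKENS = ("GPTBot", "ClaudeBot", "PerplexityBot")


def count_ai_crawler_hits(log_lines) -> Counter:
    """Count log lines per AI crawler token, matching case-insensitively."""
    counts = Counter()
    for line in log_lines:
        lowered = line.lower()
        for token in AI_CRAWLER_TOKENS:
            if token.lower() in lowered:
                counts[token] += 1
                break  # attribute each line to at most one crawler
    return counts
```

Passing an open file handle works because the function only iterates over lines. A token that never appears in the result is the cue to look at robots.txt and firewall rules next.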
Your robots.txt file is the most common culprit. A wildcard Disallow: / rule under User-agent: * will block AI crawlers unless they are explicitly exempted. Many site owners discover they have been blocking AI crawlers without realising it, often because a developer added a blanket rule years ago and nobody revisited it.
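You can check how a robots.txt file treats each AI user agent with Python's standard urllib.robotparser module. The sketch below is a simplified check under a few assumptions: the agent list mirrors the names mentioned in this post, the function name blocked_agents is illustrative, and the standard-library parser may differ in edge cases from how an individual crawler interprets the file.

```python
from urllib.robotparser import RobotFileParser

# User-agent tokens to test; adjust to the crawlers you care about.
AI_USER_AGENTS = ("GPTBot", "ClaudeBot", "PerplexityBot", "Google-Extended")


def blocked_agents(robots_txt: str, url: str) -> list[str]:
    """Return the AI user agents that this robots.txt disallows for the URL."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [agent for agent in AI_USER_AGENTS if not parser.can_fetch(agent, url)]
```

For a file containing a blanket "User-agent: *" with "Disallow: /" plus a separate GPTBot group with "Allow: /", the function reports every listed agent except GPTBot as blocked. That is the explicit-exemption behaviour described above.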
If you want a structured way to check access, this guide on checking which AI crawlers are visiting your site covers the log analysis process in detail.
Test how AI search engines respond to your content directly
This is the most revealing test and the one most people skip. Once you have confirmed your schema is syntactically valid and your crawlers are not blocked, the real question is: does the AI actually use your content when answering relevant questions?
Test with Perplexity
Perplexity is the easiest tool for this because it shows its sources inline. Go to Perplexity.ai and ask a question that closely matches one of the questions in your FAQ schema. If your page is indexed and being used, it should appear as a cited source. If it does not appear, that does not necessarily mean your schema is broken, but it does mean you are not being surfaced for that query. Try three or four variants of the question before drawing conclusions.
Test with ChatGPT (Browse mode)
In ChatGPT, turn on web search (older versions labelled this Browse with Bing). Ask a question that mirrors your FAQ content and see whether your domain is cited. This tests live retrieval from a search index rather than OpenAI's training data, but it is a useful proxy for real-world citation behaviour. A page with clean FAQ schema, clear answers, and a good content structure is more likely to be selected as a source than a page with bare-bones text.
Test with Google's AI Overviews
If you are in a market where Google's AI Overviews are active, search for your FAQ questions directly in Google. If your FAQ schema is well-structured and your page has sufficient authority, Google may pull your answer directly into the AI Overview panel. Seeing your content appear here is an encouraging sign that Google can parse and trust the page, although it does not prove that the structured data itself was the deciding factor.
Look for specific schema quality issues that hurt AI readability
Passing a validator does not mean your FAQ schema is optimised for AI citation. There are several quality issues that are technically valid but functionally weak.
Answers that are too short
Google's FAQ rich results can work with very brief answers. AI systems prefer substance. An acceptedAnswer with a single sentence gives an AI model almost nothing to work with. Aim for answers of at least three to five sentences, written in plain, declarative language. Each answer should stand alone as a complete, self-contained response, because that is exactly how AI engines will use it.
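A quick way to find thin answers is to count sentences in each acceptedAnswer. The following sketch treats any run of text ending in a full stop, question mark, or exclamation mark as a sentence. That is a rough heuristic, and abbreviations such as "e.g." will inflate the count. It assumes the FAQPage markup has already been parsed into a dict with mainEntity as a list. The function names and the default threshold of three are illustrative.

```python
import re


def count_sentences(text: str) -> int:
    """Roughly count sentences by splitting after ., ! or ? followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return len([part for part in parts if part])


def thin_answers(faq_page: dict, min_sentences: int = 3) -> list[str]:
    """Return the names of questions whose answers fall below the threshold."""
    thin = []
    for item in faq_page.get("mainEntity", []):
        answer_text = item.get("acceptedAnswer", {}).get("text", "")
        if count_sentences(answer_text) < min_sentences:
            thin.append(item.get("name", "(unnamed question)"))
    return thin
```

The output is a list of question names to revisit. Treat it as a prompt for editorial review rather than a hard rule, since a short answer is sometimes the right one.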
Questions that do not match real search intent
If your FAQ questions are written for SEO keyword stuffing rather than genuine user queries, AI models will deprioritise them. Questions like "What is the best premium quality affordable solution for X?" do not reflect how people actually ask things. Write questions the way a real person would type them into a search bar or speak them to a voice assistant.
HTML inside the answer text
Some implementations include HTML tags inside the text property of the Answer object. Technically this can work, but plain text is safer and more universally readable across AI crawlers. Strip out any formatting tags from your answer text values and keep the content clean.
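Stripping tags from answer text can be done with Python's standard html.parser module. This is a minimal sketch rather than a full sanitiser: it keeps only text content, decodes entities such as &amp;, and inserts a space at a small, assumed set of block-level tags so that adjacent paragraphs do not run together. The names strip_html and BLOCK_TAGS are illustrative.

```python
from html.parser import HTMLParser

# Tags that usually imply a visual break; a space is inserted around them.
BLOCK_TAGS = {"p", "br", "li", "div", "ul", "ol"}


class _TextOnly(HTMLParser):
    """Keep text nodes and drop all markup."""

    def __init__(self) -> None:
        super().__init__(convert_charrefs=True)
        self.parts: list[str] = []

    def handle_starttag(self, tag, attrs):
        if tag in BLOCK_TAGS:
            self.parts.append(" ")

    def handle_endtag(self, tag):
        if tag in BLOCK_TAGS:
            self.parts.append(" ")

    def handle_data(self, data):
        self.parts.append(data)


def strip_html(text: str) -> str:
    """Return the text with tags removed and whitespace collapsed."""
    parser = _TextOnly()
    parser.feed(text)
    parser.close()
    return " ".join("".join(parser.parts).split())
```

Run each Answer text value through strip_html before serialising the JSON-LD. Lists or links that carried meaning in the HTML version may need rewording so the plain-text answer still reads naturally.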
Too many FAQ items on one page
Google's guidelines historically suggest a cap of around ten FAQ items per page for rich results. For AI readability, there is a different concern: pages with 30 or 40 FAQ items often lack depth on any individual question. A focused set of five to eight genuinely well-answered questions will outperform a sprawling list of shallow ones.
Use a structured monitoring approach rather than one-off checks
Schema testing should not be a one-time activity. Pages change. Plugins update. Templates get rebuilt. A schema implementation that worked perfectly in January can break silently by March if a theme update strips your JSON-LD block or a new page builder re-renders the HTML differently.
Set up a monthly review cycle. At minimum, run your key pages through the Rich Results Test and Schema.org Validator once a month. If you have access to Google Search Console, monitor the "Enhancements" section for FAQs, which will surface any new errors or warnings. And keep a record of your AI citation tests so you can spot trends over time rather than reacting to problems after they have already cost you visibility.
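One lightweight way to catch silent breakages between manual reviews is to store a fingerprint of each page's JSON-LD and compare it on every check. The sketch below hashes a canonical form of the markup, so key order and whitespace changes are ignored while real content changes are flagged. How you fetch the page and where you store the previous fingerprint are left open, and the function names are illustrative assumptions.

```python
import hashlib
import json
from typing import Optional


def schema_fingerprint(raw_json_ld: str) -> str:
    """Hash a canonical serialisation so formatting-only changes are ignored."""
    canonical = json.dumps(json.loads(raw_json_ld), sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def compare_with_previous(previous: Optional[str], raw_json_ld: Optional[str]) -> str:
    """Classify the current state as 'missing', 'new', 'changed' or 'unchanged'."""
    if raw_json_ld is None:
        return "missing"  # the JSON-LD block has disappeared from the page
    current = schema_fingerprint(raw_json_ld)
    if previous is None:
        return "new"  # nothing recorded yet for this page
    return "unchanged" if current == previous else "changed"
```

A "missing" or "changed" result is not necessarily a problem, since you may have edited the FAQ on purpose, but it tells you when to rerun the validators rather than waiting for the next scheduled review.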
At FlinnSchema, we build ongoing schema monitoring into our client work precisely because silent breakages are so common, and because catching a problem in week two is far better than discovering it six months later when you are wondering why AI referrals have dried up. If you want a second pair of eyes on your current setup, an AI visibility audit is a good starting point.
It is also worth understanding how often you should update your schema markup more broadly, since FAQ schema sits within a wider structured data strategy that benefits from regular attention.
A quick checklist for FAQ schema AI readiness
To summarise the testing process in practical terms, here is what a solid check looks like:
- JSON-LD passes Google's Rich Results Test with no errors
- Schema.org Validator confirms correct FAQPage structure with proper Question and Answer nesting
- JSON-LD is present in raw page source, not dependent on JavaScript rendering
- robots.txt is not blocking GPTBot, ClaudeBot, PerplexityBot, or Google-Extended
- Server logs confirm AI crawlers have visited within the last 30 days
- Each answer contains at least three to five sentences of substantive content
- Questions are written in natural, conversational language
- No HTML tags inside answer text values
- Perplexity or ChatGPT Browse cites your page when asked matching questions
- Monthly re-checks are scheduled to catch silent breakages
This list is not exhaustive, but running through it will catch the vast majority of issues that cause FAQ schema to underperform with AI search engines.
Frequently Asked Questions
Does passing Google's Rich Results Test mean AI search engines can read my FAQ schema?
Not necessarily. Google's Rich Results Test confirms syntactic validity and eligibility for Google's own rich result features. AI search engines like ChatGPT and Perplexity have their own crawlers and their own criteria for what makes content citation-worthy. Passing the validator is a necessary first step, not a complete answer. You still need to confirm crawlers are not blocked, that your schema renders in the raw HTML, and that your answers are substantive enough to be useful to an AI model.
How can I tell if Perplexity is reading my FAQ schema specifically?
You cannot directly observe Perplexity's schema parsing, but you can test the outcome. Ask Perplexity a question that matches one of your FAQ items and check whether your page is cited as a source. If it is, your page is being indexed and considered relevant. If it is not, check your PerplexityBot entry in your server logs and your robots.txt rules to confirm access is not being blocked.
What is the most common reason FAQ schema fails to get cited by AI?
The single most common issue is answers that are too thin. A one-sentence answer might earn a Google FAQ rich result, but AI models are looking for depth and clarity. The second most common issue is crawlers being blocked, often by an overly broad robots.txt rule. Between those two problems, most FAQ schema underperformance can be explained and fixed.
Should I use FAQPage schema on every page of my site?
No. FAQPage schema should only be used on pages where the primary content is a list of questions and answers. Adding it to product pages, landing pages, or blog posts that happen to mention a few questions is technically valid but considered bad practice and can lead to Google ignoring it. Use it selectively, on pages where it accurately describes the content structure, and invest in making those pages genuinely useful rather than spreading thin FAQ blocks across your entire site.

