Why AI Search Engines Struggle to Recognise News Content Without Schema
AI-powered search engines like ChatGPT, Perplexity, and Gemini don't read your articles the way a human journalist would. They parse signals. They look for structured, machine-readable evidence that a piece of content is a legitimate news article, published by a real organisation, with a known author and a clear publication date. Without that evidence, your article looks the same as a random blog post, a product page, or a forum comment.
That's the core problem NewsArticle schema solves. It gives AI systems the metadata they need to confidently attribute, cite, and surface your journalism in news-related answer boxes. If you're publishing timely, factual content and not using NewsArticle schema, you're leaving citations on the table.
The good news is that implementing it correctly is not especially difficult. But doing it well, in a way that actually signals authority to AI systems, takes a bit more thought than simply pasting in a template.
NewsArticle Schema vs Article Schema: Understanding the Difference
Before writing a single line of JSON-LD, it's worth being clear on when to use NewsArticle versus the more general Article type.
Article is the broad parent type. It covers blog posts, long-form editorial, opinion pieces, and evergreen how-to content. If you're writing a guide that will remain relevant for years, Article is typically the right choice.
NewsArticle is a specific subtype of Article designed for content that reports on recent events, breaking news, or time-sensitive topics. The key signals here are recency and factual reporting. Schema.org defines it as "an Article whose content reports news, or provides background context and supporting materials for understanding the news."
There's also ReportageNewsArticle, a more granular subtype for first-hand journalistic reporting. For most publishers, NewsArticle covers the ground they need.
The distinction matters for AI systems because they use the schema type to inform how they categorise your content. A NewsArticle is more likely to be pulled into time-sensitive answer generation. An Article is more likely to appear as general reference. Getting this wrong means your breaking news story competes with evergreen blog content in AI retrieval, rather than being flagged as fresh, citable journalism.
For more on how the general Article type works in AI citation contexts, see our post on how to use Article schema to get your blog posts cited by AI.
The Anatomy of a Well-Formed NewsArticle JSON-LD Block
Here's a complete, production-ready NewsArticle schema block with every property that meaningfully influences AI visibility:
<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "NewsArticle",
  "headline": "UK Inflation Falls to 2.3% as Energy Bills Drop",
  "description": "The UK's annual inflation rate dropped to 2.3% in April 2025, driven by falling household energy costs, according to the Office for National Statistics.",
  "datePublished": "2025-05-15T09:00:00+01:00",
  "dateModified": "2025-05-15T11:30:00+01:00",
  "author": {
    "@type": "Person",
    "name": "Sarah Malone",
    "url": "https://yoursite.com/authors/sarah-malone",
    "sameAs": [
      "https://twitter.com/sarahmalone",
      "https://www.linkedin.com/in/sarahmalone"
    ]
  },
  "publisher": {
    "@type": "Organization",
    "name": "The Finance Observer",
    "url": "https://yoursite.com",
    "logo": {
      "@type": "ImageObject",
      "url": "https://yoursite.com/logo.png",
      "width": 600,
      "height": 60
    }
  },
  "image": {
    "@type": "ImageObject",
    "url": "https://yoursite.com/images/inflation-april-2025.jpg",
    "width": 1200,
    "height": 630
  },
  "articleSection": "Economics",
  "keywords": "inflation, UK economy, ONS, energy bills",
  "isAccessibleForFree": true,
  "mainEntityOfPage": {
    "@type": "WebPage",
    "@id": "https://yoursite.com/news/uk-inflation-april-2025"
  }
}
</script>
Let's break down the properties that carry the most weight for AI citation.
headline
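This should match your visible H1, ideally word for word. A schema headline that differs from the on-page heading gives parsers two conflicting titles for the same article, which weakens trust in the markup. Keep it concise: Google used to enforce a 110-character limit, and although that hard limit is gone, long headlines may still be truncated.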
datePublished and dateModified
Always use ISO 8601 format with a timezone offset, for example 2025-05-15T09:00:00+01:00. This is non-negotiable for news. AI search engines prioritise recency in news answers, and they read the datePublished timestamp directly. A vague or missing date means your article may be deprioritised in favour of one that clearly signals "this was published three hours ago." Always include the time component, not just the date.
author with sameAs
This is where many publishers cut corners, and it costs them. An author name alone is weak. When you include sameAs links to the author's LinkedIn, Twitter/X profile, or a Wikipedia page, you're giving AI systems a way to verify that a real, identifiable person wrote the article. This directly influences whether your content is treated as authoritative. For deeper guidance on this, our post on using Person schema to build authority in AI search covers the full approach.
publisher with logo
The publisher entity ties your article to a known organisation. Its type must be written as Organization, because schema.org uses the American spelling and "Organisation" is not a recognised type. The logo ImageObject isn't just decorative metadata; search features can display it, and it helps AI systems cross-reference your organisation against their entity graphs. Make sure the logo URL resolves to an actual image.
isAccessibleForFree
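Set this to true for open-access articles. If your content is paywalled, set isAccessibleForFree to false on the article itself and add a hasPart property containing a WebPageElement that also has isAccessibleForFree set to false, along with a cssSelector that identifies the restricted section of the page. AI systems are less likely to cite content they can't access, so transparency here helps.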
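For a paywalled story, the markup might look like the sketch below. It follows the pattern in Google's paywalled-content guidance. The headline and URL are reused from the earlier example, and the .paywall class name is a placeholder: the selector has to match whatever class wraps the gated text in your own template.

```json
{
  "@context": "https://schema.org",
  "@type": "NewsArticle",
  "headline": "UK Inflation Falls to 2.3% as Energy Bills Drop",
  "datePublished": "2025-05-15T09:00:00+01:00",
  "isAccessibleForFree": false,
  "hasPart": {
    "@type": "WebPageElement",
    "isAccessibleForFree": false,
    "cssSelector": ".paywall"
  },
  "mainEntityOfPage": {
    "@type": "WebPage",
    "@id": "https://yoursite.com/news/uk-inflation-april-2025"
  }
}
```

If several parts of the page are gated, hasPart can hold an array of WebPageElement objects, one per selector.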
Where to Place the Schema on Your Page
Place the JSON-LD block inside the <head> of your HTML document where you can. It keeps the markup easy to find and independent of how the body renders. Some CMS platforms append it to the end of the <body> instead, which is also valid.
Prefer JSON-LD over Microdata or RDFa for NewsArticle. Google supports all three formats but recommends JSON-LD, and it's also what AI crawlers handle most reliably. Inline Microdata is harder for AI systems to extract cleanly from noisy HTML.
If you're on WordPress with a plugin like Yoast or Rank Math, be aware that these tools output a generic Article type by default rather than NewsArticle. You may need to change the article type in the plugin's schema settings, modify the output, or inject a custom block. Our post on how to add JSON-LD to WordPress without a plugin walks through exactly how to do this without touching your theme's core files.
Common Mistakes That Stop AI Systems From Citing Your News Content
Getting the schema in place is step one. But there are several mistakes that quietly undermine your AI citation potential even when the markup is technically valid.
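Leaving dateModified unchanged when you update articles
Keep datePublished fixed at the original publication time; moving it forward to make an old story look new misrepresents the article. If you update articles without changing dateModified, however, AI systems have no signal that the content is current and may treat it as stale. Always update dateModified when you make meaningful edits, and consider whether a major update warrants a new article entirely.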
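As an illustration, an article first published at 09:00 and revised that afternoon could carry the timestamps below. The times are invented for the example. The point is that datePublished stays at the original publication time while dateModified moves forward with each meaningful edit.

```json
{
  "@context": "https://schema.org",
  "@type": "NewsArticle",
  "headline": "UK Inflation Falls to 2.3% as Energy Bills Drop",
  "datePublished": "2025-05-15T09:00:00+01:00",
  "dateModified": "2025-05-15T16:45:00+01:00",
  "mainEntityOfPage": {
    "@type": "WebPage",
    "@id": "https://yoursite.com/news/uk-inflation-april-2025"
  }
}
```

Both values use the same ISO 8601 form with an explicit offset, so a parser can compare them without guessing the timezone.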
Missing or low-resolution images
The image property should point to a high-quality image of at least 1200x630 pixels. AI-driven news features and rich results both prefer larger images. A missing image property is a red flag that the article lacks full publication metadata.
Generic or keyword-stuffed descriptions
The description field is not a meta description duplicate. Write a clear, factual summary of what the article reports. AI systems use this to understand article scope before deciding whether to cite it in response to a query. Treat it like the first sentence of a news wire summary.
No Organization schema on the publisher's homepage
Your NewsArticle schema is stronger when it connects to a publisher entity that also has structured data on the homepage. If your site's homepage has no Organization or WebSite schema, the publisher entity in your articles becomes harder for AI systems to verify. This is something the team at FlinnSchema regularly flags in AI visibility audits because it's a silent but significant gap.
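A homepage block along these lines gives the publisher entity somewhere to resolve. This is a minimal sketch that reuses the example publisher from the article markup earlier; the @id value and the sameAs profile URLs are assumptions made for illustration, so substitute your own. Note that the type name uses schema.org's American spelling.

```json
{
  "@context": "https://schema.org",
  "@type": "Organization",
  "@id": "https://yoursite.com/#organization",
  "name": "The Finance Observer",
  "url": "https://yoursite.com",
  "logo": {
    "@type": "ImageObject",
    "url": "https://yoursite.com/logo.png",
    "width": 600,
    "height": 60
  },
  "sameAs": [
    "https://www.linkedin.com/company/the-finance-observer",
    "https://twitter.com/financeobserver"
  ]
}
```

Keeping name, url, and logo identical between this block and the publisher object in each article makes the two easier to match.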
Inconsistent author URLs
If your author pages change URL structure, or if you use different spellings of an author's name across articles, the entity graph breaks. AI systems build knowledge graphs partly by cross-referencing author entities across multiple pages. Consistency matters.
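One way to keep the author entity stable is to give it a fixed @id and reuse exactly the same object in every article by that writer. A sketch, using the example author from earlier; the #person fragment is an assumed naming convention, not a requirement:

```json
{
  "@type": "Person",
  "@id": "https://yoursite.com/authors/sarah-malone#person",
  "name": "Sarah Malone",
  "url": "https://yoursite.com/authors/sarah-malone",
  "sameAs": [
    "https://twitter.com/sarahmalone",
    "https://www.linkedin.com/in/sarahmalone"
  ]
}
```

This object goes in the author property of each NewsArticle block. If the author page ever has to move, redirect the old URL and update the markup on every article at the same time.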
Testing and Validating Your NewsArticle Schema
Once you've implemented the markup, validate it with Google's Rich Results Test at search.google.com/test/rich-results. Enter your article URL and check that the tool recognises the NewsArticle type and shows no errors or warnings.
Also run it through Schema.org's validator at validator.schema.org to check for any property-level issues that Google's tool might not surface.
For AI-specific behaviour, Perplexity is actually a useful manual test. Search for the specific topic your article covers and see whether your site appears in citations. If it doesn't, and your content is clearly relevant and recent, schema gaps are usually one of the first places to investigate.
One thing worth noting: validation tools confirm technical correctness, not AI citation potential. A schema that passes validation can still fail to earn citations if the underlying content lacks authority signals. Schema markup is the packaging; the content itself still needs to be worth citing.
NewsArticle Schema for Different Publishing Setups
WordPress
By default, Yoast and Rank Math output a generic Article type for posts rather than NewsArticle. Recent versions of both plugins let you change the article type in their schema settings, so check there first. If that doesn't fit your setup, you can filter their output via PHP, use a dedicated news schema plugin, or inject a custom JSON-LD block using a functions.php snippet or a lightweight plugin like "Insert Headers and Footers." The approach we recommend depends on whether you need this at scale across thousands of articles or just for a handful of key pieces.
Shopify
Shopify's blog posts don't support NewsArticle schema out of the box. If you publish news-style content in your Shopify blog, you'll need to edit the theme's article template directly (article.liquid in older themes, or the main article section in Online Store 2.0 themes) to inject a JSON-LD block, or use a third-party app. This is worth doing if your Shopify blog publishes timely content like product launches, industry news, or press coverage.
Headless and custom-built sites
For headless setups (Next.js, Nuxt, etc.), inject the JSON-LD block server-side in the page <head> using a component or layout wrapper. Avoid rendering it client-side only via JavaScript, as some AI crawlers don't execute JS fully during indexing.
Frequently Asked Questions
Does NewsArticle schema guarantee that ChatGPT or Perplexity will cite my article?
No. Schema markup improves your chances by making your content easier for AI systems to parse and attribute, but citation decisions depend on many factors including content quality, domain authority, recency, and how well your article answers the query. Think of schema as a strong signal, not a guarantee.
Can I use NewsArticle schema on opinion pieces or editorials?
Technically yes, but it's not always the best fit. NewsArticle is intended for factual reporting on current events. For opinion or commentary, the OpinionNewsArticle subtype is more accurate. For evergreen editorial content, stick with the plain Article type to avoid misleading AI systems about the nature of the content.
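Switching to the opinion subtype only means changing the @type value; the rest of the block keeps the same shape. A short sketch, with an invented headline and URL and the example author and publisher from earlier:

```json
{
  "@context": "https://schema.org",
  "@type": "OpinionNewsArticle",
  "headline": "Why the Bank of England Should Hold Rates This Summer",
  "datePublished": "2025-05-16T08:00:00+01:00",
  "author": {
    "@type": "Person",
    "name": "Sarah Malone",
    "url": "https://yoursite.com/authors/sarah-malone"
  },
  "publisher": {
    "@type": "Organization",
    "name": "The Finance Observer",
    "url": "https://yoursite.com"
  },
  "mainEntityOfPage": {
    "@type": "WebPage",
    "@id": "https://yoursite.com/opinion/hold-rates-summer-2025"
  }
}
```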
How often should I update my NewsArticle schema?
Update the dateModified field every time you make a meaningful change to the article content. You don't need to update schema for minor typo fixes, but any factual update, new quote, or added context warrants a timestamp change. This signals freshness to AI crawlers that revisit indexed pages.
Does having NewsArticle schema help with Google News inclusion?
It's one of several factors rather than a ticket in on its own. Google News also looks at your site's overall publishing frequency, whether you have a dedicated news section, byline policies, and whether your content meets their news quality guidelines. Schema alone won't get you into Google News, but it's part of the technical foundation you need. Once you're in Google News, your content becomes more accessible to AI systems that draw on that index.

