Why video content gets ignored by AI search engines
AI search engines like ChatGPT, Perplexity, and Gemini are fundamentally text-first systems. They read, process, and cite written content. Video, by its nature, is not directly readable - a crawler cannot watch a three-minute YouTube tutorial and extract structured information from it the way it can parse a well-written article.
This creates a real problem for brands that have invested heavily in video content. You might have a brilliant product walkthrough, a detailed how-to series, or an expert interview that would be genuinely useful to someone asking a question on Perplexity. But if your video page gives an AI crawler nothing to read beyond a title and an embed, it simply cannot cite it.
VideoObject schema is the fix. It is a structured data format that wraps your video content in machine-readable metadata: the name, description, transcript, duration, thumbnail, upload date, and more. When implemented properly, it transforms a video page from a dead end into a source an AI engine can actually use.
What VideoObject schema actually contains
VideoObject is part of the Schema.org vocabulary. It sits within the CreativeWork hierarchy and tells search systems the essential facts about a video. Here are the properties that matter most for AI visibility:
name
The title of the video. Keep this specific and descriptive. "How to install a radiator valve in 5 steps" is far more useful to an AI than "Product tutorial #4".
description
This is one of the most important fields for AI citation. Write a genuine, detailed description of what the video covers. Think of it as a mini-article summary. Three to five sentences at minimum. AI systems pull from this field when generating answers, so treat it like content, not metadata.
uploadDate
ISO 8601 format: 2024-11-15. Freshness matters to AI engines. Perplexity in particular tends to prefer recently updated sources when answering time-sensitive questions.
duration
Also ISO 8601 format: PT4M30S for four minutes and thirty seconds. Not the most critical field for AI citations, but it signals completeness and professionalism in your markup.
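If your CMS stores video length as a number of seconds, the duration string can be generated rather than typed by hand. The following is a minimal sketch; the function name iso8601_duration is hypothetical, and it assumes whole, non-negative seconds.

```python
def iso8601_duration(total_seconds: int) -> str:
    """Format whole seconds as an ISO 8601 duration string such as PT4M30S."""
    if total_seconds < 0:
        raise ValueError("duration cannot be negative")
    hours, remainder = divmod(total_seconds, 3600)
    minutes, seconds = divmod(remainder, 60)
    parts = ["PT"]
    if hours:
        parts.append(f"{hours}H")
    if minutes:
        parts.append(f"{minutes}M")
    # Emit seconds if non-zero, or if nothing else was emitted (0 -> "PT0S").
    if seconds or len(parts) == 1:
        parts.append(f"{seconds}S")
    return "".join(parts)


print(iso8601_duration(270))  # PT4M30S
```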
thumbnailUrl
A direct URL to the video thumbnail image. Google requires this for video rich results; AI systems may also treat it as a sign that the resource exists and is properly published.
contentUrl and embedUrl
Include both where possible. contentUrl is the direct URL to the video file; embedUrl is the iframe embed address (e.g., the YouTube embed URL). AI crawlers that follow URLs can use these to verify the resource.
transcript
Officially, the Schema.org property for this is transcript, though it is not widely documented in Google's guidelines. Include it anyway. A full or partial transcript placed in this field gives AI crawlers a direct text version of your spoken content. This is the single biggest factor in whether a video page gets cited.
A working VideoObject JSON-LD example
Here is a clean example you can adapt (swap the example.com URLs and the placeholder YouTube ID for your own). Implement it as a <script type="application/ld+json"> block, typically in the <head> of your video page:
{
  "@context": "https://schema.org",
  "@type": "VideoObject",
  "name": "How to Install a Thermostatic Radiator Valve: Step-by-Step Guide",
  "description": "In this video, our lead engineer walks through the complete process of fitting a thermostatic radiator valve (TRV) on a standard UK central heating system. Covers the tools required, draining the system safely, fitting the valve body, and bleeding the radiator afterwards. Suitable for DIY homeowners with basic plumbing experience.",
  "thumbnailUrl": "https://www.example.com/images/trv-install-thumbnail.jpg",
  "uploadDate": "2024-10-03",
  "duration": "PT6M45S",
  "contentUrl": "https://www.example.com/videos/trv-install.mp4",
  "embedUrl": "https://www.youtube.com/embed/abc123xyz",
  "transcript": "Right, so today we are going to fit a thermostatic radiator valve. First things you will need: an adjustable spanner, a pair of grips, PTFE tape, and a towel. Start by turning off your boiler...",
  "publisher": {
    "@type": "Organization",
    "name": "Example Heating Supplies",
    "url": "https://www.example.com"
  }
}
Notice how the description is written like a real paragraph, not a string of keywords. That is intentional. AI systems are language models - they respond to natural, informative text far better than keyword-stuffed summaries.
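If you manage more than a handful of video pages, you will probably generate this markup from stored fields rather than write it by hand. Here is a minimal Python sketch of that idea. The function name video_jsonld_script and the shape of the input dict are assumptions for illustration, not part of any library.

```python
import json


def video_jsonld_script(video: dict) -> str:
    """Wrap VideoObject fields in a JSON-LD <script> block (hypothetical helper)."""
    data = {"@context": "https://schema.org", "@type": "VideoObject", **video}
    payload = json.dumps(data, indent=2, ensure_ascii=False)
    # "\/" is a valid JSON escape for "/", so this keeps the JSON intact while
    # stopping a stray "</script>" inside a transcript from closing the tag early.
    payload = payload.replace("</", "<\\/")
    return f'<script type="application/ld+json">\n{payload}\n</script>'


print(video_jsonld_script({
    "name": "How to Install a Thermostatic Radiator Valve: Step-by-Step Guide",
    "uploadDate": "2024-10-03",
    "duration": "PT6M45S",
}))
```

Serialising with json.dumps rather than string templates means quotes and line breaks in a description or transcript are escaped for you.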
The transcript field is doing more work than you think
Let's be direct about this: the transcript is the most powerful element of your VideoObject schema for AI citation purposes. Here is why.
When Perplexity or ChatGPT processes a page, it looks for text it can quote and attribute. A video embed with a short description gives it almost nothing to work with. But a page that includes a full transcript, either in the schema markup, in visible page text, or ideally both, suddenly becomes a text-rich resource that an AI can extract specific answers from.
If you host videos on YouTube or Vimeo, you can often pull auto-generated transcripts from those platforms and clean them up. For shorter videos, it takes maybe 20 minutes to transcribe manually. Those 20 minutes could be the difference between your video page being cited in AI answers and being completely invisible.
Place the transcript in two places: inside the transcript property in your JSON-LD, and as visible text on the page itself, formatted under a heading like "Video transcript". The visible text version helps with traditional SEO too. It is one of the clearest wins available for video-heavy sites.
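Exported captions usually arrive with cue numbers and timestamps that you do not want in a transcript. The sketch below shows one way to strip them, assuming the export is in a WebVTT or SRT style layout; the function name captions_to_transcript is hypothetical, and real exports may need extra rules. You would still want to proofread the result.

```python
import re

# Matches cue timing lines such as "00:00:03.000 --> 00:00:06.500" (WebVTT)
# or "00:00:03,000 --> 00:00:06,500" (SRT).
TIMING_LINE = re.compile(r"^\d{2}:\d{2}(:\d{2})?[.,]\d{3}\s+-->\s+")


def captions_to_transcript(captions: str) -> str:
    """Collapse WebVTT/SRT-style captions into one plain-text paragraph."""
    kept = []
    for raw in captions.splitlines():
        line = raw.strip()
        # Skip blanks, the WEBVTT header, numeric cue identifiers and timing lines.
        # Caveat: a spoken line made up only of digits would be dropped as well.
        if not line or line.startswith("WEBVTT") or line.isdigit() or TIMING_LINE.match(line):
            continue
        line = re.sub(r"<[^>]+>", "", line)  # drop inline tags such as <i> or <c>
        # Auto-captions often repeat a line across consecutive cues; keep one copy.
        if line and (not kept or kept[-1] != line):
            kept.append(line)
    return " ".join(kept)
```

The returned string can go into the transcript property and, once tidied, under the visible "Video transcript" heading.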
Where to implement VideoObject schema
VideoObject schema belongs on any page where a video is the primary content. This includes:
- Dedicated video pages on your own site (e.g., a tutorial library or video blog)
- Product pages that feature a product demo video as a key element
- Blog posts built around an embedded video with supporting written content
- Landing pages with explainer videos, provided the video content is substantive
Do not implement VideoObject on pages where the video is purely decorative (a background loop on a homepage hero, for example). Schema markup should accurately reflect the page content. Inaccurate or misleading schema can actively damage your AI visibility, not just fail to help it.
You can read more about what happens when schema markup is implemented incorrectly in our post on what happens if your schema markup contains errors.
Combining VideoObject with other schema types
VideoObject does not have to live alone on a page. In fact, pairing it with other schema types makes the page significantly richer for AI crawlers.
VideoObject + HowTo
If your video is a step-by-step tutorial, combine VideoObject with HowTo schema on the same page. The HowTo schema gives the AI a structured list of steps it can cite directly; the VideoObject confirms the video resource exists and adds supporting detail. This combination is particularly powerful for how-to queries, which are among the most common AI search prompts.
We have a dedicated guide to using HowTo schema to get featured in AI step-by-step answers if you want to go deeper on that pairing.
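One way to express the pairing is to nest the VideoObject inside the HowTo via the video property, which HowTo inherits from CreativeWork. The sketch below builds that structure in Python; the function name howto_with_video and the sample steps are illustrative assumptions, and a separate top-level VideoObject block is an equally reasonable layout.

```python
import json


def howto_with_video(name: str, steps: list[str], video: dict) -> dict:
    """Build a HowTo node with numbered steps and a nested VideoObject."""
    return {
        "@context": "https://schema.org",
        "@type": "HowTo",
        "name": name,
        "step": [
            {"@type": "HowToStep", "position": i, "text": text}
            for i, text in enumerate(steps, start=1)
        ],
        "video": {"@type": "VideoObject", **video},
    }


markup = howto_with_video(
    "How to install a thermostatic radiator valve",
    ["Turn off the boiler.", "Drain the system.", "Fit the valve body."],
    {"name": "TRV installation walkthrough", "uploadDate": "2024-10-03"},
)
print(json.dumps(markup, indent=2))
```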
VideoObject + FAQPage
If your video answers common questions (a product FAQ video, for example), wrap the same page in FAQPage schema as well. Write out the questions and answers as text, and let the VideoObject schema point to the video that covers them. The AI gets text it can cite immediately, and the video gets attributed as the source.
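As a rough sketch, the FAQPage and the VideoObject can sit side by side in a single JSON-LD block using the @graph keyword. The helper name faq_and_video_graph and the input shapes are assumptions for illustration; two separate script blocks would work as well.

```python
import json


def faq_and_video_graph(faqs: list[tuple[str, str]], video: dict) -> dict:
    """Combine an FAQPage and a VideoObject as sibling nodes in one @graph."""
    faq_page = {
        "@type": "FAQPage",
        "mainEntity": [
            {
                "@type": "Question",
                "name": question,
                "acceptedAnswer": {"@type": "Answer", "text": answer},
            }
            for question, answer in faqs
        ],
    }
    return {
        "@context": "https://schema.org",
        "@graph": [faq_page, {"@type": "VideoObject", **video}],
    }


graph = faq_and_video_graph(
    [("Do I need to drain the system first?", "Yes, drain it before removing the old valve.")],
    {"name": "Radiator valve FAQ video"},
)
print(json.dumps(graph, indent=2))
```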
VideoObject + Organization
Always include a publisher property referencing your Organization schema. This links the video back to your brand entity, which helps AI systems understand who produced the content and builds authority for future citations.
Common mistakes that kill AI visibility for video pages
In our work with e-commerce and content brands on structured data implementation, a few patterns come up repeatedly as the reasons video pages fail to get cited.
Descriptions that are too short
A one-sentence description like "Watch our product demo" tells an AI nothing useful. Write at least 100 words in the description field. Describe what the viewer will learn, the specific topics covered, who it is aimed at, and what makes it useful. Think of it as the text version of your video.
Missing upload dates or wrong format
A missing uploadDate, or one formatted as "November 2024" rather than 2024-11-01, will cause validation errors. Schema.org defines uploadDate as an ISO 8601 date, and Google's validators expect that format. Get this right from the start.
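A quick pre-publish check can catch this before a validator does. The sketch below leans on Python's standard library parsers; the function name is hypothetical, and the exact set of ISO 8601 variants accepted depends on your Python version, so treat it as a first-pass filter rather than a full validator.

```python
from datetime import date, datetime


def is_iso8601_upload_date(value: str) -> bool:
    """Return True if value parses as an ISO 8601 date or date-time."""
    for parse in (date.fromisoformat, datetime.fromisoformat):
        try:
            parse(value)
            return True
        except ValueError:
            continue  # try the next parser, then give up
    return False


for candidate in ("2024-11-01", "2024-10-03T09:30:00+00:00", "November 2024"):
    print(candidate, is_iso8601_upload_date(candidate))
```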
No supporting text on the page
A video embed with schema markup but no visible text content on the page is a weak signal. AI crawlers want to see the schema confirmed by visible content. A transcript, a written summary, or a detailed article accompanying the video all strengthen the page's citability significantly.
Implementing schema on YouTube pages instead of your own site
You cannot add VideoObject schema to a YouTube video page - Google controls that. Your schema needs to live on your own site, on a page you own. If you are embedding YouTube videos and not creating dedicated pages for them on your own domain, you are missing the opportunity entirely. Build a proper page for each video that matters.
How AI engines actually use VideoObject data
It helps to understand what happens on the other end. When a crawler like GPTBot or Perplexity's bot visits your video page, it reads the raw HTML including any JSON-LD blocks. The VideoObject schema gives it a clean, structured summary of the resource without having to parse messy paragraph text.
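To make that concrete, here is a small sketch of how any program could pull VideoObject data out of raw HTML using only Python's standard library. It illustrates what is visible in the page source; it is not a description of how GPTBot or any other specific crawler is implemented, and the class and function names are made up for this example.

```python
import json
from html.parser import HTMLParser


class JsonLdExtractor(HTMLParser):
    """Collect the parsed contents of every application/ld+json script block."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._buffer = []
        self.blocks = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self._in_jsonld = False
            try:
                self.blocks.append(json.loads("".join(self._buffer)))
            except json.JSONDecodeError:
                pass  # malformed JSON-LD is skipped, which is why errors cost you


def find_video_objects(html: str) -> list[dict]:
    """Return every top-level VideoObject node found in the page's JSON-LD."""
    parser = JsonLdExtractor()
    parser.feed(html)
    parser.close()
    found = []
    for block in parser.blocks:
        nodes = block if isinstance(block, list) else [block]
        found.extend(
            n for n in nodes if isinstance(n, dict) and n.get("@type") == "VideoObject"
        )
    return found
```

Running find_video_objects against your own page source is a cheap way to confirm the markup is actually present in the HTML a crawler receives, rather than injected later by JavaScript.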
When a user then asks a question that your video addresses, the AI system matches the query to indexed content. Pages with structured data that clearly describe their content are far more likely to surface as relevant sources. The citation mechanism varies by platform: Perplexity lists sources directly; ChatGPT with browsing enabled cites URLs; Gemini attributes sources in its responses.
All of these systems benefit from the same thing: a clear, text-rich, well-structured page that tells them exactly what the content is about. VideoObject schema is the primary tool for achieving that with video content.
At FlinnSchema, we work with e-commerce and content brands to implement exactly this kind of structured data at scale, including VideoObject, HowTo, and a range of other schema types that directly improve AI search visibility. If you want to see how your current video pages are performing, our free AI visibility audit is a good place to start.
Frequently Asked Questions
Does VideoObject schema help with Google as well as AI search?
Yes. Google uses VideoObject schema to power video rich results in standard search, including the video carousel and key moments features. Implementing it properly gives you a double benefit: better visibility in traditional search results and a stronger signal for AI search engines. The transcript field is less prominent in Google's official documentation, but the core fields (name, description, thumbnailUrl, uploadDate) are well-documented and actively used.
Do I need to host the video on my own site for VideoObject schema to work?
No. You can embed a YouTube or Vimeo video and still implement VideoObject schema on your own page. Use the embedUrl field to point to the YouTube embed URL. Only add contentUrl if you have a direct URL to the video file itself, such as a copy hosted on your own server; a YouTube watch page is not a video file, so leave contentUrl out rather than pointing it there. The schema lives on your page, not on the video hosting platform.
How long should the transcript be in the schema markup?
As long as it needs to be. Schema.org sets no upper limit on the transcript property. A full verbatim transcript is ideal, but even a detailed partial transcript covering the key points is far better than nothing. If your video is under ten minutes, aim for a complete transcript. For longer videos, cover at least the first half and the key sections that answer specific questions.
Can I use VideoObject schema on a page that has multiple videos?
Yes, but each video needs its own VideoObject block. Use an array of VideoObject entries in your JSON-LD, or implement separate script blocks for each. Avoid trying to describe multiple videos in a single VideoObject instance - that creates ambiguity and reduces the usefulness of the markup for AI systems trying to match specific content to specific queries.
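As a minimal sketch of the array approach, the snippet below turns a list of per-video field dicts into one JSON-LD array with a separate VideoObject for each. The function name videos_jsonld and the sample fields are assumptions for illustration.

```python
import json


def videos_jsonld(videos: list[dict]) -> str:
    """Serialise one VideoObject node per video as a JSON-LD array."""
    nodes = [
        {"@context": "https://schema.org", "@type": "VideoObject", **video}
        for video in videos
    ]
    return json.dumps(nodes, indent=2, ensure_ascii=False)


print(videos_jsonld([
    {"name": "Fitting the valve body", "duration": "PT3M10S"},
    {"name": "Bleeding the radiator", "duration": "PT2M5S"},
]))
```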
