How to Use DefinedTermSet Schema to Get Cited in AI Glossaries

Schema Markup, JSON-LD, AI Visibility, DefinedTermSet, LLM SEO, Glossary SEO, Structured Data

Why AI engines love glossary-style content

When someone asks ChatGPT "what is schema markup?" or Perplexity "what does JSON-LD mean?", those AI engines need to pull a definition from somewhere. They are not generating those answers from thin air. They are synthesising content from pages that have clearly defined, well-structured term-and-definition pairs. If your site has a glossary, a terminology page, or even a single definition, you are sitting on a citation opportunity that most e-commerce and content sites completely ignore.

Definitional queries are among the most common query types in AI search. That means glossary content, done correctly, is one of the highest-ROI investments you can make for AI visibility. The catch is that raw HTML glossaries are harder for AI crawlers to interpret confidently. A page full of terms and definitions written in flowing prose gives the machine less certainty about what is a term and what is an explanation. That is where DefinedTermSet schema comes in.

What DefinedTermSet and DefinedTerm actually are

DefinedTermSet is a Schema.org type that wraps a collection of defined terms. Think of it as the container. Each individual entry inside that container is a DefinedTerm, which carries the actual name and description of the concept being defined.

Both types live under the Schema.org vocabulary and are implemented via JSON-LD, the same way you would add Product, Article, or FAQPage schema. They are not widely used yet, which is exactly why adopting them now gives you an advantage.

Here is what the core properties look like:

  • DefinedTermSet properties: name, description, url, hasDefinedTerm
  • DefinedTerm properties: name, description, url, inDefinedTermSet, termCode

The hasDefinedTerm property on the set points to each individual DefinedTerm. The inDefinedTermSet on each term points back to the set. You can implement them together in a single JSON-LD block or separately per page. We will cover both approaches below.

Building the JSON-LD: a practical example

Say you run an e-commerce store that sells marketing software, and you have a glossary page explaining terms like "conversion rate", "bounce rate", and "click-through rate". Here is how you would mark up the whole set in one JSON-LD block:

<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "DefinedTermSet",
  "name": "Marketing Metrics Glossary",
  "description": "Definitions of key marketing performance metrics used in e-commerce analytics.",
  "url": "https://example.com/glossary/marketing-metrics",
  "hasDefinedTerm": [
    {
      "@type": "DefinedTerm",
      "name": "Conversion Rate",
      "description": "The percentage of website visitors who complete a desired action, such as making a purchase or signing up for a newsletter. Calculated by dividing conversions by total visitors and multiplying by 100.",
      "url": "https://example.com/glossary/marketing-metrics#conversion-rate",
      "inDefinedTermSet": "https://example.com/glossary/marketing-metrics"
    },
    {
      "@type": "DefinedTerm",
      "name": "Bounce Rate",
      "description": "The percentage of visitors who leave a website after viewing only one page without taking any further action.",
      "url": "https://example.com/glossary/marketing-metrics#bounce-rate",
      "inDefinedTermSet": "https://example.com/glossary/marketing-metrics"
    },
    {
      "@type": "DefinedTerm",
      "name": "Click-Through Rate",
      "description": "The ratio of users who click on a specific link to the number of total users who view a page, email, or advertisement. Often expressed as a percentage.",
      "url": "https://example.com/glossary/marketing-metrics#click-through-rate",
      "inDefinedTermSet": "https://example.com/glossary/marketing-metrics"
    }
  ]
}
</script>

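Two things to notice here. First, each term has its own anchor URL using a fragment identifier (the #conversion-rate part). This is important because AI engines and crawlers can then link directly to the specific definition rather than just the top of your glossary page, which makes each individual definition more citable. For the fragment to resolve, the matching element in your HTML needs the same id (for example, id="conversion-rate" on the term's heading). Second, every term's inDefinedTermSet value repeats the set's own url, so the relationship is stated in both directions. The descriptions in this example are kept brief for readability; the length guidance later in this post applies to production glossaries.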

Single-term pages vs. full glossary pages

Some sites prefer individual definition pages, one URL per term, rather than a single long glossary. This is actually a strong structure for AI visibility because each page has a unique URL and can rank on its own. For this setup, place a DefinedTerm block on each individual page and a DefinedTermSet block on the parent glossary index page. The inDefinedTermSet property on each individual term then points to the parent index URL, creating a clear relationship in the structured data.
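As a rough sketch of the per-term setup, the block below is what might sit on an individual term page, inside its own <script type="application/ld+json"> tag. The page URL ending in /conversion-rate is a hypothetical path chosen for illustration; the parent index URL reuses the example glossary address from the previous section.

```json
{
  "@context": "https://schema.org",
  "@type": "DefinedTerm",
  "name": "Conversion Rate",
  "description": "The percentage of website visitors who complete a desired action, such as making a purchase or signing up for a newsletter. Calculated by dividing conversions by total visitors and multiplying by 100.",
  "url": "https://example.com/glossary/marketing-metrics/conversion-rate",
  "inDefinedTermSet": "https://example.com/glossary/marketing-metrics"
}
```

The parent index page would then carry the DefinedTermSet block, with each entry's url pointing at the term's own page rather than at a fragment.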

Adding termCode for extra precision

The termCode property is optional but worth using if your industry uses standard abbreviations or codes. For a finance glossary, you might set "termCode": "APR" for Annual Percentage Rate. For a technical glossary, acronyms work well here. Declaring the code explicitly in the schema ties the abbreviation to the full term, so an AI system handling a query that uses only the abbreviation has a better chance of matching it to your definition.
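For illustration, a finance glossary entry using termCode might look something like the block below. The example.com URLs and the wording of the description are assumptions made for this sketch, not taken from a real glossary.

```json
{
  "@context": "https://schema.org",
  "@type": "DefinedTerm",
  "name": "Annual Percentage Rate",
  "termCode": "APR",
  "description": "Annual percentage rate (APR) is the yearly cost of borrowing, expressed as a percentage of the loan amount and including interest plus certain fees.",
  "url": "https://example.com/glossary/finance#apr",
  "inDefinedTermSet": "https://example.com/glossary/finance"
}
```

Keeping the full term in name and the abbreviation in termCode means both forms are declared without cramming them into a single field.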

How AI engines process DefinedTermSet markup

Large language models are trained on web crawls, and retrieval-augmented generation (RAG) systems like Perplexity retrieve web content when answering. In both settings, structured data can act as a signal of confidence. When a page clearly declares "this is a defined term, this is its name, this is its description", the machine does not have to guess. Guessing introduces uncertainty, and uncertainty makes your page less likely to be selected as a citation source.

Think of it this way: two pages define "cost per acquisition". One is a blog post that mentions the definition in the third paragraph, buried in a wall of text. The other is a properly marked-up DefinedTerm with a clean, precise description field. The structured data page is making a clear assertion. The blog post is not. AI engines prefer clear assertions.

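There is also a practical crawling benefit. JSON-LD sits in the page source, so any crawler that fetches your HTML (Googlebot, GPTBot, ClaudeBot, and others) receives it along with the visible content. Google documents JSON-LD as a supported structured data format; AI vendors publish less about how their crawlers use it, so treat the benefit there as likely rather than guaranteed. A well-formed DefinedTermSet block gives any parser that reads it an unambiguous map of your content. You can learn more about how AI crawlers interact with structured data by reading our post on what GPTBot is and how to let it crawl your site.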

Writing descriptions that actually get cited

The schema itself is only half the job. The description field is what AI engines pull into their answers, so it needs to be written with that in mind. Here are the rules we follow at FlinnSchema when writing definition content for clients:

Lead with the core definition in the first sentence

Do not warm up to the definition. State it immediately. "Conversion rate is the percentage of visitors who complete a desired action." That first sentence is the one most likely to be extracted and cited. Everything after it is supporting context.

Keep descriptions between 40 and 120 words

Too short and the definition lacks enough detail to be authoritative. Too long and AI systems may truncate it awkwardly or overlook it in favour of a crisper source. Forty to one hundred and twenty words is the sweet spot based on what tends to surface in AI answers.

Use plain, direct language

Avoid jargon within the definition unless you also define that jargon. If your definition of "cost per acquisition" uses the phrase "customer lifetime value" without explaining it, you are creating a gap. Plain language also tends to score better because AI engines are optimising for user comprehension.

Include a formula or metric where applicable

For numerical or analytical terms, a simple formula in the description tends to increase citation likelihood. "Calculated by dividing total ad spend by the number of new customers acquired" is far more useful to an AI answer than a vague prose description. Specific information is more citable than vague information, full stop.
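Putting the four rules together, a description for "cost per acquisition" might read something like the sketch below. The URLs and the worked figures (1,000 spent, 50 customers) are made up for illustration. The description leads with the definition, includes the formula, avoids undefined jargon, and comes in at just under 50 words.

```json
{
  "@context": "https://schema.org",
  "@type": "DefinedTerm",
  "name": "Cost Per Acquisition",
  "termCode": "CPA",
  "description": "Cost per acquisition (CPA) is the average amount a business spends on advertising to gain one new customer. Calculated by dividing total ad spend by the number of new customers acquired. For example, spending 1,000 on ads that bring in 50 new customers gives a CPA of 20.",
  "url": "https://example.com/glossary/marketing-metrics#cost-per-acquisition",
  "inDefinedTermSet": "https://example.com/glossary/marketing-metrics"
}
```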

Technical implementation on Shopify and WordPress

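On Shopify, the cleanest approach is to add your JSON-LD block to a custom page template. Create a dedicated template for your glossary pages (for example page.glossary.liquid) and drop the DefinedTermSet script block into it. Template output is rendered inside the page's <body>, which is fine: JSON-LD is valid in either <head> or <body>. If you specifically want it in the <head>, add it to the layout file (theme.liquid) behind a condition that limits it to the glossary template. You can make it dynamic by pulling term names and descriptions from metafields if you have a large glossary.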

On WordPress, the approach depends on whether you use a plugin or hand-code. If you are using Rank Math or Yoast for other schema types, be careful about duplicate schema conflicts. It is often cleaner to add DefinedTermSet via a small custom plugin, or in the theme's functions.php with a function hooked to the wp_head action. Our post on stopping Yoast from adding duplicate schema covers how to manage this without breaking your existing structured data setup.

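Always validate your output. The Schema Markup Validator at validator.schema.org checks any Schema.org type, so it is the main tool for DefinedTermSet. Google's Rich Results Test only reports on types that are eligible for Google rich results, so it may show no detected items for a DefinedTermSet block even when the markup is valid. Common errors include missing @context declarations, malformed JSON (a stray trailing comma kills the whole block), and url values that do not resolve to real pages.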

Pairing DefinedTermSet with other schema types

DefinedTermSet does not have to work alone. On a page that is also an article or a guide, you can include both an Article (or WebPage) schema block and a DefinedTermSet block in separate JSON-LD scripts. They do not conflict. The Article block helps AI engines understand the editorial context, while the DefinedTermSet block makes each individual term citable.
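As a minimal sketch, the Article block for the example glossary page might look like the following, placed in its own <script type="application/ld+json"> tag next to the DefinedTermSet script. The headline and the "Example Store" author name are hypothetical placeholders.

```json
{
  "@context": "https://schema.org",
  "@type": "Article",
  "headline": "Marketing Metrics Glossary: Key Terms Explained",
  "description": "Definitions of key marketing performance metrics used in e-commerce analytics.",
  "url": "https://example.com/glossary/marketing-metrics",
  "author": {
    "@type": "Organization",
    "name": "Example Store"
  }
}
```

Both blocks use the same url, which signals that they describe the same page from two angles.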

If your glossary page includes FAQs like "what is the difference between CPC and CPM?", consider adding a FAQPage schema block as well. Stacking complementary schema types is a legitimate and effective strategy, provided each block is accurate and properly formed. For more on how FAQ schema performs in AI citation contexts, take a look at our analysis of whether Rank Math's FAQ schema works for ChatGPT and Perplexity citations.
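For the CPC versus CPM question, a FAQPage block could be sketched as below, again in its own script tag. The answer text is illustrative wording, and in practice it should match the answer shown on the page itself.

```json
{
  "@context": "https://schema.org",
  "@type": "FAQPage",
  "mainEntity": [
    {
      "@type": "Question",
      "name": "What is the difference between CPC and CPM?",
      "acceptedAnswer": {
        "@type": "Answer",
        "text": "CPC (cost per click) charges the advertiser each time someone clicks an ad. CPM (cost per mille) charges a fixed amount for every 1,000 impressions, regardless of clicks."
      }
    }
  ]
}
```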

Monitoring whether your definitions are getting cited

There is no direct "citation report" in Google Search Console for AI engines yet, so you have to be a bit creative. Run the specific definition queries in ChatGPT, Perplexity, and Gemini manually. Search for "what is [your term]" and see whether your site is referenced. Perplexity in particular tends to show citations explicitly, making it the easiest platform for monitoring.

Set up a simple tracker: a spreadsheet with the term name, the AI engine queried, the date, and whether your site was cited. Check monthly. If you implement DefinedTermSet markup and see a term start appearing in AI answers within 60 to 90 days, that is an encouraging signal, though not proof: AI answers vary between runs, and other changes to your site may have contributed. If specific terms are not being picked up, revisit the description quality before assuming the schema is the problem.

If you want a professional review of how your current structured data is performing across AI engines, our free AI visibility audit will surface exactly where the gaps are.

Frequently Asked Questions

Does Google officially support DefinedTermSet in rich results?

Not as a rich result type with a visual enhancement in the SERP, at least not at the time of writing. Google recognises the Schema.org types and processes them, but there is no dedicated rich result format like there is for FAQs or Products. The value here is specifically for AI engine citation and for giving any machine reading your content an unambiguous, structured signal about what your terms and definitions are.

How many terms should I include in a single DefinedTermSet?

There is no hard limit, but practical considerations apply. A single JSON-LD block with 200 terms becomes unwieldy and may increase page load slightly. More importantly, a glossary page with 200 terms on one URL is harder to get indexed deeply. We generally recommend either paginating large glossaries by category (each category gets its own DefinedTermSet) or using individual term pages linked from a parent index. Thirty to fifty terms per set is a manageable range for most sites.

Can I use DefinedTermSet for product terminology as well as general glossaries?

Absolutely. If you sell software, technical equipment, financial products, or anything with its own vocabulary, a branded glossary with DefinedTermSet markup is a strong AI visibility play. It positions your brand as the definitional authority for your product category. When AI engines answer "what is [your product type]", they may well pull from your page rather than a generic source like Wikipedia.

How is DefinedTermSet different from FAQPage schema?

FAQPage schema is structured as question-and-answer pairs, and the questions are written conversationally. DefinedTermSet is structured as term-and-definition pairs, where the "question" is always implicitly "what is this term?" They serve different content types. Use FAQPage for "how do I...", "why does...", and "what should I..." style content. Use DefinedTermSet specifically for glossary and definitional content. Using the right type for the right content matters because AI engines interpret schema types semantically, not just syntactically.

Want to check your AI visibility?

Run a free audit on your website and see how visible you are to ChatGPT, Perplexity, and other AI search engines.
