
How AI Search Engines Decide Which Businesses to Recommend

GEO · AI Visibility · Schema Markup

Every time someone asks ChatGPT "What is the best Italian restaurant near me?" or asks Perplexity "Which agency can help with Shopify SEO?", an AI engine has to decide which businesses to mention in its answer. It does not show ten blue links. It picks a handful of names, describes them, and often recommends one over the rest.

The question every business owner should be asking is: how does it choose?

This is not a mystery. The mechanisms are well-documented, and the field of Generative Engine Optimisation (GEO) exists specifically to address them. Here is what the research and our own testing across four major AI engines tell us about how these systems actually work.

AI Search Is Not Google Search

The first thing to understand is that AI-powered search works differently from traditional search at a fundamental level.

Google ranks web pages. It scores them based on relevance, backlinks, page authority, and hundreds of other signals, then shows a list of links. The user clicks through and reads the page themselves.

AI search engines generate answers. They search the web in real time, read multiple pages, synthesise the information, and produce a written response that directly answers the user's question. The user never needs to click through to your site. Your business either appears in that generated answer, or it does not exist in that channel.

According to Gartner's 2024 forecast, traditional search engine volume is expected to drop 25% by 2026 as users shift to AI-powered search. That shift is already happening. Understanding how AI decides which businesses to recommend is no longer optional.

How Retrieval-Augmented Generation Works

Most AI search engines use a process called Retrieval-Augmented Generation, or RAG. The simplified version works like this:

  1. Query interpretation: The AI parses the user's question to understand what they are actually looking for
  2. Web retrieval: The AI sends search queries to the web and retrieves relevant pages (this is why AI crawler access in your robots.txt matters so much)
  3. Content extraction: The AI reads each retrieved page, pulling out facts, names, descriptions, and supporting evidence
  4. Answer generation: The AI synthesises everything it found into a coherent, natural-language answer, citing specific businesses where appropriate

The critical step is number 3. When the AI reads your page, it needs to extract structured, accurate information about your business quickly. If your page is a wall of marketing copy with no clear structure, the AI may skip it in favour of a competitor whose site is easier to parse.
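To make the four steps concrete, here is a minimal sketch of the pipeline in Python. Everything here is hypothetical: the in-memory `index` stands in for live web retrieval, and real engines use an LLM for query interpretation and answer generation. Note how a page without structured data is silently dropped at step 3.

```python
def interpret_query(question: str) -> str:
    # Step 1: reduce the question to a search intent
    # (real systems use an LLM for this, not lowercasing)
    return question.lower().rstrip("?")

def retrieve_pages(search_query: str, index: dict) -> list[dict]:
    # Step 2: fetch candidate pages. A hypothetical in-memory index stands in
    # for live web retrieval; robots.txt gating happens at this stage.
    return list(index.get(search_query, []))

def extract_facts(pages: list[dict]) -> list[dict]:
    # Step 3: pull structured facts. Pages carrying machine-readable data are
    # trivial to parse; unstructured pages are skipped in this sketch.
    return [p["structured_data"] for p in pages if "structured_data" in p]

def generate_answer(facts: list[dict]) -> str:
    # Step 4: synthesise an answer citing the extracted businesses
    names = [f["name"] for f in facts]
    return "Recommended: " + ", ".join(names) if names else "No sources found."

# Hypothetical crawled-page index
index = {
    "best italian restaurant near me": [
        {"url": "https://example.com/a", "structured_data": {"name": "Trattoria A"}},
        {"url": "https://example.com/b"},  # no schema markup: skipped at step 3
    ]
}

query = interpret_query("Best Italian restaurant near me?")
print(generate_answer(extract_facts(retrieve_pages(query, index))))
```

The second page in the index illustrates the point above: it was retrieved, but because it exposes no structured data it contributes nothing to the final answer.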

The Five Factors That Determine Whether You Get Cited

Based on our testing across ChatGPT (OpenAI), Perplexity, Gemini (Google), and Grok (xAI), and supported by published research on LLM retrieval behaviour, there are five primary factors that influence whether a business gets recommended.

1. Structured Data (Schema Markup)

This is the single most influential factor. Schema.org markup, implemented as JSON-LD, tells AI engines exactly what your business is, what you offer, where you are located, and what your customers say about you.

The key schema types that AI engines rely on include Organization (schema.org type names use the US spelling), LocalBusiness, Product, Service, FAQPage, and Review. Google's own structured data documentation explains the technical format, but the principle is simple: if you provide machine-readable data about your business, AI can use it accurately. If you do not, AI has to infer everything from unstructured text, and inference is unreliable.
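As an illustration, a minimal LocalBusiness block might look like the following (all names, addresses, and ratings here are invented; the JSON-LD goes inside a `<script type="application/ld+json">` tag in your page's HTML):

```json
{
  "@context": "https://schema.org",
  "@type": "LocalBusiness",
  "name": "Example Trattoria",
  "description": "Family-run Italian restaurant in Manchester.",
  "url": "https://example.com",
  "address": {
    "@type": "PostalAddress",
    "streetAddress": "1 Example Street",
    "addressLocality": "Manchester",
    "addressCountry": "GB"
  },
  "aggregateRating": {
    "@type": "AggregateRating",
    "ratingValue": "4.8",
    "reviewCount": "212"
  }
}
```

Every field here is something an AI engine would otherwise have to guess from your copy. With the markup, there is nothing to guess.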

In our complete guide to schema markup, we cover every type that matters for AI visibility and how to implement them correctly.

2. AI Crawler Access

AI engines send their own web crawlers to read your pages. GPTBot (OpenAI), ClaudeBot (Anthropic), PerplexityBot, GoogleOther, and Bytespider (ByteDance) each need explicit or implicit permission in your robots.txt file to access your content.

Many websites unintentionally block these crawlers, either through overly restrictive robots.txt rules or because their hosting platform blocks non-standard user agents by default. If AI crawlers are blocked, your content simply does not exist to those AI engines. No amount of content quality or schema markup helps if the AI cannot read your pages in the first place.
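If you want to make that permission explicit rather than relying on defaults, a robots.txt can name each crawler's user agent directly. The tokens below are the ones the vendors publish, but they do change over time, so verify them against each vendor's current crawler documentation:

```text
# Explicitly allow the major AI crawlers
User-agent: GPTBot
Allow: /

User-agent: ClaudeBot
Allow: /

User-agent: PerplexityBot
Allow: /

User-agent: GoogleOther
Allow: /
```

Remember that robots.txt is only half the story: some hosting platforms and firewalls block unfamiliar user agents at the server level, which no robots.txt rule can override.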

3. Content Authority and Trust Signals

AI engines weigh trust signals heavily when deciding which sources to cite. This is similar to Google's E-E-A-T framework (Experience, Expertise, Authoritativeness, Trustworthiness), but AI engines operationalise it differently.

What matters most:

  • Third-party reviews: Ratings on Trustpilot, Google Business Profile, and industry-specific platforms provide independent validation that AI engines can verify
  • Citations and mentions: If other websites, forums (especially Reddit), and authoritative publications mention your business, AI engines are more likely to treat you as a credible source
  • Author and entity signals: Clear attribution of content to named individuals or established organisations adds trust weight

4. Content Structure and Comprehension

AI engines favour content that is clearly structured with proper heading hierarchies, concise paragraphs, and direct answers to common questions. Content that is written for AI comprehension differs from traditional SEO copy in a few ways:

  • It answers questions directly rather than teasing the reader to click through
  • It uses clear, factual language rather than vague marketing superlatives
  • It organises information under descriptive headings that AI can scan and categorise
  • It includes FAQ sections that match the exact phrasing real customers use

This does not mean your content has to be dry or robotic. It means it has to be genuinely helpful and well-organised. AI engines are remarkably good at distinguishing substance from filler.
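The FAQ sections mentioned above work best when the visible questions are paired with FAQPage markup, so the same question-and-answer pairs are machine-readable. A sketch, with an invented question:

```json
{
  "@context": "https://schema.org",
  "@type": "FAQPage",
  "mainEntity": [
    {
      "@type": "Question",
      "name": "Do you ship internationally?",
      "acceptedAnswer": {
        "@type": "Answer",
        "text": "Yes, we ship to most EU countries within 5 working days."
      }
    }
  ]
}
```

The answer text should match what appears on the page; the markup annotates your content, it does not replace it.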

5. Recency and Freshness

AI engines with web search enabled (which all four major engines now support) prefer recent, up-to-date content. Stale product pages, outdated pricing, and blog posts from three years ago carry less weight than current, actively maintained content.

This is one reason why regular content publishing and site maintenance matter for AI visibility: AI engines factor recency directly into their source selection, over and above any traditional SEO benefit.

What GEO Does About All of This

Generative Engine Optimisation (GEO) is the practice of optimising your website specifically for AI-powered search engines. It is distinct from traditional SEO because it targets the factors that AI engines actually use when generating answers, rather than the factors Google uses when ranking links.

A proper GEO strategy addresses all five factors above:

  • Implementing comprehensive schema markup across your site
  • Ensuring all major AI crawlers have access to your content
  • Building and maintaining trust signals that AI can verify
  • Structuring content for AI comprehension, not just keyword targeting
  • Keeping your site content current and actively maintained

At FlinnSchema, we audit all of this across 26 weighted factors and test your visibility against real prompts sent to ChatGPT, Perplexity, Gemini, and Grok with live web search enabled. We do not use API-only calls that skip web search. We test exactly what a real user would experience. You can read about how our testing methodology works and why it matters.

The Proof That GEO Works

This is not theoretical. We have taken businesses from zero AI mentions to being cited by all four major AI engines. One e-commerce client went from a visibility score of 12 to 71 after implementing schema markup, restructuring content, and opening up AI crawler access. Another, a Shopify jewellery store, went from 0 schema types to 8 and saw their search impressions increase by 155%.

Every one of these improvements came from addressing the five factors above: structured data, crawler access, trust signals, content structure, and freshness. The results are documented and independently verifiable on our results page.

If you want to understand exactly how to get cited by ChatGPT and other AI engines, we have written a practical guide covering the specific steps.

Check Where You Stand

The best starting point is to see where your business sits right now. Run a free AI visibility audit and you will get a score across all 26 factors in under 60 seconds. No credit card, no sales call. Just data.

If your score is below 50%, there are likely quick wins you can implement yourself. If it is below 30%, your business is effectively invisible to AI search, and that is a channel you cannot afford to ignore as more customers shift from Google to AI-powered alternatives.

AI search is not replacing Google overnight. But it is growing, and the businesses that optimise for it now will have a significant advantage when it becomes the default way people find services and products. The question is whether you want to be early or whether you want to be catching up.

Want to check your AI visibility?

Run a free audit on your website and see how visible you are to ChatGPT, Perplexity, and other AI search engines.
