Why would my brand show up in ChatGPT but not Perplexity?

ChatGPT primarily pulls from training data, so brands with cumulative authority across many sources over time tend to appear there. Perplexity is a live-retrieval engine that crawls the web in real time and extracts passages from current pages. If PerplexityBot is blocked on your site, or if your content isn't structured for clean passage extraction, Perplexity may skip you entirely even if you rank well in Google.

What is PerplexityBot and do I need to allow it?

PerplexityBot is Perplexity AI's web crawler, similar to Googlebot but for AI-generated answers. If it's blocked in your robots.txt or by a WAF/CDN rule, Perplexity cannot retrieve your content for live answers. You should explicitly allow PerplexityBot, along with GPTBot, ClaudeBot, Google-Extended, and cohere-ai, unless you have a specific legal reason not to.

How does Gemini decide which brands to cite?

Gemini is deeply integrated with Google's Knowledge Graph, so it leans heavily on entity authority signals: Knowledge Panels, Organization schema markup, consistent brand mentions across trusted external sites, and clear on-site signals about what your brand is and does. If Google's entity understanding of your brand is weak, Gemini's generative layer inherits that uncertainty.

What does 'passage-level retrievability' mean in GEO?

AI engines don't just read your whole page — they try to extract specific passages that answer a query. Passage-level retrievability means your content is written and structured so that a 2-3 sentence answer can be cleanly pulled from each section without reading the entire page. Short paragraphs, direct answers at the start of each section, and FAQ formats all improve extractability.

Is there one GEO strategy that works for all AI engines?

There's a shared foundation — open crawler access, clean passage-ready content, consistent entity signals, and broad third-party coverage — that improves your odds across all engines. But each engine has specific retrieval preferences: ChatGPT rewards training-data authority, Perplexity rewards recency and structure, Gemini rewards entity clarity, and Claude rewards breadth of coverage. You need both the shared foundation and engine-specific fixes.

How often should I test my brand's visibility across AI engines?

At minimum, run a structured cross-engine audit every quarter. Test 5-10 representative queries across ChatGPT, Perplexity, Gemini, and Claude and document where you appear and where you don't. If you've made significant content changes, updated schema, or published a major content piece, test again within a few weeks to see if it moved the needle.

Does blocking AI crawlers hurt my SEO?

Blocking AI crawlers like GPTBot or PerplexityBot has no direct effect on your Google organic rankings — those crawlers are separate from Googlebot. However, blocking them will prevent you from being cited in live-retrieval AI answers, which is a growing traffic and visibility channel. The decision is yours, but if brand visibility across AI engines is a goal, open access is necessary.


AI Search • June 17, 2026 • 10 min read

Why Your Brand Shows Up in ChatGPT but Not Perplexity (or Gemini)

Matt Weitzman

Senior SEO Strategist & Co-Founder

You search for your brand in ChatGPT and there you are — mentioned by name, cited in the answer, looking credible. Then you open Perplexity and ask the same question. Nothing. Or worse, a competitor shows up where you should be. If you've felt that particular frustration, you're not imagining things. Brand visibility across AI engines doesn't work like a single leaderboard. Each engine has its own retrieval pipeline, its own crawler schedule, and its own idea of what makes a source worth citing. Understanding those differences is the whole game right now in Generative Engine Optimization (GEO).

This isn't an exotic edge case. I've seen it come up repeatedly on the conference circuit — brands that have done solid SEO work, built real authority, and still get inconsistent citations across engines. The fix isn't random. Once you understand why the pipelines diverge, you can address each gap deliberately.

The Core Problem: These Engines Are Not the Same Machine

Most marketers assume AI search works like Google — one index, one ranking system, one set of signals. That assumption will cost you. ChatGPT, Perplexity, Gemini, and Claude each have fundamentally different architectures for how they surface information about brands, products, and topics.

There are three main dimensions where these engines diverge: how they retrieve information (training data vs. live web retrieval), how they weigh authority vs. recency, and whether their crawlers can actually access your site. Get any one of those wrong for a specific engine and you disappear from its answers — even if you're well-established everywhere else.

Retrieval Pipeline Differences: Training Data vs. Live Web

ChatGPT's base model runs on training data with a knowledge cutoff. When you see your brand cited there without a live search, that's because you made it into OpenAI's training corpus — probably through third-party coverage, forum mentions, documentation, or content that was widely crawled before the cutoff. That's an authority signal baked into the model weights.

Perplexity is different. It's built as a real-time retrieval engine. It searches the live web, pulls passages from current pages, and synthesizes answers from what it finds right now. If your site wasn't crawled recently, or if your content isn't structured in a way that yields clean extractable passages, Perplexity may not cite you even if you rank well in Google.

Gemini sits in between. It combines Google's index and Knowledge Graph with generative synthesis, which means entity recognition and structured data matter enormously. If your brand isn't clearly defined as an entity in Google's understanding of the web, Gemini has less to work with — and it shows.

Claude (from Anthropic) tends to lean heavily on its training data for brand knowledge and is more conservative about citing sources it hasn't seen repeatedly across multiple contexts. Getting into Claude's answers often requires broader third-party coverage across trusted publications.

Crawler Access: The Silent Blocker Nobody Checks

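Here's something that trips up even experienced teams. Each AI engine has its own crawler or robots.txt token. OpenAI uses GPTBot. Perplexity uses PerplexityBot. ClaudeBot crawls for Anthropic. Google-Extended is a robots.txt control token rather than a separate crawler: it governs whether content Google has already crawled can be used for Gemini, and it does not affect Google Search, where AI Overviews follow the normal Googlebot rules. If any of those are blocked in your robots.txt, intentionally or by accident, that engine can't use your content for live answers.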

And yes, this happens more than most technical teams want to admit. A developer adds a blanket disallow rule during a site migration, or an overzealous CDN security rule blocks unfamiliar user agents. Suddenly Perplexity can't see your most important pages and you have no idea why you're not being cited.

Check your robots.txt right now. Specifically look for rules that block these user agents: GPTBot, PerplexityBot, Google-Extended, ClaudeBot, cohere-ai. According to research published by Originality.ai on AI bot blocking, a significant portion of major websites still block at least one of these crawlers. Don't let that be your brand.
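One way to run that check without eyeballing the file is to parse it programmatically. The sketch below uses Python's standard `urllib.robotparser` against a made-up robots.txt string. The sample rules, the example.com URL, and the `blocked_bots` helper are illustrative assumptions, not taken from any real site.

```python
from urllib.robotparser import RobotFileParser

# User-agent tokens named in this article.
AI_BOTS = ["GPTBot", "PerplexityBot", "ClaudeBot", "Google-Extended", "cohere-ai"]


def blocked_bots(robots_txt: str, url: str, bots=AI_BOTS) -> list[str]:
    """Return the bots that the given robots.txt text disallows for `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [bot for bot in bots if not parser.can_fetch(bot, url)]


# Hypothetical robots.txt: GPTBot is blocked site-wide, everyone else
# is only kept out of /admin/.
SAMPLE = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Disallow: /admin/
"""

print(blocked_bots(SAMPLE, "https://example.com/pricing"))  # ['GPTBot']
```

Python's parser applies the first matching rule in a group, which can differ from how individual crawlers resolve overlapping Allow and Disallow lines, so treat the result as a first pass rather than a verdict.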

Quick Crawler Access Checklist

Open yoursite.com/robots.txt and search for GPTBot, PerplexityBot, ClaudeBot, Google-Extended, cohere-ai
Make sure none of these user agents, and no wildcard User-agent: * group, carries a Disallow rule that covers pages you want cited
Check your CDN or WAF settings for bot-blocking rules that might catch these agents
If you added a blanket disallow during a site migration, audit it now
Confirm Cloudflare or similar security tools aren't classifying AI crawlers as threats
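A robots.txt check won't reveal a CDN or WAF block, because those happen at the HTTP layer. One rough way to spot them is to look at which status codes AI crawlers receive in your server logs. The sketch below assumes combined-log-style lines where the status code directly follows the quoted request. The log lines, IP addresses, and user-agent strings are invented for illustration, and `bot_status_counts` is a hypothetical helper. Google-Extended is left out because it is a robots.txt token and does not appear as its own user agent in logs.

```python
from collections import Counter, defaultdict

# Tokens to look for in the user-agent portion of each log line.
AI_BOT_TOKENS = ["GPTBot", "PerplexityBot", "ClaudeBot", "cohere-ai"]


def bot_status_counts(log_lines):
    """Count HTTP status codes per AI bot token found in access log lines."""
    counts = defaultdict(Counter)
    for line in log_lines:
        parts = line.split('"')
        if len(parts) < 3:
            continue  # not a combined-log-style line
        after_request = parts[2].split()
        if not after_request:
            continue
        status = after_request[0]
        lowered = line.lower()
        for token in AI_BOT_TOKENS:
            if token.lower() in lowered:
                counts[token][status] += 1
    return {bot: dict(statuses) for bot, statuses in counts.items()}


# Invented sample lines; real user-agent strings will look different.
SAMPLE_LOG = [
    '203.0.113.5 - - [17/Jun/2026:10:00:00 +0000] "GET /pricing HTTP/1.1" 403 512 "-" "Example (compatible; PerplexityBot/1.0)"',
    '203.0.113.5 - - [17/Jun/2026:10:00:05 +0000] "GET /about HTTP/1.1" 403 512 "-" "Example (compatible; PerplexityBot/1.0)"',
    '198.51.100.7 - - [17/Jun/2026:10:01:00 +0000] "GET / HTTP/1.1" 200 4096 "-" "Example (compatible; GPTBot/1.0)"',
    '192.0.2.9 - - [17/Jun/2026:10:02:00 +0000] "GET / HTTP/1.1" 200 4096 "-" "Mozilla/5.0"',
]

print(bot_status_counts(SAMPLE_LOG))
```

A bot that only ever receives 403 or 429 responses, or never shows up at all, is a hint that a security rule is catching it before robots.txt comes into play.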

Authority Weighting: Why ChatGPT Trusts You and Perplexity Doesn't

ChatGPT's training-based citations tend to reward cumulative authority over time — brands that appeared consistently across many sources over years have a higher signal density in the model. That's closer to traditional SEO authority. If you've been around a while, have press coverage, and have been mentioned in educational or community content, you're well-positioned in training-data-heavy models.

Perplexity weights recency and passage clarity much more heavily. A newer brand with clean, well-structured pages and recent coverage can outperform an older brand that has dusty, dense content. The engine is pulling live snippets, so it rewards pages that answer questions clearly and directly — not pages that bury the answer in four paragraphs of background.

Gemini's weighting is tied closely to Google's entity graph. It favors brands that have a well-defined Knowledge Panel, consistent NAP (name, address, phone) data for local brands, structured schema markup, and strong internal linking that reinforces topical authority. If your entity signals are weak in Google's eyes, Gemini's generative layer inherits that weakness.

This is why a single GEO strategy won't solve everything. You need to think about each engine's retrieval preferences separately, then find the overlapping fixes that serve all of them.

The Passage-Level Problem: Are You Actually Extractable?

One of the most underrated concepts in GEO right now is passage-level retrievability. AI engines don't just index your page — they try to extract the specific passage that answers a question. If your content is written in dense, unbroken paragraphs with no clear question-answer structure, the engine may skip your page entirely in favor of one that hands it a clean, quotable answer.

I've seen pages with strong domain authority get zero citations in live-retrieval engines simply because the content wasn't structured for extraction. The fix is often simpler than you'd expect: break up your content, use direct answers early in each section, write in clear declarative sentences, and use subheadings that match how people actually ask questions. The same structure helps with both traditional SEO rankings and AI search citations in 2026.

How to Make Your Content Passage-Ready

Lead each section with a direct answer, then support it — not the other way around
Write subheadings as questions or clear declarative phrases ("What GPTBot Can Access" beats "Technical Notes")
Keep paragraphs to 2-4 sentences — short blocks are easier to extract as discrete passages
Use bullet lists and numbered steps for any procedural content
Avoid burying key claims in qualifications or conditional language
Add an FAQ section to every important page — FAQ format is natively extraction-friendly
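The paragraph-length guideline in the list above is easy to check mechanically. Here is a minimal sketch that flags paragraphs running past four sentences. It assumes paragraphs are separated by blank lines and uses a deliberately naive sentence splitter, so abbreviations such as "e.g." will inflate the count. The `long_paragraphs` name and the default limit are illustrative choices.

```python
import re


def long_paragraphs(text: str, max_sentences: int = 4):
    """Return (paragraph_index, sentence_count) for paragraphs over the limit."""
    flagged = []
    # Paragraphs are assumed to be separated by one or more blank lines.
    paragraphs = [p.strip() for p in re.split(r"\n\s*\n", text) if p.strip()]
    for index, paragraph in enumerate(paragraphs):
        # Naive split: whitespace that follows ., ! or ? ends a sentence.
        sentences = [s for s in re.split(r"(?<=[.!?])\s+", paragraph) if s]
        if len(sentences) > max_sentences:
            flagged.append((index, len(sentences)))
    return flagged


page = "Short answer first. Then support it.\n\nOne. Two. Three. Four. Five."
print(long_paragraphs(page))  # [(1, 5)]
```

Anything it flags is a candidate for splitting, not an automatic rewrite. A long paragraph that opens with a direct answer may still extract cleanly.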

Entity Authority: The Gemini-Specific Gap

If Gemini specifically isn't citing you, entity authority is the first place to look. Gemini's generative layer is deeply integrated with Google's Knowledge Graph, which means it needs to confidently "know" what your brand is — not just that your domain exists.

Strong entity signals include: a Wikipedia or Wikidata entry (if your brand qualifies), a Google Knowledge Panel, consistent brand mentions across high-authority publications, structured data markup (Organization, Product, FAQPage schemas), and a well-linked About page that clearly states what you do, who you serve, and where you operate.

Think of it this way — Gemini is asking "do I know enough about this brand to responsibly include it in an answer?" The more entity-confirming signals you have scattered across the web, the more confidently it can say yes.
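For the structured data piece, Organization markup is usually published as a JSON-LD script on the site. The sketch below builds a minimal example using schema.org's `Organization` type with the `name`, `url`, `description`, and `sameAs` properties. The brand name, URLs, and the `organization_jsonld` helper are placeholders, and a real implementation would likely carry more properties than this.

```python
import json


def organization_jsonld(name, url, description, same_as):
    """Build a minimal schema.org Organization object as a JSON-LD string."""
    data = {
        "@context": "https://schema.org",
        "@type": "Organization",
        "name": name,
        "url": url,
        "description": description,
        # sameAs points at other profiles that describe the same entity.
        "sameAs": list(same_as),
    }
    return json.dumps(data, indent=2)


# Placeholder values for illustration only.
print(organization_jsonld(
    name="Example Brand",
    url="https://example.com",
    description="Example Brand makes scheduling software for dental clinics.",
    same_as=["https://www.wikidata.org/wiki/Q0", "https://www.linkedin.com/company/example"],
))
```

The output would go inside a `<script type="application/ld+json">` tag, typically on the home page or About page. The `sameAs` links are where the off-site entity signals and the on-site markup meet.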

Cross-Engine Diagnostic: A Fast Audit Framework

Before you start fixing things, figure out exactly where you stand. The goal is to test your brand's citation behavior across engines in a structured way, not just spot-check once and assume.

Test 5-10 brand-relevant queries across ChatGPT, Perplexity, Gemini, and Claude. Use both branded queries ("[Your Brand] reviews") and category queries ("best [your category] tools"). Document where you appear and where you don't.
Audit crawler access via robots.txt and your CDN/WAF settings for all five major AI bots.
Check your entity footprint: do you have a Knowledge Panel? Is your brand mentioned consistently on third-party sites? Does your schema markup correctly identify your organization?
Audit passage clarity on your most important pages: can you extract a clean 2-3 sentence answer to your target query from each page without reading the whole thing?
Review recency: when was your most authoritative content last updated? Perplexity especially rewards freshness.
Map the gaps: which engines miss you? That tells you whether the problem is training-data authority (ChatGPT gap), live retrieval structure (Perplexity gap), entity signals (Gemini gap), or broad coverage (Claude gap).
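The gap-mapping step can be kept honest with a small script instead of a gut read. In the sketch below, each engine gets a list of True/False results, one per test query, and any engine whose citation rate falls under a threshold is paired with the likely cause named in the framework above. The 0.5 threshold, the sample results, and the `map_gaps` helper are assumptions for illustration.

```python
# Likely problem area per engine, as described in the audit framework.
LIKELY_CAUSE = {
    "ChatGPT": "training-data authority",
    "Perplexity": "live retrieval structure",
    "Gemini": "entity signals",
    "Claude": "broad coverage",
}


def map_gaps(results, threshold=0.5):
    """Return {engine: (citation_rate, likely_cause)} for weak engines.

    `results` maps an engine name to a list of booleans, one per test
    query, where True means the brand was cited in the answer.
    """
    gaps = {}
    for engine, hits in results.items():
        if not hits:
            continue  # no queries recorded for this engine
        rate = sum(hits) / len(hits)
        if rate < threshold:
            gaps[engine] = (round(rate, 2), LIKELY_CAUSE.get(engine, "unknown"))
    return gaps


# Invented audit results for five queries per engine.
audit = {
    "ChatGPT": [True, True, True, True, False],
    "Perplexity": [False, False, True, False, False],
    "Gemini": [True, False, True, True, True],
    "Claude": [False, False, False, False, False],
}

for engine, (rate, cause) in map_gaps(audit).items():
    print(f"{engine}: cited in {rate:.0%} of queries, check {cause}")
```

Keeping the raw per-query results around also makes the follow-up audit comparable, since you can rerun the same queries and diff the rates.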

Closing the Gaps: Engine-Specific Fixes

If ChatGPT Isn't Citing You

The ChatGPT gap is almost always a training-data authority problem. You need broader third-party coverage — guest posts on recognized publications, citations in industry guides, mentions in community forums like Reddit, and documentation that gets widely linked. This is a slow build, but it compounds. The ChatGPT citation playbook covers the specific tactics in detail.

If Perplexity Isn't Citing You

Start with crawler access, then move to passage clarity. Perplexity is a live-retrieval engine — it needs to be able to crawl you and extract clean answers. After you've confirmed PerplexityBot isn't blocked, audit your top pages for extractability. Write cleaner, more direct content. Update it regularly. Perplexity rewards brands that publish fresh, well-structured answers to questions people are actually asking right now.

If Gemini Isn't Citing You

This is usually an entity problem. Strengthen your Organization schema, build toward a Knowledge Panel, get mentioned consistently on authoritative external sites, and make sure your brand's core identity signals are crystal clear across your own site. Think of it as convincing Google's Knowledge Graph that you're a real, well-defined entity, not just a domain.

If Claude Isn't Citing You

Claude is conservative by design. It tends to cite brands it has seen repeatedly across diverse, trustworthy contexts in its training data. Breadth of coverage matters here — not just one big press mention, but consistent presence across a range of credible sources. Technical documentation, industry wikis, educational content, and long-form editorial coverage all contribute.

The Signals That Work Across Every Engine

For all the variation between engines, there's a core set of signals that improve your odds everywhere. These are the investments worth making regardless of which engine is your priority.

Open crawler access to all major AI bots — no disallow rules for GPTBot, PerplexityBot, ClaudeBot, Google-Extended
Passage-ready content structure on every important page
FAQ sections on product, service, and resource pages
Consistent third-party brand mentions across high-authority domains
Organization and FAQPage schema markup implemented correctly
Regular content updates that signal freshness to live-retrieval engines
A clear, well-linked About page that defines your brand as an entity
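For the FAQPage item in the list above, the markup follows a fixed shape: a `FAQPage` whose `mainEntity` is a list of `Question` objects, each with an `acceptedAnswer`. The sketch below generates that structure from question and answer pairs. The `faq_jsonld` helper is hypothetical, and the sample pair is condensed from this article's own FAQ.

```python
import json


def faq_jsonld(pairs):
    """Build schema.org FAQPage JSON-LD from (question, answer) pairs."""
    data = {
        "@context": "https://schema.org",
        "@type": "FAQPage",
        "mainEntity": [
            {
                "@type": "Question",
                "name": question,
                "acceptedAnswer": {"@type": "Answer", "text": answer},
            }
            for question, answer in pairs
        ],
    }
    return json.dumps(data, indent=2)


faq = [
    (
        "What is PerplexityBot and do I need to allow it?",
        "PerplexityBot is Perplexity AI's web crawler. If it is blocked, "
        "Perplexity cannot retrieve your content for live answers.",
    ),
]
print(faq_jsonld(faq))
```

The questions and answers in the markup should match what is visibly on the page, so generating both from the same source data is a reasonable way to keep them in sync.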

Where to Start

Run the cross-engine diagnostic above before you change anything. Test 5-10 queries, document your gaps, and let the pattern tell you where to focus. If you're missing in Perplexity but not ChatGPT, start with crawler access and passage clarity. If Gemini is the gap, start with entity signals. If it's ChatGPT, you're playing a longer game that requires building training-data authority through broader coverage.

GEO isn't one fix. It's a set of engine-specific fixes that share common foundations. The brands that win across all AI engines are the ones who understand that each engine is asking a slightly different question about whether to trust and cite them.

If you want a faster read on where you stand right now, Aergos has a free AI visibility checker that shows you how your brand is being cited across the major AI engines — a solid place to anchor your audit.

Frequently Asked Questions


Glossary terms in this article

Brush up on the definitions.

Topical Authority

The perceived depth and breadth of expertise a website demonstrates on a subject area, influencing how search engines rank its content.

Domain Authority

Moz's proprietary 1–100 score predicting how likely a domain is to rank in search engine results, based on its link profile.

Internal Linking

Hyperlinks that connect pages within the same website, distributing link equity, improving crawlability, and helping users navigate related content.

Knowledge Panel

An information box that appears on the right side of Google SERPs, displaying facts about entities like brands, people, and places.

Knowledge Graph

A structured database of entities and their relationships that search engines use to understand and connect real-world concepts.

Structured Data

A standardised format for providing information about a page and classifying its content so search engines can better understand it.

About Matt Weitzman

Senior SEO Strategist & Co-Founder

Matt has over 15 years of experience in technical SEO and digital marketing. He specializes in algorithmic recovery, enterprise architecture, and leveraging AI for content scaling. He is a frequent speaker at search marketing conferences.
