The retrieval layer nobody optimizes for
Every argument about "AI visibility" happens at the chatbot. The decisions about whether you make it into the answer happen one floor down — in search infrastructure most companies have never heard of, let alone optimized for.

When you ask ChatGPT about a company, the chatbot is the part you can see. Behind it sits a quieter machine: a system that turns your question into searches, sends those searches to one or more web indexes, and pulls back the handful of pages it will actually read. That middle system — the part that decides which documents even enter the room — is the retrieval layer. It is where being found is won or lost, and almost nobody is optimizing for it, because almost nobody knows it's there.
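The flow described above can be sketched in a few lines. This is a toy illustration, not how any particular engine works: the index names, URLs, the "acme" company, and the `expand_question`, `search`, and `retrieve` functions are all hypothetical, and real indexes rank by relevance rather than by insertion order.

```python
from typing import Dict, List

# Hypothetical in-memory "indexes": each maps a page URL to its text.
# Real web indexes are remote services; these stand-ins only show the flow.
INDEXES: Dict[str, Dict[str, str]] = {
    "index_a": {
        "example.com/about": "acme builds grid battery storage systems",
        "example.org/review": "review of acme battery storage pricing",
    },
    "index_b": {
        "example.net/thread": "forum thread comparing battery storage vendors",
    },
}


def expand_question(question: str) -> List[str]:
    """Turn one user question into several search queries (toy rule-based version)."""
    base = question.lower().rstrip("?")
    return [base, f"{base} reviews", f"{base} pricing"]


def search(index: Dict[str, str], query: str) -> List[str]:
    """Return URLs whose text shares at least one word with the query."""
    terms = set(query.split())
    return [url for url, text in index.items() if terms & set(text.split())]


def retrieve(question: str, max_pages: int = 2) -> List[str]:
    """Fan queries out to every index and keep only the first few distinct pages."""
    seen: List[str] = []
    for query in expand_question(question):
        for index in INDEXES.values():
            for url in search(index, query):
                if url not in seen:
                    seen.append(url)
    # Only this handful is ever read; everything after the cut is invisible to the answer.
    return seen[:max_pages]


pages = retrieve("acme battery storage")
```

In this sketch three pages match, but only two are passed on to be read. The forum thread is reachable and still never enters the room.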
Retrieval Surface: the specific slice of documents, pages, and entity records that AI systems can reach, parse, and repeatedly surface when answering questions about a company — the machine-readable shadow your brand casts across embeddings indexes, structured feeds, and agentic search layers. It is not the same as "your website."
The reason this matters now, and didn't 18 months ago, is that the retrieval layer stopped being one thing. The answer layer is plural, embeddings-driven, and agent-consumed. Three shifts, each worth understanding.
The plumbing changed, and nobody told the marketers
For a decade, "search a webpage index programmatically" effectively meant the Bing Search API. A huge share of apps that needed real-time web data, including early AI products, were quietly built on it. Then Microsoft retired the Bing Search API on August 11, 2025, having disabled new API keys months earlier, in March.[1] Apps broke; pipelines had to be re-plumbed.
Microsoft's own replacement, Grounding with Bing Search inside Azure AI Foundry, isn't a like-for-like swap: it returns synthesized answers with citations rather than raw results, and it comes with platform lock-in. Independent replacements exist, but reporting at the time put their cost anywhere from 40% to 483% higher than the old API.[1] A boring piece of infrastructure went away, and the boring change reshaped who AI systems can see.
A new layer rose: search built for machines
Into that gap came a category of search engines whose customer is not a human but a model. They return clean, structured, token-efficient content designed to drop straight into an LLM's context window. Described by the role each plays, as of May 2026:
| Layer | What it is | What it returns to the model |
|---|---|---|
| Exa | Neural / embeddings search (plus keyword and an auto mode); built for agents and RAG | Concept-matched excerpts ("highlights"), for roughly a 90% token reduction vs. full pages [2] |
| Brave | An independently crawled web index: 35B+ pages, 100M+ changes/day [3] | Pre-chunked, relevance-ranked markdown for grounding |
| Tavily | A search layer purpose-built for RAG pipelines | Aggregated results from up to 20 sites per call [3] |
| Perplexity API | A retrieval + synthesis service | Summarized answers with citations, rather than raw context |
These are roles, not rankings, and benchmarks disagree on which provider is "best." In one independent 100-query test, the top several providers were statistically indistinguishable.[4] Which is the point: this is a plural layer, and no single index sees the whole web, or all of you.
The key word is embeddings. Exa and its peers don't match your keywords; they match concepts, by turning text into vectors and finding what sits nearest in meaning. So whether you surface for "battery storage challenges" can depend on whether your content reads, in vector space, like "difficulties accumulating renewable power" — even though you never used those words. Keyword SEO does not reliably move that.
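A small sketch makes the contrast concrete. The two phrases come from the paragraph above; the three-dimensional vectors are invented by hand purely for illustration (a real embedding model would produce hundreds of dimensions and different values), and the third phrase is a made-up unrelated control.

```python
import math
from typing import Dict, List

query = "battery storage challenges"
passage = "difficulties accumulating renewable power"
unrelated = "quarterly earnings call transcript"  # hypothetical control phrase

# Keyword view: which words do the query and the passage share?
shared_words = set(query.split()) & set(passage.split())

# Embedding view: hand-made vectors standing in for real model output.
# These numbers are assumptions chosen to illustrate the idea, not measurements.
VECTORS: Dict[str, List[float]] = {
    query: [0.9, 0.8, 0.1],
    passage: [0.8, 0.9, 0.2],
    unrelated: [0.1, 0.2, 0.9],
}


def cosine(a: List[float], b: List[float]) -> float:
    """Cosine similarity: 1.0 means same direction, 0.0 means orthogonal."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


near = cosine(VECTORS[query], VECTORS[passage])
far = cosine(VECTORS[query], VECTORS[unrelated])

print(f"shared keywords: {sorted(shared_words)}")
print(f"similarity to passage:   {near:.2f}")
print(f"similarity to unrelated: {far:.2f}")
```

With these toy vectors, the query and the passage share no words at all yet sit close together, while the unrelated phrase sits far away. A keyword matcher would see no overlap where a vector index sees a near neighbor.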
You have a Retrieval Surface — and it's probably thin, and inconsistent
Here is the reframe. The chatbots you test are the surface; underneath, agents are querying these embeddings-native indexes, and you may simply not be in some of them. Being "on Google" does not mean being legible to Exa, retrievable from Brave's index, or chunked cleanly by Tavily. Your Retrieval Surface is the union of where the machine layer can actually reach you — and for most companies it is both thinner and more uneven than they assume.
Uneven, because each layer holds a different index and matches differently. So the same question can resolve differently — or not at all — depending on which retrieval path an engine took. That is not a glitch; it is the structure.
- Path A: Firm A · Firm B · Firm C (built from encyclopedic and review-site sources)
- Path B: Firm B · Firm D · Firm E (built from community discussion and recent threads)
- Path C: your firm is not retrieved on this path
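One way to reason about the three paths above is as sets, with a firm's Retrieval Surface being the paths on which it appears. The sketch below copies Path A and Path B from the illustration; Path C's actual results are not listed there, so the empty set is only a placeholder, and `paths_reaching` is a hypothetical helper.

```python
from typing import Dict, List, Set

# Firms returned on each retrieval path, as given in the illustration.
# Path C's results are not listed; the empty set is a placeholder assumption.
PATHS: Dict[str, Set[str]] = {
    "Path A": {"Firm A", "Firm B", "Firm C"},
    "Path B": {"Firm B", "Firm D", "Firm E"},
    "Path C": set(),
}


def paths_reaching(firm: str) -> List[str]:
    """List the retrieval paths on which a firm is returned."""
    return sorted(name for name, firms in PATHS.items() if firm in firms)


surface = {firm: paths_reaching(firm) for firm in ["Firm B", "Firm D", "Your firm"]}

for firm, paths in surface.items():
    print(f"{firm}: {len(paths)} of {len(PATHS)} paths -> {paths}")
```

Here Firm B shows up on two paths, Firm D on one, and your firm on none. The same question yields a different shortlist depending on the path taken.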
How to think about your Retrieval Surface
You don't need our help to start reasoning about this. The useful questions are concrete, and they map to the things the layer actually checks:
- Can the machine layer reach you? Is your important content in server-rendered HTML the crawlers can read — or trapped behind JavaScript and PDFs they can't? (This is the Retrieval pillar.)
- Can it parse you into a clean excerpt? Embeddings search rewards self-contained, answer-shaped passages it can lift; a brilliant point buried in paragraph nineteen is, to a vector index, hard to surface.
- Can it resolve you to an entity? Do consistent, corroborated records exist — so the index is confident "you" are a real, single thing — or are you a blur of mismatched mentions? (This is the Trust pillar.)
- Are you reachable for the right concepts? In meaning-space, does your content sit near the questions buyers actually ask, or near the wrong neighborhood entirely?
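The first question in the list can be tested roughly with the Python standard library alone. This sketch assumes a crawler that does not execute JavaScript, which will not hold for every crawler; the two HTML snippets, the "Acme" name, and the `visible_without_js` helper are hypothetical.

```python
from html.parser import HTMLParser
from typing import List


class VisibleText(HTMLParser):
    """Collect text present in the raw HTML, skipping script and style bodies."""

    def __init__(self) -> None:
        super().__init__()
        self._skip_depth = 0
        self.chunks: List[str] = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth and data.strip():
            self.chunks.append(data.strip())


def visible_without_js(html: str, phrase: str) -> bool:
    """True if the phrase appears in the markup itself, before any script runs."""
    parser = VisibleText()
    parser.feed(html)
    parser.close()
    return phrase.lower() in " ".join(parser.chunks).lower()


# Hypothetical pages: the same sentence, delivered two different ways.
SERVER_RENDERED = (
    "<html><body><h1>Acme</h1>"
    "<p>Grid battery storage for utilities.</p></body></html>"
)
CLIENT_RENDERED = (
    "<html><body><div id='root'></div>"
    "<script>document.getElementById('root').textContent = "
    "'Grid battery storage for utilities.';</script>"
    "</body></html>"
)

print(visible_without_js(SERVER_RENDERED, "battery storage"))
print(visible_without_js(CLIENT_RENDERED, "battery storage"))
```

The sentence is identical in both pages, but only the server-rendered version carries it in markup that a non-rendering crawler can read. Running a check like this against your own key pages and key phrases is a cheap first pass at the reachability question.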
Answer those honestly and you'll already know more about your Retrieval Surface than most of your competitors know about theirs. None of it requires buying anything; it requires looking at the layer instead of the chatbot.
We map this layer for a living — measuring a company's Retrieval Surface across the five major engines is what the AI Answerability Diagnostic does. But the structure above holds whether or not you ever talk to us. That's rather the point: the retrieval layer is a real, describable thing, and it was worth describing.
References
1. Bing Search API retirement (effective Aug 11, 2025; key creation disabled March 2025) and replacement guidance: Microsoft Lifecycle announcement. On replacement cost (40–483% higher) and the re-architecture impact: PPC Land (2025); The Register (2025).
2. Exa product and benchmark claims (neural/keyword/auto; "highlights" ≈ 90% token reduction; 54.4% on FRAMES vs Perplexity 44.5% / Brave 21.6%): exa.ai; morphllm.com (accessed May 2026).
3. Search-API roles and index sizes (Brave's independent index ~35B+ pages, 100M+ changes/day, pre-chunked markdown; Tavily aggregating up to 20 sites/call): AIMultiple, "Agentic Search" (2026) and vendor documentation.
4. Benchmark indistinguishability among top providers (100-query evaluation): AIMultiple, "Agentic Search" (2026).