The Answerability Index · commercial insurance brokers · pilot · Real capture · 2026-05-28

We asked five AI systems which commercial insurance brokers to consider. They named 72 — and agreed on almost none.

No broker is the consensus pick. Ask ChatGPT and it leans on the global giants — Marsh, Aon, Lockton. Ask Perplexity the same question and it reaches for regional and specialty shops the others never mention. Commercial insurance brokerage is a molten category: the answer is still contestable.

72 brokers named, across six buyer questions
0.38 inter-engine overlap (Jaccard · 1.00 = full agreement)
0% unanimous top broker, on any single prompt
27% held by the top three (Aon · Hub · WTW), of all appearances

Executive takeaway

Commercial-insurance retrieval is fragmented by risk type, company size, and specialty language — not settled. Broad broker authority wins the generic questions; explicit, machine-readable specialization wins the high-intent ones (cyber, D&O, construction). The consequence for a broker: AI is already reshaping discovery — which firms make the shortlist — well before it touches placement, which still runs on human expertise.

What this page measures

The rows are not a ranking. They record observed surfacing: how often a company entered the AI-mediated consideration set across a bounded battery of buyer questions. The heatmap maps citation territory: for each question the engines repeatedly surface a small set of companies, and those companies currently hold the answer layer for that question. The question is not "who is best?" but "who appears when the buyer asks?"

Observed surfacing — not endorsement, advice, or a suitability judgment. This measures observed AI surfacing behavior, not broker quality, broker suitability, or recommendation quality. Insurance is regulated and varies by state and line; nothing here is insurance advice or a solicitation. These pages sit inside the same Content / Retrieval / Trust architecture as the rest of our working papers on AI-mediated buyer discovery.

What AI returns depends on the kind of risk being placed

Observation  The 0.38 overlap is an average, and it hides the real pattern. Hold one buyer question fixed and vary only the engine, and the firms change; vary the question, and the type of firm changes as well. Across the six situations we tested, four distinct retrieval patterns appeared.

Buyer situation: Established & generic (mid-market manufacturer)
What AI surfaces: National brokers: Lockton, Aon, Marsh. The most concentrated question (overlap 0.27).
Why (mechanism): Breadth plus the largest corroborated public footprint; the engines reach for the obvious incumbents.
Implication: Incumbents dominate; on-page work alone rarely displaces them.

Buyer situation: Complex / multi-line (multi-location placement)
What AI surfaces: The widest, most fragmented field: 23 firms, overlap 0.11, no consensus.
Why (mechanism): No single firm is corroborated as "the" answer for complex risk, so the engines spread.
Implication: Open territory; specialization and segment content can claim it.

Buyer situation: Digital-native (startup cyber & E&O)
What AI surfaces: Insurtech wins: Embroker (named by all five engines), Founder Shield, Vouch. The global brokers fall back.
Why (mechanism): Content-first, machine-readable firms, corroborated in startup channels, out-surface size.
Implication: A legible, answer-shaped digital presence beats incumbency.

Buyer situation: Product-led (construction liability)
What AI surfaces: Carriers appear alongside brokers.
Why (mechanism): The query implies a coverage product, not an advisor, so distribution shifts upstream.
Implication: Brokers must own the advisory framing to stay in the set.

Implication  "AI visibility" is not one thing in insurance — it is per line of business. A broker can own the manufacturer answer and be entirely absent from cyber. A single score would hide that; the measurement has to be taken line by line.

Why insurance retrieval fragments

Most categories settle on a canonical answer. Commercial insurance resists it for structural reasons — and knowing them is the difference between "we should do some SEO" and knowing which page to build.

Category temperature: how settled is the answer?

Frozen

High cross-engine overlap (Jaccard ≥ ~0.55) and low rank variance — the engines have converged on a small canonical set. On-page changes alone are unlikely to displace the top tier; the question shifts to defending edge cases.

Molten

Moderate overlap (~0.30–0.55) and high churn across engines — the answer has not set. Adjacent firms enter and exit the set, and answer-shaped, entity-clear, retrievable content can still claim citation territory.

Commercial insurance brokers are a molten category — mean inter-engine overlap 0.38, and no prompt produced the same top broker across all five engines. For calibration, US airlines is frozen (0.64); industrial machinery is molten (0.34).
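The bands above can be read as a simple threshold rule. The following is a minimal sketch, assuming the approximate cut-offs quoted in the definitions (0.55 and 0.30) are applied to the mean overlap alone. The function name and the "below molten band" label are hypothetical; the text does not name a band under 0.30, and the rank-variance and churn criteria in the definitions are not modeled here.

```python
def category_temperature(mean_overlap: float) -> str:
    """Map a mean inter-engine Jaccard overlap to a temperature label.

    Assumptions (not confirmed by the source):
    - only the overlap is used; rank variance and churn are ignored
    - the approximate thresholds ~0.55 and ~0.30 are treated as hard cut-offs
    - values under 0.30 get a placeholder label, since the text defines none
    """
    if not 0.0 <= mean_overlap <= 1.0:
        raise ValueError("Jaccard overlap must lie in [0, 1]")
    if mean_overlap >= 0.55:
        return "frozen"
    if mean_overlap >= 0.30:
        return "molten"
    return "below molten band"


# Calibration points quoted in the text
calibration = {
    "US airlines": 0.64,
    "commercial insurance brokers": 0.38,
    "industrial machinery": 0.34,
}
labels = {name: category_temperature(v) for name, v in calibration.items()}
for name, label in labels.items():
    print(f"{name}: {label}")
```

Under these assumptions the three calibration categories land where the text places them: US airlines frozen, the other two molten.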

Temperature: Frozen
What it means: The same firms across every engine
What tends to work: Brand authority and corroboration; on-page work alone rarely displaces the top tier

Temperature: Molten
What it means: The set reshuffles by engine; no consensus
What tends to work: Claim territory; answer-shaped, entity-clear content can still change who gets named

Temperature: Fragmented
What it means: A different answer for each buyer question
What tends to work: Persona- and line-specific retrieval surfaces; win each question, not "the category"

How to read this

From prompt to citation territory: Buyer prompt → Five AI systems → Companies surfaced → Overlap & σ → Citation territory
Surfacing rate: how often a company appears
Overlap: how much the engines agree on the answer set
σ divergence: how unevenly the engines treat a given company
Observed surfacing & cross-engine divergence
5 engines · 6 prompts · 3 runs/engine · captured 2026-05-28

Broker                    ChatGPT  Claude  Gemini  Perplexity  Grok  Overall    σ
Hub International              50      67      83          83    83       73   13
Marsh [A]                     100      83      50          50    67       70   19
Aon                           100      67      50          33    83       67   24
Gallagher                      83      83      67          17    83       67   26
Lockton [B]                   100      33      50          17    17       43   31
Willis Towers Watson           83      67      33          17    17       43   27
Alliant                         0      33      67          17    67       37   27
Brown & Brown                  17      17       0           0    50       17   18
EPIC Insurance Brokers          0      17      50           0    17       17   18
Embroker                       17      17      17          17    17       17    0
Founder Shield                 17      17      17          17    17       17    0
Acrisure                        0       0      17          33    17       13   12
Amwins                          0      33      17           0     0       10   13
Bowen Miclette & Britt          0       0      17          17    17       10    8

Surfacing rate (scale 0% to 100%): share of the 6 prompts in which the broker was surfaced
σ — cross-engine divergence (std. dev. across the 5 engines)
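The three columns of the heatmap can be reproduced from per-engine prompt counts. The sketch below assumes the counts out of 6 implied by the published percentages (for example 50% is read as 3 of 6 prompts) and assumes σ is the population standard deviation across the five engines; the source says only "std. dev.", and the population form is the one that matches the published figures for these rows. The names `prompt_hits` and `surfacing_summary` are hypothetical.

```python
from statistics import mean, pstdev

N_PROMPTS = 6
ENGINES = ["ChatGPT", "Claude", "Gemini", "Perplexity", "Grok"]

# Prompts (out of 6) in which each engine surfaced the broker.
# These counts are inferred from the published percentages, not taken
# from raw capture data.
prompt_hits = {
    "Hub International": [3, 4, 5, 5, 5],
    "Marsh": [6, 5, 3, 3, 4],
    "Lockton": [6, 2, 3, 1, 1],
    "Embroker": [1, 1, 1, 1, 1],
}


def surfacing_summary(hits):
    """Return per-engine surfacing rates (%), their mean, and their spread."""
    rates = [100 * h / N_PROMPTS for h in hits]
    return {
        "rates": [round(r) for r in rates],
        "overall": round(mean(rates)),   # mean of the five engine rates
        "sigma": round(pstdev(rates)),   # population std. dev. (assumption)
    }


summaries = {name: surfacing_summary(h) for name, h in prompt_hits.items()}
for name, s in summaries.items():
    print(name, dict(zip(ENGINES, s["rates"])), s["overall"], s["sigma"])
```

A σ of 0, as for Embroker, means every engine surfaced the firm at the same rate; a high σ, as for Lockton, means the overall figure is carried by one or two engines.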

One question, five answers

The clearest way to see fragmentation is to ask all five engines the same question and read across. Here are the top three brokers each one surfaces, ranked by surfacing, for a venture-backed startup buying cyber & E&O coverage. ChatGPT names the global giants; every other engine leads with insurtech.

ChatGPT: Aon, Marsh, Lockton
Claude: Embroker, Corgi, Woodruff Sawyer
Gemini: Embroker, Insureon, Founder Shield
Perplexity: Founder Shield, Embroker, Nadler
Grok: Embroker, Alliance Risk, Founder Shield

Legend (firm type): National broker · Insurtech / digital · Regional / specialty

Top three per engine for "which broker should a venture-backed startup use for cyber & E&O?", observed across three runs, 2026-05-28. Only ChatGPT returns the incumbents; the other four surface digital-native firms. The same buyer, on a different engine, meets a different market.
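For a concrete sense of what a Jaccard overlap looks like, the top-three lists for this one prompt can be compared pairwise. This is a sketch under stated assumptions: the page does not say how per-prompt overlap is aggregated, so the mean over all engine pairs is assumed, and the sets are truncated to three names each, so the result is illustrative and not a reproduction of any published figure.

```python
from itertools import combinations

# Top three per engine for the startup cyber & E&O prompt, as listed above
top_three = {
    "ChatGPT": {"Aon", "Marsh", "Lockton"},
    "Claude": {"Embroker", "Corgi", "Woodruff Sawyer"},
    "Gemini": {"Embroker", "Insureon", "Founder Shield"},
    "Perplexity": {"Founder Shield", "Embroker", "Nadler"},
    "Grok": {"Embroker", "Alliance Risk", "Founder Shield"},
}


def jaccard(a: set, b: set) -> float:
    """Size of the intersection divided by size of the union (1.00 = identical sets)."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


# One Jaccard score per unordered pair of engines (10 pairs for 5 engines)
pairwise = {
    (e1, e2): jaccard(top_three[e1], top_three[e2])
    for e1, e2 in combinations(top_three, 2)
}

# Assumed aggregation: unweighted mean over all pairs
mean_overlap = sum(pairwise.values()) / len(pairwise)

for pair, score in pairwise.items():
    print(pair, round(score, 2))
print("mean pairwise overlap:", round(mean_overlap, 2))
```

On these truncated lists, every pair involving ChatGPT scores 0, while Gemini, Perplexity, and Grok share two of three names with each other (0.5 per pair), which is the split the caption describes.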

The six prompts behind these numbers (commercial / B2B buyer-intent)
  1. Which commercial insurance brokers are commonly recommended for mid-market manufacturers?
  2. Which insurance brokers are commonly suggested for venture-backed startups buying cyber and E&O coverage?
  3. Which commercial insurance brokers are commonly surfaced for multi-location businesses with complex risk?
  4. Which insurance brokers are commonly recommended for construction firms with liability exposure?
  5. Which commercial insurance brokers are commonly recommended for healthcare practices and groups?
  6. Which insurance brokers are commonly suggested for placing management-liability and D&O coverage?

Scope — commercial brokers, US. We measured the firms that help a business buy insurance (brokers and agencies), not carriers/underwriters; carriers that surfaced (e.g. Chubb, Travelers, The Hartford) were set aside as a different question. Broker names were canonicalized to a parent where engines used variants (e.g. "Arthur J. Gallagher & Co." → Gallagher; the Marsh-family names → Marsh). The rows shown are the brokers surfaced across multiple engines; a long tail of single-engine regional shops is summarized in the notes, not named individually.
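The canonicalization and carrier filtering described in the scope note could look roughly like the sketch below. It is an illustration, not the pipeline used for this capture. The Gallagher alias and the three carriers come from the text, and "WTW" and "Willis Towers Watson" both appear on this page; "Marsh McLennan" is an assumed example of a Marsh-family variant, and all function and variable names are hypothetical.

```python
# Lower-cased variant -> canonical parent name
ALIASES = {
    "arthur j. gallagher & co.": "Gallagher",  # example given in the scope note
    "wtw": "Willis Towers Watson",             # both forms appear on this page
    "marsh mclennan": "Marsh",                 # assumed Marsh-family variant
}

# Carriers named in the scope note as set aside
CARRIERS = {"Chubb", "Travelers", "The Hartford"}


def canonicalize(name: str) -> str:
    """Collapse whitespace, then map known variants to their parent name."""
    cleaned = " ".join(name.split())
    return ALIASES.get(cleaned.lower(), cleaned)


def brokers_only(names):
    """Canonicalize, de-duplicate, and drop carriers from a list of surfaced names."""
    canonical = {canonicalize(n) for n in names}
    return sorted(canonical - CARRIERS)


surfaced = ["Arthur J. Gallagher & Co.", "Gallagher", "WTW", "Chubb", "  Marsh  McLennan "]
cleaned_names = brokers_only(surfaced)
print(cleaned_names)
```

Mapping variants to one parent before counting matters for the surfacing rates: without it, a firm named under two spellings would be split across two rows.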

Strategic reading: a molten field is contestable

Unlike a frozen category, no single broker owns the commercial-insurance answer. The most-surfaced names are large national brokers — but even they are named inconsistently, and a wide field of regional and specialty firms surfaces on one engine and not the others. For a broker, that is the opportunity: the answer is not locked.

What appears to move surfacing here, in rough order: corroboration density (how redundantly the web names you for a line and a segment), entity clarity (whether your firm resolves cleanly across acquired brands and regional offices, a recurring problem in a roll-up-heavy industry), retrieval surface (whether an AI crawler can reach your practice and line-of-business pages), and answer-shaped content for the specific buyer question: cyber for startups, D&O, contractor liability. A broker named by only one engine has a legible exposure, and a path to claim the territory the others haven't settled.

Two engines, two different brokers

The clearest pattern is not a ranking — it is that the engines have personalities. ChatGPT behaves like a brand-name search: it returns the global brokers (Marsh, Aon, Lockton, Gallagher) on almost every prompt. Perplexity behaves like a directory crawl: it ranks those giants low and surfaces regional and specialty brokers — names no other engine mentions — drawn from agency directories and local "best brokers" lists. Claude, Gemini, and Grok sit between the two. The same buyer need, asked of two systems, returns two different shortlists.

And one firm is punching above its size. Hub International surfaces most overall — ahead of both Marsh and Aon — despite not being a top-tier global broker by revenue. In a molten category, a deep, machine-readable content footprint can out-surface market share: AI prominence and real-world size are not the same ranking.

Search builds a candidate list. AI builds the consideration set.

Observation  Search hands a buyer a candidate list — a page of links to work through. The five engines hand back a consideration set — a short, named shortlist — before the buyer clicks anything. The shortlisting has already happened.

Mechanism  The model reads the directories, the trade press, and the firms' own pages, then returns a handful of names. In this capture the engines leaned on Risk & Insurance, Insurance Journal, and broker directories more than on the firms' own claims, which suggests that earned media carries more weight than self-description.

Implication  The consideration set now forms privately, inside the model, before a sales conversation. If your firm is not in it, you are not losing the deal; you were never in the running.

Observed hypotheses

Patterns this capture is consistent with — stated as hypotheses, not conclusions, given a bounded sample.

  1. Specialization is legible. Firms with explicit vertical or line specialization appear more often on specialized prompts — insurtech on cyber/startup, specialty brokers on complex placement.
  2. Breadth wins the generic question. Broad, established prompts favor the largest national brokers and the directories that aggregate them.
  3. Advice versus product shifts the surface. When a prompt implies a coverage product, carriers enter the set; when it implies advisory or complex placement, brokers dominate.

Methodology note. A bounded pilot capture: 5 AI systems, 6 commercial buyer-intent prompts, 3 runs per engine, captured 2026-05-28. We measured commercial insurance brokers (firms that place coverage), not carriers; carrier names that surfaced were set aside. Rows show observed surfacing within this prompt battery, not endorsements, quality rankings, broker suitability, or general market-share estimates. Broker names were canonicalized from extracted outputs; ambiguous aliases were reviewed. The Answerability Index · pilot.

Research publication based on sampled AI outputs collected on 2026-05-28. Findings reflect observed outputs in this sample and are not statements of company quality, broker suitability, recommendation quality, or business performance, and are not insurance advice or a solicitation.