We measure inclusion for high-intent prompts, and we only count hits that match the selected entity.
A run starts by resolving your entity (the exact business you choose). During scoring, we only count a “hit” when the output can be attributed to that entity with high confidence (domain / website match, or strong identity match).
We force a single entity selection before we run prompts.
If an AI system can’t distinguish you from a similarly named business, it tends to default to safer, better-known entities. Entity resolution makes the rest of the run interpretable.
The selected entity is recorded (name, website/domain, location signals). That becomes the reference for evidence gating later.
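A minimal sketch of what this record and the later attribution check could look like. The field names and the domain-only matching rule are illustrative assumptions; the real system also uses identity signals beyond the domain.

```python
from urllib.parse import urlparse

# Hypothetical entity record captured at run start (placeholder values).
ENTITY = {
    "name": "Example Plumbing Co.",
    "domain": "example.com",
    "location": "Austin, TX",
}

def attributed_to_entity(cited_url: str, entity: dict) -> bool:
    """High-confidence attribution via domain match.

    Identity matching (name + location signals) is omitted here;
    this only checks whether a cited URL belongs to the entity's domain.
    """
    host = urlparse(cited_url).netloc.lower()
    return host == entity["domain"] or host.endswith("." + entity["domain"])
```

Anything that fails this gate is treated as not attributable, which feeds the NOT_PRESENT handling described below.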
Standardized, repeatable prompts designed to reflect buying intent.
Prompts are variations of “best {service} in {location}” (plus nearby intent phrasing). Pack size controls how many prompts are tested.
One prompt can be noisy. A pack shows whether you’re consistently included for the market you care about.
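A pack can be generated mechanically from intent templates. The template wording below is a hypothetical stand-in (the actual pack phrasing isn't specified here); the point is that the same standardized set is filled in per market.

```python
# Hypothetical intent templates; the real pack wording may differ.
TEMPLATES = [
    "best {service} in {location}",
    "top rated {service} near {location}",
    "who should I hire for {service} in {location}?",
]

def build_prompt_pack(service: str, location: str, pack_size: int) -> list[str]:
    """Fill each template with the market's service and location,
    truncated to the configured pack size."""
    prompts = [t.format(service=service, location=location) for t in TEMPLATES]
    return prompts[:pack_size]
```

Because the pack is deterministic for a given service/location, runs are repeatable and comparable over time.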
We run the same prompt pack across multiple generative surfaces.
Different systems retrieve, rank, and cite differently. Measuring more than one avoids building strategy on one model’s quirks.
Google’s AI features can appear selectively depending on the query and context, and are intended to include links for deeper exploration.
We reduce noise by using rank buckets instead of fragile exact ordering.
For each prompt+surface pair we assign a rank bucket (TOP_3, TOP_5, TOP_10, or NOT_PRESENT). This produces stable aggregates even when outputs shuffle slightly.
If attribution fails (can’t tie the mention to your entity), the bucket becomes NOT_PRESENT. This prevents inflated scores from ambiguous name matches.
The score is computed from bucket weights, then adjusted by coverage.
We use consistent weights so the score is explainable:
TOP_3 = 1.00
TOP_5 = 0.70
TOP_10 = 0.40
NOT_PRESENT = 0.00
surface_score = average(bucket_weight over prompts)
base_0_100 = 100 * weighted_average(surface_score_i)
Coverage = how many selected surfaces include you at least once. We apply a multiplier so “present on 1/3 surfaces” can’t look like a dominant win.
coverage_multiplier:
1/3 surfaces present -> 0.50
2/3 surfaces present -> 0.75
3/3 surfaces present -> 1.00
AVI = round(base_0_100 * coverage_multiplier)
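Putting the formulas above together, the score can be sketched as below. One assumption: the weighted average over surfaces is shown here with equal surface weights, since per-surface weights aren't specified.

```python
WEIGHTS = {"TOP_3": 1.00, "TOP_5": 0.70, "TOP_10": 0.40, "NOT_PRESENT": 0.00}
COVERAGE_MULTIPLIER = {1: 0.50, 2: 0.75, 3: 1.00}  # surfaces present out of 3

def avi(buckets_by_surface: dict[str, list[str]]) -> int:
    """Compute AVI from per-surface bucket lists.

    buckets_by_surface maps a surface name to one bucket per prompt.
    Equal surface weights are assumed for the weighted average.
    """
    surface_scores = [
        sum(WEIGHTS[b] for b in buckets) / len(buckets)
        for buckets in buckets_by_surface.values()
    ]
    base_0_100 = 100 * sum(surface_scores) / len(surface_scores)
    # Coverage: surfaces where the entity appears at least once.
    present = sum(
        1 for buckets in buckets_by_surface.values()
        if any(b != "NOT_PRESENT" for b in buckets)
    )
    return round(base_0_100 * COVERAGE_MULTIPLIER.get(present, 0.0))
```

For example, strong presence on one surface, partial on another, and absence on the third yields a modest score: the coverage multiplier caps what a single-surface spike can contribute.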
Confidence is measurement quality, not a guarantee.
These are trust-signal checks tied to discoverability and attribution.
Google explicitly frames Organization/LocalBusiness structured data as a way to help systems understand and disambiguate an entity.
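A minimal JSON-LD sketch of that structured data (all values are placeholders; the schema.org types and property names are real, but which properties matter most for your business will vary):

```json
{
  "@context": "https://schema.org",
  "@type": "LocalBusiness",
  "name": "Example Plumbing Co.",
  "url": "https://example.com",
  "address": {
    "@type": "PostalAddress",
    "addressLocality": "Austin",
    "addressRegion": "TX"
  },
  "sameAs": ["https://www.example-directory.com/example-plumbing"]
}
```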
Some sites publish an /llms.txt file as a helper for LLM/agent consumption. It’s not a guarantee of inclusion, but it can make your canonical docs easier to find.
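The llms.txt proposal is a plain markdown file at the site root; a hedged sketch with placeholder content (the H1 title, blockquote summary, and link sections follow the proposal's format):

```markdown
# Example Plumbing Co.

> Licensed plumbing services in Austin, TX. Key pages for agents below.

## Docs

- [Services](https://example.com/services): what we offer and coverage area
- [Contact](https://example.com/contact): booking and business hours
```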
PDF is for humans; the artifact is for auditability.
The PDF contains: overall score, surface breakdown, coverage/confidence, and prioritized fixes.
The artifact exists so the result isn’t “trust me bro.” It should include enough raw detail to reproduce the score: the per-prompt outputs, bucket assignments, and attribution evidence behind each number.
Terms you’ll see in the report.
AVI: A 0–100 measurement of how often your business is included across selected AI surfaces for a standardized prompt pack.
GEO (Generative Engine Optimization): Improving visibility in generative engine responses; “visibility metrics” and evaluation are central to the academic framing.
Coverage: How many selected surfaces include you at least once. Used to avoid overvaluing a single-surface spike.
Confidence: How stable and attributable the measurement is (not a promise of outcomes).
Evidence gating: We only score a mention after we can tie it to the chosen entity (domain/identity match). No ambiguous wins.
Limits: What this can’t do (and why that’s fine).