Measurement Protocol
This page documents what Valoh measures, how it collects each signal, and what it deliberately does not do. It's intended to be readable by a measurement lead, an MRC-track auditor, or a CFO asking what they're paying for.
What we measure
Valoh tracks five signals about how a brand appears in conversational AI responses: mention rate, rank position, sentiment, competitor citations, and recommendation strength. These are the same signals that answer engine optimization (AEO) tools surface. Valoh's contribution is independence, transparent collection, and auditable retention of the underlying transcripts.
Every reported value can be traced back to a logged conversation. Nothing in the report is inferred; everything is observed.
Prompt Construction
For each measured category, Valoh constructs a prompt set up front. A prompt set contains 40–60 representative consumer queries spanning awareness, comparison, and purchase-intent stages. Once defined, the set is locked for the measurement window.
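As an illustration of what "locked" could mean in practice, the sketch below models a prompt set as an immutable record with a content fingerprint. The class names, the stage labels as strings, and the use of SHA-256 are assumptions made for this example, not a description of Valoh's internal implementation.

```python
import hashlib
import json
from dataclasses import dataclass

# Hypothetical stage labels, taken from the three stages named in the text.
STAGES = ("awareness", "comparison", "purchase-intent")


@dataclass(frozen=True)
class Prompt:
    stage: str
    text: str


@dataclass(frozen=True)
class PromptSet:
    """An immutable prompt set; frozen=True blocks attribute reassignment."""
    category: str
    prompts: tuple  # tuple of Prompt, so the contents cannot be mutated

    def __post_init__(self):
        # Enforce the 40-60 size range described in the text.
        if not 40 <= len(self.prompts) <= 60:
            raise ValueError("a prompt set must contain 40 to 60 prompts")
        # Require at least one prompt per stage.
        missing = set(STAGES) - {p.stage for p in self.prompts}
        if missing:
            raise ValueError(f"no prompts for stages: {sorted(missing)}")

    def fingerprint(self) -> str:
        """SHA-256 over the category and ordered prompts (an assumed scheme)."""
        payload = json.dumps(
            [self.category, [[p.stage, p.text] for p in self.prompts]],
            ensure_ascii=False,
        )
        return hashlib.sha256(payload.encode("utf-8")).hexdigest()
```

Recording the fingerprint at the start of a measurement window would let an auditor confirm later that the prompts in the logged transcripts are the ones that were originally defined.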
Model Coverage
| Model | Provider | Access | Daily queries / brand |
|---|---|---|---|
| ChatGPT | OpenAI · GPT-class flagship | API, default settings | ~50 |
| Gemini | Google · flagship | API, default settings | ~50 |
| Claude | Anthropic · flagship | API, default settings | ~50 |
| Perplexity | Perplexity · default model | API, default settings | ~50 |
Each model is queried independently. We do not average results across models; presence in ChatGPT and presence in Perplexity are different facts and are reported as such.
Sampling & Frequency
Sampling is calibrated so that the 95% confidence interval on mention rate is no wider than ±3 percentage points, per model per category. Sample size scales upward where category breadth or low base rates demand it.
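To give a sense of scale, the sketch below estimates how many responses such a target implies. It assumes the standard normal-approximation formula for a proportion, independent responses, and a reading of the ±3% target as an absolute margin of 3 percentage points. The function names are hypothetical and the calculation is illustrative, not Valoh's actual calibration procedure.

```python
import math


def required_sample_size(margin: float, p: float = 0.5, z: float = 1.96) -> int:
    """Responses needed so the CI half-width on a proportion is <= margin.

    Uses n = z^2 * p * (1 - p) / margin^2 (normal approximation).
    p = 0.5 is the worst case for an absolute margin; z = 1.96 gives 95%.
    """
    return math.ceil(z * z * p * (1.0 - p) / (margin * margin))


def days_to_reach(n: int, queries_per_day: int = 50) -> int:
    """Whole days needed to collect n responses at a fixed daily volume."""
    return math.ceil(n / queries_per_day)


n = required_sample_size(0.03)   # 1068 responses in the worst case
print(n, days_to_reach(n))       # 1068 22
```

Under these assumptions, roughly 1,068 responses per model per category are needed in the worst case, which at about 50 queries per day corresponds to about 22 days of collection. For a fixed absolute margin, a low base rate needs fewer responses, not more; low base rates drive the sample size up when the goal is a fixed relative precision.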
Signal Definitions
| Signal | Definition | Unit |
|---|---|---|
| Mention Rate | The percentage of category-anchored prompts in which the brand is named anywhere in the response. | % / day |
| Rank Position | When mentioned in an enumerated list, the average ordinal position of the brand within that list. | avg rank |
| Sentiment | Polarity of the language used to describe the brand within its surrounding sentences, scored on a continuous −1 to +1 scale. | −1 to +1 |
| Competitor Citations | The set of named competitor brands appearing in the same response as the measured brand, with frequency. | named set |
| Recommendation Strength | Tier 1 (actively recommended) → Tier 4 (mentioned but not recommended), classified by a defined rubric over the recommendation phrasing. | tier 1–4 |
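The three programmatic signals in the table (mention rate, rank position, competitor citations) can be sketched as simple text extraction over logged responses. The example below is a minimal illustration: the function names, the whole-word case-insensitive matching rule, and the `1.` / `1)` list-item pattern are assumptions, and a production extractor would need to handle brand aliases and other list formats.

```python
import re
from collections import Counter

# Matches numbered list items such as "1. Acme" or "2) Acme" (assumed format).
_LIST_ITEM = re.compile(r"^\s*(\d+)[.)]\s+(.*)$")


def mentions(brand: str, text: str) -> bool:
    """True if the brand appears as a whole word, case-insensitively."""
    return re.search(rf"\b{re.escape(brand)}\b", text, re.IGNORECASE) is not None


def mention_rate(brand: str, responses: list) -> float:
    """Percentage of responses that name the brand anywhere."""
    if not responses:
        return 0.0
    return 100.0 * sum(mentions(brand, r) for r in responses) / len(responses)


def rank_in_list(brand: str, text: str):
    """Ordinal of the first numbered list item naming the brand, else None."""
    for line in text.splitlines():
        m = _LIST_ITEM.match(line)
        if m and mentions(brand, m.group(2)):
            return int(m.group(1))
    return None


def average_rank(brand: str, responses: list):
    """Mean list position over responses where the brand is listed."""
    ranks = [r for r in (rank_in_list(brand, t) for t in responses) if r is not None]
    return sum(ranks) / len(ranks) if ranks else None


def competitor_citations(brand: str, competitors: list, responses: list) -> Counter:
    """Count competitors appearing in the same response as the brand."""
    counts = Counter()
    for text in responses:
        if mentions(brand, text):
            counts.update(c for c in competitors if mentions(c, text))
    return counts
```

Note that `average_rank` only averages over responses where the brand appears in an enumerated list, matching the "when mentioned in an enumerated list" condition in the definition. Responses that mention the brand only in running prose count toward mention rate but not toward rank.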
Validation
Mention rate, rank position, and competitor citations are extracted programmatically. Because the extraction is deterministic, re-running it on the same transcript yields the same value, so inter-rater agreement is effectively 1.0 by construction. Sentiment and recommendation strength are subjective signals; both are double-coded, once by an automated classifier and once by a human reviewer on a stratified sample. Disagreements are reviewed and resolved.
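One common way to quantify agreement between two coders is Cohen's kappa, which corrects raw agreement for the agreement expected by chance. The sketch below applies it to recommendation-strength tiers. The protocol does not state which agreement statistic Valoh uses, so the choice of kappa and the function names here are assumptions for illustration only.

```python
from collections import Counter


def cohens_kappa(coder_a: list, coder_b: list) -> float:
    """Cohen's kappa for two equal-length sequences of categorical labels."""
    if len(coder_a) != len(coder_b) or not coder_a:
        raise ValueError("both coders must label the same non-empty sample")
    n = len(coder_a)
    # Observed agreement: share of items given the same label by both coders.
    observed = sum(a == b for a, b in zip(coder_a, coder_b)) / n
    # Chance agreement: product of each coder's label frequencies, summed.
    freq_a, freq_b = Counter(coder_a), Counter(coder_b)
    expected = sum(freq_a[k] * freq_b[k] for k in freq_a.keys() | freq_b.keys()) / (n * n)
    if expected == 1.0:
        return 1.0  # both coders used a single identical label throughout
    return (observed - expected) / (1.0 - expected)


def disagreements(coder_a: list, coder_b: list) -> list:
    """Indices of items the two coders labelled differently, for review."""
    return [i for i, (a, b) in enumerate(zip(coder_a, coder_b)) if a != b]


# Hypothetical tier labels (1-4) for eight sampled responses.
classifier = [1, 1, 2, 3, 4, 4, 2, 1]
reviewer = [1, 2, 2, 3, 4, 3, 2, 1]
print(round(cohens_kappa(classifier, reviewer), 3))  # 0.667
print(disagreements(classifier, reviewer))           # [1, 5]
```

The indices returned by `disagreements` are the items that would go to the review-and-resolve step described above.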
What Valoh deliberately does not do
We do not tell you what to write, what to publish, how to structure your content, or what to ask the model. We tell you what the model said.
Known Limitations
Early Access