How to monitor brand visibility in AI tools (and what the data actually tells you)

2026-02-25 · Sofian Bettayeb

Monitoring brand visibility in ChatGPT, Claude, and Perplexity is now a real part of brand management. Here's how to do it, what the numbers mean, and why the honest limitations of AI monitoring tools don't make them less useful.

Brand monitoring used to mean Google Alerts, social listening, and the occasional media scan. That's still useful. But there's a growing channel where brands are either present or absent, and most brand managers have no visibility into it: AI answers.

When someone asks ChatGPT "which accounting software is best for freelancers," they get a curated answer. That answer either includes your brand or it doesn't. It either describes your product accurately or it doesn't. There's no way to know which unless you're tracking it.

Here's how to track it properly, and what the data actually means.

Why monitoring matters now, not later

AI models update. This is the part that catches most people off guard.

A model that cites your brand favorably in December may be retrained by March with different training data. Competitor content that's published and cited widely in February can shift how your brand is represented by April. Citations aren't permanent placements. They're signals that change as the underlying data changes.

This is why a one-time audit isn't enough. Start with a free AI visibility audit to establish a baseline, then build ongoing monitoring on top of it to catch shifts when they happen, understand what's driving them, and act before the gap compounds.

The brands doing this well right now will have a meaningful head start when AI search becomes a normal part of how every organization measures its presence. The ones that wait until it's obviously important will spend months catching up on a moving target.

What to track and where

For the majority of brands, the three platforms that matter most right now are ChatGPT, Claude, and Perplexity. They use different underlying models, pull from different data sources, and weight citations differently. A brand that appears consistently in Perplexity may appear rarely in ChatGPT. Tracking just one gives you an incomplete picture.

For each platform, you're tracking four core metrics. For a full breakdown of what each means and which are directly tied to revenue, see AEO metrics that actually matter.

Visibility Score. What percentage of your target queries does the brand appear in? This is the headline metric. It compresses the complexity of AI visibility into a number that's easy to track over time and easy to explain to a client or stakeholder.

Sentiment. When the brand appears, how is it described? Positive, neutral, or concerning? Neutral is often the silent problem. A brand described as "one option to consider" when competitors are described with specifics and enthusiasm has a sentiment gap that's just as real as an absence.

Citation position. Is the brand the first recommendation, a secondary option, or a brief mention? This matters because AI users often act on the first recommendation without reading further. Being present but consistently third lowers the effective value of that visibility.

Competitive Share. Out of all brand mentions in your tracked queries, what percentage goes to you versus competitors? This is the context that makes Visibility Score meaningful. A 20% Visibility Score means something different when you lead the category than when your main competitor sits at 65%.
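As a rough illustration, here is how Visibility Score and Competitive Share could be computed from logged answers. This is a minimal Python sketch, not any tool's official formula. The record layout (`query`, `brands_mentioned`) and the brand names are hypothetical, and it assumes Visibility Score counts a query as a hit whenever the brand is mentioned at all.

```python
from collections import Counter

# Hypothetical log: one entry per tracked query, listing the brands named
# in the AI answer. Brand names and queries are placeholders.
answers = [
    {"query": "best accounting software for freelancers", "brands_mentioned": ["Acme", "Ledgerly"]},
    {"query": "compare invoicing tools", "brands_mentioned": ["Ledgerly"]},
    {"query": "who provides bookkeeping for startups", "brands_mentioned": ["Acme", "Ledgerly", "Numbro"]},
    {"query": "cheapest tax tool for contractors", "brands_mentioned": []},
]


def visibility_score(answers, brand):
    """Percentage of tracked queries whose answer mentions the brand."""
    if not answers:
        return 0.0
    hits = sum(1 for a in answers if brand in a["brands_mentioned"])
    return 100.0 * hits / len(answers)


def competitive_share(answers, brand):
    """The brand's percentage of all brand mentions across tracked queries."""
    mentions = Counter(b for a in answers for b in a["brands_mentioned"])
    total = sum(mentions.values())
    return 100.0 * mentions[brand] / total if total else 0.0


print(f"Visibility Score: {visibility_score(answers, 'Acme'):.1f}%")
print(f"Competitive Share: {competitive_share(answers, 'Acme'):.1f}%")
```

In this made-up sample, Acme appears in 2 of 4 answers but holds only 2 of 6 total mentions, which is the kind of gap Competitive Share is meant to surface.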

How to run the tracking manually

For a small number of clients or a single brand, manual tracking is feasible. The process:

Write a list of 15 to 25 target queries. These are the questions your potential customers actually ask when they're evaluating options in your category: not brand-name lookups, but category queries such as "best tool for X," "compare X and Y," and "who provides X for Z." For a data-driven approach to building this list from Google Search Console (GSC), Reddit, and sales call transcripts, see How to create your first AEO topics and prompts.

Run each query in ChatGPT, Claude, and Perplexity. Record whether the brand appears, in what position, with what sentiment.

Run the full set twice each month, on different days, to account for response variability. AI answers aren't deterministic. The same prompt can produce different outputs at different times.

Log the results in a spreadsheet. Calculate Visibility Score and Competitive Share. Note any changes from the previous month.
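If you'd rather keep the log as a CSV than a hand-edited spreadsheet, the steps above could be sketched like this in Python. The column names, the sample rows, and the choice to pool both passes into one percentage are assumptions for illustration. An in-memory buffer stands in for a real file so the example is self-contained.

```python
import csv
import io
from collections import defaultdict

# Hypothetical column layout for the tracking log.
FIELDS = ["month", "pass_no", "platform", "query", "appears", "position", "sentiment"]


def make_row(pass_no, platform, appears, position="", sentiment=""):
    """Build one log row for a single query check (sample values only)."""
    return {
        "month": "2026-02", "pass_no": pass_no, "platform": platform,
        "query": "best tool for X", "appears": int(appears),
        "position": position, "sentiment": sentiment,
    }


# Two passes on different days, three platforms, one query.
rows = [
    make_row(1, "ChatGPT", True, 1, "positive"),
    make_row(2, "ChatGPT", False),
    make_row(1, "Claude", True, 2, "neutral"),
    make_row(2, "Claude", True, 3, "neutral"),
    make_row(1, "Perplexity", False),
    make_row(2, "Perplexity", False),
]


def write_log(rows, handle):
    """Write the rows as CSV with a header line."""
    writer = csv.DictWriter(handle, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(rows)


def visibility_by_platform(handle):
    """Percentage of logged checks per platform in which the brand appeared."""
    seen = defaultdict(list)
    for row in csv.DictReader(handle):
        seen[row["platform"]].append(int(row["appears"]))
    return {platform: 100.0 * sum(v) / len(v) for platform, v in seen.items()}


buffer = io.StringIO()  # swap for open("log.csv", "w", newline="") in practice
write_log(rows, buffer)
buffer.seek(0)
scores = visibility_by_platform(buffer)
print(scores)
```

Keeping one row per check, rather than one row per query, preserves both passes, so the variability between days stays visible in the raw data.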

This takes around 3 to 4 hours per client per month for 25 queries across 3 platforms. It's manageable for one client. For five clients on a monitoring retainer, it's 15 to 20 hours of work every month before you've done anything with the data.
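The arithmetic behind that estimate can be checked with a short Python sketch. It assumes the two passes from the step above are included in the 3 to 4 hours, which the figures here don't state explicitly. The function names are illustrative.

```python
def monthly_checks(queries, platforms, passes):
    """Number of individual prompt checks for one client in one month."""
    return queries * platforms * passes


def monthly_hours(clients, hours_per_client=(3, 4)):
    """Low and high estimate of total hours across clients."""
    low, high = hours_per_client
    return clients * low, clients * high


checks = monthly_checks(queries=25, platforms=3, passes=2)
print(f"{checks} checks per client per month")

# Implied time budget per check, in seconds, at 3 and at 4 hours.
for hours in (3, 4):
    print(f"{hours} h -> {hours * 3600 / checks:.0f} s per check")

for clients in (1, 5):
    low, high = monthly_hours(clients)
    print(f"{clients} client(s): {low} to {high} hours")
```

Under those assumptions, each check gets a little over a minute to run, read, and record, which leaves no slack once the client count grows.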

Why manual tracking doesn't scale

The practical ceiling for manual AI monitoring is around 2 clients before it starts affecting margin or quality. At that point, you're either undercharging to keep it affordable or overcharging for work that's mostly repetitive data collection.

A proper monitoring tool automates the prompt runs, records results across platforms, and calculates the metrics automatically. What's left for you is interpretation and client communication. Those are the parts where your judgment actually adds value.

AEO Copilot's Topics and Prompts features let you organize target queries by client, track Visibility Trend week over week, and see Competitive Share alongside Sentiment Score without running anything manually. The Visibility Trend chart shows direction of change, which is often what a client wants to see more than a raw number.

The honest part: what monitoring can't tell you

I wrote about this in more depth in a LinkedIn article on AEO tools limitations. Here's the condensed version.

AI answer engines generate text through stochastic token prediction, not deterministic ranking, so there are real limitations that no monitoring tool fully solves:

Inconsistent answers. The same prompt produces different outputs across sessions. Monitoring averages across runs to reduce noise, but variability is real.

No native analytics. Unlike Google Search Console, AI platforms don't provide impression or click data. Everything we measure is inferred from prompt sampling.

Personalization effects. Some AI platforms personalize responses based on user history. What one user sees may differ from what another sees asking the same query.

Model variation. Different model tiers (GPT-4o versus GPT-4o mini, for example) and different modes (web browsing on or off) produce different outputs for the same prompt.

Localization. A brand visible in English-language responses may be absent in French or German responses for the same query.

Data source opacity. We can see what AI models say, but usually not which sources shaped the answer. Even where a platform shows source links, as Perplexity does, the path from content to citation is genuinely opaque.

As I put it when writing that piece: "AEO tools don't give certainty. They give direction." That framing matters. Monitoring tells you whether visibility is trending up or down, whether sentiment is shifting, whether a competitor is gaining share. That's actionable. It's just not a precision instrument.
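One way to see why sampled prompts give direction rather than precision is to put a rough margin around a measured appearance rate. The Python sketch below uses the standard normal approximation for a proportion. Treating each check as an independent sample is an assumption that real prompt runs only loosely satisfy, so read the margin as a gauge of noise, not a guarantee. The function name and the sample numbers are illustrative.

```python
import math


def appearance_rate_with_margin(appearances, checks, z=1.96):
    """Return (rate, margin), both in percentage points.

    Uses the normal approximation to a binomial proportion;
    z=1.96 corresponds to a roughly 95% interval.
    """
    if checks <= 0:
        raise ValueError("checks must be positive")
    p = appearances / checks
    standard_error = math.sqrt(p * (1 - p) / checks)
    return 100.0 * p, 100.0 * z * standard_error


# Example: the brand appeared in 20 of 50 checks (25 queries, two passes).
rate, margin = appearance_rate_with_margin(appearances=20, checks=50)
print(f"{rate:.0f}% +/- {margin:.1f} points")
```

With 50 checks, a measured 40% carries a margin of roughly 14 points either way, so a move of a few points in one month is probably noise while a sustained trend over several months is more likely real.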

What the data should trigger

Good monitoring data answers three questions on a recurring basis:

Is overall visibility moving? If Visibility Score drops significantly over a month, something changed. The brand's presence in AI training data may have shifted, competitors may have published more authoritative content, or the model may have updated and weighted signals differently. The drop is the signal to investigate.

Is sentiment changing? A positive-to-neutral shift on a key product query usually points to a content problem: the model doesn't have enough specific, favorable information to draw from. The fix is to create that content.

Who is gaining? If Competitive Share drops while Visibility Score holds steady, competitors are appearing more often in the same answers. That's a different problem from pure absence, and requires a different response.

Monthly monitoring reports built around these three questions are easy to explain, easy to act on, and easy to justify as a retainer service.
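The three questions can also be written down as simple month-over-month checks. The Python sketch below is one hypothetical way to do that. The thresholds are placeholders to tune per client, since "significantly" isn't defined here, and `positive_rate` is an assumed stand-in for however you score sentiment.

```python
from dataclasses import dataclass


@dataclass
class Snapshot:
    visibility: float         # Visibility Score, in percent
    competitive_share: float  # Competitive Share, in percent
    positive_rate: float      # percent of appearances described positively (assumed metric)


# Hypothetical thresholds, in percentage points.
VISIBILITY_DROP = 10.0
SENTIMENT_DROP = 10.0
SHARE_DROP = 5.0


def triage(prev, curr):
    """Return a list of flags describing which question needs attention."""
    flags = []
    visibility_change = curr.visibility - prev.visibility
    if -visibility_change >= VISIBILITY_DROP:
        flags.append("visibility: investigate what changed this month")
    if prev.positive_rate - curr.positive_rate >= SENTIMENT_DROP:
        flags.append("sentiment: likely a content gap on key queries")
    share_dropped = prev.competitive_share - curr.competitive_share >= SHARE_DROP
    if share_dropped and abs(visibility_change) < VISIBILITY_DROP:
        flags.append("share: competitors gaining in the same answers")
    return flags


last_month = Snapshot(visibility=40.0, competitive_share=30.0, positive_rate=60.0)
this_month = Snapshot(visibility=38.0, competitive_share=22.0, positive_rate=58.0)
for flag in triage(last_month, this_month):
    print(flag)
```

Here visibility and sentiment barely move while share falls by 8 points, so only the share flag fires: the brand is still present, but competitors are showing up alongside it more often.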

Turning monitoring into a retainer

The monitoring retainer is the most natural revenue model for this service. The work is recurring, the tool does the data collection, and the client value compounds over time as you accumulate trend data.

For agencies, the pitch is: you're not just delivering a monthly report. You're building an ongoing picture of how the brand is perceived by AI systems, and acting on it before it becomes a problem.

That picture, accumulated over a year, is genuinely valuable. It shows what changed, when, what you did about it, and whether it worked. Few agencies can offer that kind of documented, measurable history on an emerging channel.

For more on the full audit methodology that precedes a monitoring engagement, see AI visibility audit for agencies. For how to price and structure the retainer, see How agencies sell AI visibility.

For how agencies operationalize monitoring as a scalable service with the right tool, see AI visibility tool for agencies.