Disclosure: Some links below are affiliate links. We may earn a commission at no cost to you. How we review.

The DIY ChatGPT and Perplexity Brand Audit: A 2026 GEO Methodology

A solo-friendly methodology to measure citation rate, sentiment, and share of voice across ChatGPT, Claude, and Perplexity in under three hours.

Gartner projects that 25% of organic search volume will move to AI chatbots and virtual agents by 2026. Run that against your own funnel. If your category receives 50,000 monthly branded and non-branded queries and 25% migrates to ChatGPT or Perplexity, that is 12,500 monthly sessions you stop owning unless your brand is cited inside the model output. At a 2% conversion rate and a $200 average order value, the math is roughly $50,000 of monthly revenue at risk per category. That is the number that justifies the next three hours of your time.

The second number that matters: SparkToro's 2024 click-stream data shows 65% of Google searches now end without a click. Layer AI Overviews on top, and the funnel narrows further. Brand visibility inside the LLM answer is the new featured snippet, except there is no SERP to bid on. You either get cited or you do not.

This guide is the methodology I run for my own clients to answer one question: does my brand show up in ChatGPT and Perplexity, and is the citation accurate? You will run 12 prompts across three engines, score four metrics, and walk away with a prioritized fix list. No paid tools required for the first pass.

What you'll learn

A 12-prompt audit set, a four-metric scoring framework (Citation Rate, Sentiment, Accuracy, Share of Voice), the exact fixes that move citations within 60-90 days, and a monthly monitoring cadence that scales from solo to paid tools like Profound and Otterly.

What you need before starting the audit

This is a methodology for operators who already have something to audit. If you are pre-launch, build the brand first. The audit assumes:

  • A live brand with at least 12 months of public content (website, social, third-party mentions)
  • A defined target market and a list of 3-5 named competitors you would expect to share answers with
  • A free or paid account on ChatGPT (GPT-4 or higher), Claude (Sonnet or Opus), and Perplexity (free works, Pro is better for sourcing)
  • A spreadsheet (Google Sheets, Airtable, or a Notion database if you want the audit log to live alongside your content calendar)
  • Roughly 2-3 hours of focused time per audit cycle

If you do not have named competitors yet, stop here. Open Perplexity, type "top [your category] tools 2026," and copy the first five names that appear. Those are your competitors whether you like them or not, because that is what the model is telling buyers right now. The same dynamic behind AI mode reshaping search applies inside chat: the model has already decided your competitive set.

Step 1: Build your 12-prompt audit set

The 12-prompt set covers four categories with three prompts each. The categories are designed to surface different facets of brand visibility. Direct brand prompts test whether the model knows you exist and gets the facts right. Category recommendation prompts test whether you appear in unbiased "best of" answers. Problem-solution prompts test whether the model surfaces you when a buyer describes a pain point in their own words. Comparison prompts test how the model handles head-to-head positioning against your top competitors.

Replace the bracketed placeholders with your actual brand, category, and competitors before running.

DIRECT BRAND (3)

  1. What is [Your Brand] and who is it for?
  2. What are the pros and cons of using [Your Brand]?
  3. How much does [Your Brand] cost and what’s included in each tier?

CATEGORY RECOMMENDATION (3)

  1. What are the best [your category] tools for [target customer] in 2026?
  2. Which [your category] platform has the best [key feature] for solopreneurs?
  3. Recommend a [your category] solution under $[price ceiling] per month.

PROBLEM-SOLUTION (3)

  1. I’m a [target persona] struggling with [specific pain point]. What tool should I use?
  2. How do I [job-to-be-done that your product solves] without hiring an agency?
  3. What’s the fastest way to [primary outcome your product delivers] for a small team?

COMPARISON (3)

  1. [Your Brand] vs [Top Competitor 1]: which is better for [use case]?
  2. Is [Your Brand] worth it compared to [Top Competitor 2]?
  3. Should I switch from [Top Competitor 3] to [Your Brand]?
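The full 12-prompt set can be generated from templates so each monthly cycle uses identical phrasing. A minimal sketch; every slot value below is a placeholder to replace with your own brand, category, and competitors:

```python
# Build the 12-prompt audit set from templates.
# All slot values are illustrative placeholders -- swap in your own.
slots = {
    "brand": "Acme Analytics",
    "category": "marketing analytics",
    "persona": "solopreneur",
    "pain": "attributing revenue to content",
    "job": "track which posts drive signups",
    "outcome": "produce a weekly attribution report",
    "feature": "revenue attribution",
    "ceiling": "50",
    "use_case": "a one-person team",
    "c1": "CompetitorA", "c2": "CompetitorB", "c3": "CompetitorC",
}

templates = {
    "direct": [
        "What is {brand} and who is it for?",
        "What are the pros and cons of using {brand}?",
        "How much does {brand} cost and what's included in each tier?",
    ],
    "category_rec": [
        "What are the best {category} tools for {persona} in 2026?",
        "Which {category} platform has the best {feature} for solopreneurs?",
        "Recommend a {category} solution under ${ceiling} per month.",
    ],
    "problem_solution": [
        "I'm a {persona} struggling with {pain}. What tool should I use?",
        "How do I {job} without hiring an agency?",
        "What's the fastest way to {outcome} for a small team?",
    ],
    "comparison": [
        "{brand} vs {c1}: which is better for {use_case}?",
        "Is {brand} worth it compared to {c2}?",
        "Should I switch from {c3} to {brand}?",
    ],
}

# Four categories x three prompts = the 12-prompt set.
audit_set = [(cat, t.format(**slots))
             for cat, ts in templates.items() for t in ts]
assert len(audit_set) == 12
```

Regenerating from the same template file each month keeps the audit comparable across cycles; edit the slots, never the templates.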

Tip: prompt as a buyer, not a marketer

Buyers ask messy, lowercase, half-formed questions. If your audit set reads like polished marketing copy, you are testing the wrong distribution. Run prompts 7-9 in the actual phrasing your customers use in support tickets and sales calls. The citation gap usually shows up in problem-solution prompts before anywhere else.

Step 2: Run the prompts across ChatGPT, Claude, and Perplexity

Open three browser tabs, one per engine. Run each of the 12 prompts in each engine, in fresh chat sessions, with no system prompt or memory bias. That is 36 total prompt runs. Budget 90 minutes if you record carefully.

For each run, record six fields in your spreadsheet:

  • Engine (ChatGPT, Claude, Perplexity)
  • Cited (yes/no): does your brand name appear in the answer body or citation list?
  • Position (1-N): if cited, where in the recommendation order?
  • Sentiment (positive/neutral/negative): how is the brand described?
  • Accuracy (correct/partial/wrong): are pricing, features, and positioning factually right?
  • Competitors mentioned (list): which competitor names appear in the same answer?
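If you prefer a script-generated scaffold over a hand-built sheet, the six fields map to one row per engine-prompt pair. A sketch assuming the 12-prompt set from step 1; field names and defaults are illustrative:

```python
from dataclasses import dataclass
from typing import List, Optional

ENGINES = ["ChatGPT", "Claude", "Perplexity"]

# One row per engine x prompt, mirroring the six columns above.
# Everything here is a scaffold to fill in by hand as you run the prompts.
@dataclass
class Run:
    engine: str
    prompt_id: int                 # 1-12, matching the audit set order
    cited: bool = False            # brand appears in answer body or citations
    position: Optional[int] = None # recommendation order if cited
    sentiment: str = "neutral"     # positive / neutral / negative
    accuracy: str = "correct"      # correct / partial / wrong
    competitors: str = ""          # semicolon-separated competitor names

rows: List[Run] = [Run(engine=e, prompt_id=p)
                   for e in ENGINES for p in range(1, 13)]
assert len(rows) == 36  # 12 prompts x 3 engines
```

Export the rows to CSV and the structure drops straight into Google Sheets, Airtable, or a Notion database.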
A few engine-specific notes. ChatGPT with browsing on pulls from Bing's index, so it favors SEO-strong pages. With browsing off, it pulls from training data, so it favors entity-strong brands with Wikipedia presence and structured data. Run each prompt twice in ChatGPT (browsing on, then off) and score separately. Claude does not browse by default in the free tier, so its citations are pure training-data signal. Perplexity always retrieves, and its sources are visible inline, which makes accuracy scoring trivial.

For competitor share of voice, count the number of times each competitor name appears across the 36 runs. If your top competitor shows up 22 times and you show up 4 times, that is your gap to close. The same principle covered in AI SEO tools that actually move citation rate applies here: track competitor mentions, not just your own.

Step 3: Score using the four-metric framework

Aggregate the raw data into four headline metrics. These are the numbers you will track month over month.

| Metric | Formula | Healthy Benchmark |
| --- | --- | --- |
| Citation Rate | (Prompts where you appear / 36) x 100 | 40%+ for established brands, 15%+ for sub-$1M brands |
| Sentiment Score | (Positive mentions x 1) + (Neutral x 0) + (Negative x -1), divided by total mentions | +0.5 or higher |
| Accuracy Rate | (Correct mentions / Total mentions) x 100 | 85%+ |
| Share of Voice | Your mentions / (Your mentions + Top 3 competitor mentions) | 25%+ in a 4-brand set |
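The four formulas reduce to a few lines of code once the 36 runs are logged. A minimal sketch, assuming each run is a dict with the fields from step 2 (`cited`, `sentiment`, `accuracy`, and a list of competitor names):

```python
def score(runs):
    """Aggregate raw audit rows into the four headline metrics.

    Each run: {"cited": bool, "sentiment": "positive"/"neutral"/"negative",
               "accuracy": "correct"/"partial"/"wrong",
               "competitors": [names seen in the same answer]}
    """
    total = len(runs)  # 36 for a full cycle
    cited = [r for r in runs if r["cited"]]

    citation_rate = 100 * len(cited) / total

    # (+1 / 0 / -1 per mention) divided by total mentions.
    sent_vals = {"positive": 1, "neutral": 0, "negative": -1}
    sentiment = (sum(sent_vals[r["sentiment"]] for r in cited) / len(cited)
                 if cited else 0.0)

    accuracy = (100 * sum(r["accuracy"] == "correct" for r in cited) / len(cited)
                if cited else 0.0)

    # Your mentions vs. your mentions plus all competitor mentions logged.
    comp_mentions = sum(len(r["competitors"]) for r in runs)
    denom = len(cited) + comp_mentions
    share_of_voice = 100 * len(cited) / denom if denom else 0.0

    return {"citation_rate": citation_rate, "sentiment": sentiment,
            "accuracy": accuracy, "share_of_voice": share_of_voice}
```

Running this on each monthly export gives you the four numbers to chart in step 5.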

Run the math. If your category has roughly 50,000 monthly queries shifting toward AI and your citation rate is 10%, you are visible in 5,000 of those. At a 2% conversion rate that is 100 conversions a month from AI surfaces. Push the citation rate to 30%, and the same traffic mix yields 300 conversions. At a $200 AOV, the delta is $40,000 of monthly revenue. That is the ROI math that justifies budget for the fix work in step 4.
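The revenue math above can be checked with a two-line model. The 50,000-query volume, 2% conversion rate, and $200 AOV are the article's illustrative assumptions, not universals:

```python
def ai_revenue(monthly_queries, citation_rate, conv_rate=0.02, aov=200):
    """Monthly revenue attributable to AI-surface visibility (simple model).

    Assumes queries where you are cited convert at conv_rate with the
    given average order value -- replace with your own funnel numbers.
    """
    return monthly_queries * citation_rate * conv_rate * aov

baseline = ai_revenue(50_000, 0.10)   # 5,000 visible -> 100 conversions
improved = ai_revenue(50_000, 0.30)   # 15,000 visible -> 300 conversions
assert improved - baseline == 40_000  # the $40K monthly delta
```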

Step 4: Fix the gaps with content and entity work

The audit produces a gap list. Map each gap to one of four fix categories.

Schema and entity work. Add Organization, Product, and FAQPage schema to your core pages. Get a Wikipedia entry if you qualify (notability is the bar, not vanity). Claim Wikidata. List your brand on Crunchbase, G2, Capterra, and Product Hunt with consistent NAP data. LLMs disambiguate entities by cross-referencing these sources. If you are missing from three of them, the model literally does not know who you are.
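A minimal Organization JSON-LD sketch, generated here in Python for clarity; every name and URL is a placeholder, and the `sameAs` array is where the cross-referenced profiles (Crunchbase, G2, Product Hunt) go:

```python
import json

# Minimal Organization schema (JSON-LD). All names and URLs are
# placeholders -- point them at your real brand pages and profiles.
org = {
    "@context": "https://schema.org",
    "@type": "Organization",
    "name": "Acme Analytics",
    "url": "https://example.com",
    "logo": "https://example.com/logo.png",
    "sameAs": [
        # The cross-referenced listings LLMs use to disambiguate the entity.
        "https://www.crunchbase.com/organization/acme",
        "https://www.g2.com/products/acme",
        "https://www.producthunt.com/products/acme",
    ],
}

# Embed in the page <head> as a script tag.
snippet = ('<script type="application/ld+json">'
           + json.dumps(org, indent=2)
           + "</script>")
```

Product and FAQPage schema follow the same pattern with `"@type": "Product"` and `"@type": "FAQPage"`; the key is that every listed profile carries identical name and URL data.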

Third-party presence. The model rewards corroboration. One Reddit thread, two YouTube reviews, three LinkedIn posts from non-employees, and a podcast appearance will move citation rate inside 90 days. Pay for placement if you have to. A $500 podcast sponsorship that generates a transcript indexed in training data is cheaper than $500 of paid search at this point.

Comparison content. Publish dedicated comparison pages for each of your top three competitors. Format: H2 with the literal "You vs Competitor" phrasing, a feature table, a pricing table, a "who should pick which" callout. The Claude marketing playbook covers the prompt structures that surface comparison content most often.

Authority content engine. The brands that win citation rate publish 2-4 substantive pieces a month, not weekly thin posts. Beehiiv as a content engine for AI authority works because newsletter archives become public, indexable, and citable. Pair that with HubSpot for CRM-tracked attribution so you can prove which AI-sourced visits convert.

Warning: do not buy paid GEO tools before fixing fundamentals

Profound raised $20M in March 2026 and prices accordingly. Otterly, AthenaHQ, and Peec AI all launched or raised in Q1 2026 with similar pricing pressure. These tools monitor and optimize citation rate at scale, but they cannot fix a brand with no Wikipedia entry, no schema, and no third-party mentions. Spend the first 90 days on entity work. Layer paid tools on top once your manual citation rate is above 20%.

For paid options when you are ready: Semrush now ships an AEO tracker inside its standard plan, which means you do not pay extra if you already use it for SEO. Read the full Semrush review for feature specifics. Profound and AthenaHQ are purpose-built and worth it once you cross $10K MRR. Profound's $20M Series A went largely toward expanding their prompt-coverage index, which is the moat that matters for share-of-voice tracking.

Step 5: Monitor monthly and adjust

Pick the first business day of each month. Re-run the same 12 prompts in the same engines. Log the four metrics into your tracking sheet. Build a simple line chart of citation rate and share of voice over time. The trend matters more than any single data point.

Automate the boring parts. Use Make to fire a monthly reminder, pull your tracking sheet into a Slack digest, and timestamp a snapshot. The Make review covers the GPT-4 module that can run prompts headlessly if you upgrade beyond the free tier. Zapier works as a swap-in if your team already lives there. Store every audit cycle in Notion so you can compare month four to month one without digging through email.

Escalate to paid tools when one of three things happens: your citation rate plateaus for two consecutive months, your share of voice falls behind a single competitor by more than 15 points, or you cross $10K MRR and the four hours of monthly audit time costs more than the tool subscription. Quick benchmark: if your blended labor rate is $75 per hour and you spend four hours per audit cycle, that is $300 per month of opportunity cost. A $499 per month tool that saves three of those four hours pays for itself the moment your citation rate moves five points and a single AI-sourced lead converts at $200 AOV.
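The three escalation triggers are mechanical enough to script. A sketch assuming you log monthly citation rates oldest-first; the plateau check treats two consecutive months without a gain as flat:

```python
def should_escalate(citation_history, my_sov, top_competitor_sov, mrr):
    """Return which of the three escalation triggers fired this month.

    citation_history: monthly citation rates in percent, oldest first.
    my_sov / top_competitor_sov: share-of-voice percentage points.
    mrr: monthly recurring revenue in dollars.
    """
    triggers = []
    # 1. Citation rate flat or down for two consecutive months.
    if (len(citation_history) >= 3
            and citation_history[-2] <= citation_history[-3]
            and citation_history[-1] <= citation_history[-2]):
        triggers.append("plateau")
    # 2. Share of voice trails a single competitor by more than 15 points.
    if top_competitor_sov - my_sov > 15:
        triggers.append("sov_gap")
    # 3. Past $10K MRR, audit labor likely exceeds the tool subscription.
    if mrr > 10_000:
        triggers.append("mrr")
    return triggers
```

Run it against the tracking sheet at the end of each cycle; an empty list means keep auditing manually.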

Common mistakes to avoid

  • Running prompts in your logged-in account with memory on. ChatGPT memory and Claude project context bias the answer. Use a fresh chat or incognito window every time.
  • Scoring once and calling it a baseline. Run each prompt three times and average. LLM outputs are non-deterministic, and a single run will mislead you on edge cases.
  • Ignoring negative or partial citations. A wrong pricing tier in a Perplexity answer is worse than no citation at all. Buyers act on what the model says. Track accuracy with the same rigor as citation rate.
  • Buying paid tools before fixing entity fundamentals. If you have no Wikipedia entry and no schema, no monitoring tool will save you. Sequence the work: entity, then content, then monitoring.
  • Treating GEO as separate from SEO. ChatGPT browsing pulls from Bing. Perplexity pulls from a hybrid index. Strong SEO is still a precondition for citation in retrieval-mode answers. Do not abandon your existing search work.
  • Skipping the share-of-voice math. A 30% citation rate looks healthy until you realize a single competitor is at 70%. Always score yourself against the named competitor set, not in isolation.

Frequently asked questions

How often should I re-run this audit?

Monthly for active categories, quarterly if your niche moves slower. LLM training cutoffs and retrieval indexes update on uneven cycles, so a 30-day cadence catches most drift before it costs you visits. If you push a major content release or PR hit, run an unscheduled spot audit two weeks later to see if citations moved.

Is a 12-prompt sample size statistically meaningful?

Twelve is the floor for directional signal, not academic certainty. Each prompt yields a binary citation outcome plus three qualitative scores, so you end up with 144 data points across three engines. That is enough to spot a 20-30% citation gap versus competitors. Scale to 30-50 prompts once you start spending on paid tools.

Should I pay for Profound, Otterly, or AthenaHQ instead of doing this manually?

Pay when manual time exceeds the tool cost. If your blended labor rate is $75/hour and the audit takes four hours monthly, that is $300/month of opportunity cost. Profound starts around $500/month, Otterly and Peec AI sit lower. Run manual audits for two cycles first so you know what good looks like before you delegate to software.

What if my brand has zero citations across all 12 prompts?

That is the most common starting state for sub-$5M brands. The fastest moves are: claim and edit your Wikipedia entity, publish a comparison page targeting your top three competitors, get cited in two third-party listicles, and post one technical Reddit answer in a relevant subreddit. Expect first citations within 60-90 days.

Does ChatGPT browsing change citation behavior versus the base model?

Yes, materially. Browsing-enabled answers pull from live retrieval and surface citations from Bing's index, so SEO-strong pages win. Base model answers pull from training data, so entity-strong brands win. Run each prompt twice, once with browsing on and once off, and score them separately.

Tools and resources

  • Semrush for AEO tracking bundled inside an existing SEO subscription
  • Notion as the audit log and monthly snapshot archive
  • HubSpot for CRM tracking of AI-sourced conversions
  • Beehiiv as a content engine that builds AI-citable authority
  • Make for automating monthly re-audits and Slack reporting
  • Zapier as a workflow alternative if your stack already runs there

Next steps

Block three hours on your calendar this week. Run the 12-prompt audit. Log the four metrics. Pick the single biggest gap and ship one fix in the next 14 days. Re-audit on day 30. The compounding starts when you stop debating the methodology and start running the loop. Browse the full Ea-Nasir AI tools directory for the platforms that show up most often in citation-rate wins.