Methodology
Every night for 60 days, our Python pipeline (the same 10,000-query/night scraper that powers our SERP intelligence) fired 4,200 distinct prompts at four AI assistants: gpt-4o via the OpenAI API, claude-3.7-sonnet via Anthropic's API, perplexity-sonar-pro via Perplexity's API, and gemini-2.5-pro via Google's API. Each prompt was sent both with and without web-search grounding where the engine supported the toggle.
Verticals: SaaS, fintech, e-commerce, hospitality, healthcare, education, real estate. Languages: 40% English, 60% Thai, reflecting the markets we serve. We logged the full response, parsed citations (URL, domain, position in answer), and joined the dataset against our existing Google SERP scrape so we could compare AI citations against organic rankings for the same queries.
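As a rough illustration of the citation-parsing and join step, here is a minimal sketch. It assumes citations appear as plain URLs in the answer text and that the SERP scrape is available as a domain-to-rank mapping; `Citation`, `parse_citations`, and `join_with_serp` are hypothetical names for illustration, not the pipeline's actual code.

```python
import re
from dataclasses import dataclass
from urllib.parse import urlparse

# Assumption: citations show up as inline http(s) URLs in the logged answer.
URL_PATTERN = re.compile(r"https?://[^\s)\]>\"']+")


@dataclass(frozen=True)
class Citation:
    url: str
    domain: str
    position: int  # 1-based order of first appearance in the answer


def parse_citations(answer_text: str) -> list[Citation]:
    """Extract cited URLs in order of first appearance, skipping repeats."""
    seen: set[str] = set()
    citations: list[Citation] = []
    for match in URL_PATTERN.finditer(answer_text):
        url = match.group(0).rstrip(".,;:")  # drop trailing sentence punctuation
        if url in seen:
            continue
        seen.add(url)
        domain = (urlparse(url).hostname or "").lower()
        if domain.startswith("www."):
            domain = domain[4:]
        citations.append(Citation(url, domain, len(citations) + 1))
    return citations


def join_with_serp(citations: list[Citation], serp_rank_by_domain: dict[str, int]):
    """Pair each citation with its organic rank, or None if the domain is unranked."""
    return [(c, serp_rank_by_domain.get(c.domain)) for c in citations]
```

Engines that return citations as structured metadata would not need the regex step; only the domain normalization and the join would carry over.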
Total dataset: 378,000 cited URLs, deduped to ~24,000 unique domains. The 7 findings below are what survived a re-run of the analysis on a held-out 8-week sample we collected in March-April 2026, meaning these patterns held for at least 4 months across model updates.
Finding #1: Citation rate correlates with specificity, not authority
Domain Authority (or any equivalent backlink-based metric) is a weak predictor of AI citations. Specificity of claims is a strong predictor. Pages that contain explicit numerical assertions get cited 3.2x more often than pages making the same point with hedged language, after controlling for domain authority.
| Page type | Cited per 100 prompts |
|---|---|
| Generic listicle ("top 10 X") | 1.4 |
| Branded thought-leadership (no numbers) | 2.1 |
| Article with 1-2 numerical claims | 4.8 |
| Article with 5+ numerical claims, sourced | 11.3 |
| Original research with proprietary data | 23.7 |
The mechanism is intuitive: a generative model is reward-shaped to commit. A page that hedges ("can lead to," "may help," "often") provides nothing the model can drop into an answer. A page that says "reduces churn by 23% for SaaS companies with ARR <$5M, n=124" becomes the source of a number the model will use verbatim. The number creates the citation hook.
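To make "specificity" concrete, here is a minimal sketch that counts numerical assertions and hedges in a block of text. The regexes are assumptions for illustration: the hedge list contains only the three phrases quoted above, any digit run counts as a number, and `specificity_profile` is a hypothetical helper rather than the scoring used in the study.

```python
import re

# Assumption: any number (with optional thousands separators, decimals, or %)
# counts as one numerical claim.
NUMBER_PATTERN = re.compile(r"\d+(?:,\d{3})*(?:\.\d+)?%?")

# Only the hedges quoted in the text; a real audit would need a longer list.
HEDGE_PATTERN = re.compile(r"\b(?:can lead to|may help|often)\b", re.IGNORECASE)


def specificity_profile(text: str) -> dict[str, int]:
    """Count numerical claims and hedging phrases in a block of text."""
    return {
        "numbers": len(NUMBER_PATTERN.findall(text)),
        "hedges": len(HEDGE_PATTERN.findall(text)),
    }


if __name__ == "__main__":
    print(specificity_profile(
        "Reduces churn by 23% for SaaS companies with ARR <$5M, n=124."
    ))
```

A count like this says nothing about whether the numbers are sourced, so it can only serve as a first-pass filter.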
"If you can't replace 'often' with a number plus a sample size, your page is competing with 80% of the internet for citations — and losing."
Finding #2: Schema is necessary but not sufficient
71% of cited domains had Article, FAQPage, or HowTo schema. But — and this is critical — every domain in the top 100 of any vertical has schema now. Schema gets you parsed; what determines citation is structural clarity within the parsed content.
The structural pattern that wins: answer-first paragraphs. The H2 phrases the question your reader has, the first paragraph (1-2 sentences) answers it cleanly, and the supporting detail follows. This is the inverse of the typical brand blog structure ("In today's fast-paced world, businesses are increasingly..."), which buries the answer 3 paragraphs deep.
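The answer-first pattern can be checked mechanically. The sketch below assumes the article source is markdown with `## ` H2 headings and uses a naive sentence counter; `answer_first_report` is a hypothetical helper, and the 2-sentence limit comes from the "1-2 sentences" guideline above.

```python
import re

# Naive sentence boundary: ., ! or ? followed by whitespace or end of text.
# Abbreviations such as "e.g. " will be over-counted.
SENTENCE_END = re.compile(r"[.!?](?:\s|$)")


def answer_first_report(markdown_text: str, max_sentences: int = 2):
    """Return (heading, sentence_count, passes) for the first paragraph under each H2."""
    report = []
    lines = markdown_text.splitlines()
    i = 0
    while i < len(lines):
        if not lines[i].startswith("## "):
            i += 1
            continue
        heading = lines[i][3:].strip()
        i += 1
        # Skip blank lines between the heading and its first paragraph.
        while i < len(lines) and not lines[i].strip():
            i += 1
        paragraph = []
        while i < len(lines) and lines[i].strip() and not lines[i].startswith("#"):
            paragraph.append(lines[i].strip())
            i += 1
        count = len(SENTENCE_END.findall(" ".join(paragraph)))
        report.append((heading, count, 1 <= count <= max_sentences))
    return report
```

This only checks length, not whether the paragraph actually answers the heading's question; that part still needs an editor.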
We covered the schema-deployment side in detail in our multi-domain schema graph article — the cluster signal compounds with the per-page structure.
Finding #3: Thai-native content dominates Thai-language prompts
For Thai-language prompts, 78% of citations went to natively-authored Thai pages. Translated pages — even from .com domains with strong global authority — lost to small Thai-language competitors with one-tenth the backlinks. The signal isn't just translation quality; it's register: do you use the casual particles Thais actually use online (นะคะ, ครับ, แหละ)? Are your examples local (PromptPay, BTS, 7-Eleven) or generic? Do prices come in Thai-relevant ranges (฿ not USD)?
For multinational brands operating in Thailand, this is brutal. The default workflow is to write in English and translate. The data says: kill that workflow. Hire Thai writers, brief them in English, let them author natively. Even cheap native authoring beats good translation. Our partner SitPlay Media handles native Thai authoring as a default — it's why we co-deliver with them rather than try to handle content in-house.
Finding #4: Citations cluster in 8-15 domains per topic
For any given topic — say, "SaaS pricing strategy Thailand" — the four AI engines collectively cite from a small set of domains. Across our 4,200-prompt set, the median topic had 11 distinct domains in the citation pool. Topics with 20+ citation domains were rare and almost always commodity verticals (e.g., generic "what is X" definitions where many encyclopedic sources qualify).
| Vertical | Median domains cited | Top-1 domain share |
|---|---|---|
| SaaS | 9 | 23% |
| Fintech | 13 | 18% |
| E-commerce | 14 | 16% |
| Hospitality | 11 | 21% |
| Healthcare | 8 | 31% |
| Education | 10 | 26% |
Translation: AEO is winner-take-most. Once you're in the cluster, you compound — every time the model picks you, your weight in future training (or grounding) increases. Once you're out, you're invisible. This concentration is tighter than what we see in organic search rankings, where the long tail can pull traffic for niche queries. AI assistants don't have a long tail; they have a head and silence.
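For readers who want to compute the same two metrics on their own citation logs, here is a minimal sketch. It assumes the log has already been reduced to a mapping from topic to the list of cited domains (one entry per citation); `topic_concentration` and `median_distinct_domains` are hypothetical names.

```python
from collections import Counter
from statistics import median


def topic_concentration(citations_by_topic: dict[str, list[str]]) -> dict[str, dict]:
    """Per topic: number of distinct cited domains and the top domain's citation share."""
    stats = {}
    for topic, domains in citations_by_topic.items():
        counts = Counter(domains)
        if not counts:
            continue  # topic with no citations at all
        top_domain, top_count = counts.most_common(1)[0]
        stats[topic] = {
            "distinct_domains": len(counts),
            "top1_domain": top_domain,
            "top1_share": top_count / len(domains),
        }
    return stats


def median_distinct_domains(stats: dict[str, dict]) -> float:
    """Median size of the citation pool across topics."""
    return median(s["distinct_domains"] for s in stats.values())
```

Grouping topics by vertical before calling `median_distinct_domains` would give per-vertical figures in the shape of the table above.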
Finding #5: Freshness matters more than 2025 papers suggested
For time-sensitive topics (pricing, regulations, "best of 2026," product comparisons), AI engines de-weight content older than ~14 months. We see this most clearly in Perplexity, which heavily uses recency as a ranking signal, but it's also present in ChatGPT and Gemini grounded responses.
| Content age | Citation rate (relative to fresh) |
|---|---|
| 0-3 months | 1.00x |
| 3-9 months | 0.91x |
| 9-14 months | 0.74x |
| 14-24 months | 0.42x |
| 24+ months | 0.18x |
The implication: refresh schedules need to be in your CMS, not run as quarterly projects. Tag articles with a review_due field on publish. Route reviews through the editorial calendar automatically. Treat your top-20 articles like products with maintenance budgets.
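A minimal sketch of what that tagging could look like follows. The relative rates are copied from the table above; the bucket boundaries (an article aged exactly 3 months falls in the older bucket) and the 90-day and 270-day review intervals are assumptions for illustration, as are the function names.

```python
from bisect import bisect_right
from datetime import date, timedelta

# Upper edges of the age buckets, in months, and the relative rate per bucket.
AGE_EDGES_MONTHS = [3, 9, 14, 24]
RELATIVE_RATE = [1.00, 0.91, 0.74, 0.42, 0.18]


def relative_citation_rate(age_months: float) -> float:
    """Look up the relative citation rate for a given content age."""
    return RELATIVE_RATE[bisect_right(AGE_EDGES_MONTHS, age_months)]


def review_due(published: date, is_top_performer: bool = False) -> date:
    """Compute the review_due value to store on publish.

    Assumption: roughly quarterly (90 days) for top performers,
    roughly 9 months (270 days) for everything else.
    """
    days = 90 if is_top_performer else 270
    return published + timedelta(days=days)


def overdue(articles: list[dict], today: date) -> list[dict]:
    """Return articles whose review_due date has passed."""
    return [a for a in articles if a["review_due"] <= today]
```

In a real CMS the `review_due` value would be a field on the article record and `overdue` would be a scheduled query feeding the editorial calendar.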
Finding #6: Forum and community content punches above its weight
For long-tail Thai prompts, Pantip and Reddit threads were cited at 2.4x the rate their domain authority would predict. Why? Forums encode E-E-A-T signals — real users, dated discussions, contradiction and resolution — that brand pages systematically lack. They also have the structural pattern AI engines reward: a question, an answer, follow-ups that refine the answer.
You can't easily compete with Pantip on its own turf, but you can borrow the form: real customer questions, dated answers, named experts, contradictory evidence acknowledged and resolved. Brand pages that mimic this form get cited 2-3x more than brand pages that don't.
Finding #7: Engine-specific quirks
The four engines aren't interchangeable. Each has tendencies worth knowing:
- ChatGPT with browsing weights brand recognition heaviest. Big-brand sources beat better-data smaller sources roughly 55/45.
- Claude rewards specificity and academic-style sourcing the most. The closest of the four to "best content wins."
- Perplexity is the most recency-biased. Articles >6 months old get cited ~40% as often.
- Gemini heavily favors Google-property sources and has a measurable bias toward Wikipedia (9.2% of all citations land on a Wikipedia URL).
For Bangkok brands, the practical takeaway is: optimize for Claude and Perplexity first (they're the most "merit-based"), and ChatGPT will follow over time as your content earns brand mentions in newer training cuts. Gemini is the hardest to influence directly; the best lever is structured data and Wikipedia notability if you can earn it.
What we're doing differently as a result
- Every brief now requires a numerical claim per H2. No "data coming." No "research suggests." Specific number with sample size, or it doesn't ship.
- We've killed translation as a default workflow. Thai content is authored in Thai; English content is authored in English. No exceptions.
- Refresh cycles dropped from 18 months to 9. Top-performing articles get a quarterly micro-update.
- Schema is mandatory but no longer celebrated. The bar moved.
- We're publishing more original-data drops (like this article). Proprietary numbers are the highest-value citation bait we've found — and it lines up with our existing scraper-driven workflow.
What this means for your 2026 plan
If you're treating AEO as "SEO with extra steps," you're going to keep losing. The disciplines overlap, but the citation surface is narrower (8-15 domains per topic), the specificity bar is higher (numbers beat adjectives every time), and the language sensitivity in Thai is brutal.
Three actions for the next quarter:
- Audit your top 20 pages against the specificity test. Replace every "many," "often," "significantly," "most," "typically" with a number plus a source — or kill the sentence. We do this audit as part of every retainer; it's also available as a one-shot via our technical SEO services.
- Run the schema graph across your domains if you have more than one. Our 3-rule system delivered a 3.4x AEO citation lift on our own cluster in 7 weeks.
- Tag articles with a review-due date in your CMS. Refresh top performers quarterly. Articles >14 months old without updates fall off the citation cliff.
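The specificity audit in the first action can be partly automated. The sketch below flags sentences that use one of the five vague words listed above and contain no digit at all. That pass/fail rule is a deliberate simplification, and `flag_vague_sentences` is a hypothetical helper, not the audit tooling itself.

```python
import re

# The five words named in the audit step above.
VAGUE_WORDS = re.compile(r"\b(?:many|often|significantly|most|typically)\b", re.IGNORECASE)
HAS_NUMBER = re.compile(r"\d")
# Naive split: whitespace that follows ., ! or ?
SENTENCE_SPLIT = re.compile(r"(?<=[.!?])\s+")


def flag_vague_sentences(text: str) -> list[tuple[str, list[str]]]:
    """Return (sentence, vague_words) for sentences with vague words and no number."""
    flagged = []
    for sentence in SENTENCE_SPLIT.split(text.strip()):
        words = [w.lower() for w in VAGUE_WORDS.findall(sentence)]
        if words and not HAS_NUMBER.search(sentence):
            flagged.append((sentence, words))
    return flagged


if __name__ == "__main__":
    sample = (
        "Most teams often see gains. "
        "Churn fell 23% for most accounts (n=124). "
        "Pricing pages convert well."
    )
    for sentence, words in flag_vague_sentences(sample):
        print(words, "->", sentence)
```

A sentence that pairs a vague word with an unrelated number will slip through, so the output is a shortlist for an editor, not a verdict.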
The infrastructure side — fast pages so AI crawlers actually fetch your content, clean log signals so Googlebot doesn't waste budget — is covered in our INP guide and log-forensics writeup. If you're scaling content programmatically, the quality-gate system keeps you out of manual-penalty territory while you grow.
What we don't believe (yet)
A few claims floating around in the AEO discourse that our data doesn't support:
- "You need to publish on Wikipedia." Helpful but not necessary; ~60% of cited domains in our dataset have no Wikipedia presence.
- "AI prefers long-form content (3,000+ words)." Citation rate plateaus around 1,200-1,800 words. Beyond that, marginal returns.
- "You should publish in JSON or structured-data-only formats." No measurable lift, sometimes hurts because human-readable context disappears.
- "AI engines penalize affiliate content." They don't. They penalize thin content. Affiliate articles with original testing get cited normally.
If you want a 30-minute review of your top 20 pages against the framework above — citation eligibility, specificity score, freshness debt — email us. We'll tell you honestly which pages are citation-eligible and which need to be rewritten or killed.
Related reading
This article is the AEO half of our quarterly research output. The technical-SEO half lives in the schema graph guide, the INP optimization guide, and the log-forensics writeup. The scaling discipline is in our programmatic-quality system. Together they're the engineering stack we ship to every client. SitPlay handles the editorial side; Bluewich handles the dev side; Bangkok Digital handles CRO testing post-launch.