tips · 2027-10-27 · 6 min read

Welcoming GPTBot, PerplexityBot, ClaudeBot in robots.txt: 2026 AI Crawler Allowlist

Author: thMenu

Want to appear in ChatGPT answers in 2026? Make sure your robots.txt does not block the 7 active AI crawlers, and ideally list each one with an explicit Allow. In the thMenu case, doing so was followed by a 3x visibility lift.

thMenu Team

thmenu.com

Eight months ago we rewrote thMenu's robots.txt to explicitly welcome every active AI crawler. The result: our citation rate in ChatGPT, Perplexity, and Claude answers tripled — especially on queries like "QR menu" and "restaurant digital menu."

The 7 Active AI Crawlers of 2026

These are the main user-agent tokens that AI vendors honor for LLM training and live retrieval. Block one of them and your pages can drop out of that vendor's training data or live answers.

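OpenAI: GPTBot (training) and OAI-SearchBot (ChatGPT live search)
Perplexity: PerplexityBot (crawler) and Perplexity-User (per-query fetch)
Anthropic: ClaudeBot and anthropic-ai
Google: Google-Extended (a robots.txt control token for AI use; Google's regular crawlers do the fetching)
Apple: Applebot-Extended (a robots.txt control token for AI training use; Applebot does the fetching)
Meta: FacebookBot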
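To see which of these crawlers actually visit a site, one option is to match the User-Agent strings in the server logs against the tokens above. The following is a minimal sketch, not thMenu's own tooling: the function names are hypothetical, and it assumes the user-agent strings have already been extracted from the log. Google-Extended and Applebot-Extended are left out because they are robots.txt control tokens and do not appear in request headers.

```python
from collections import Counter
from typing import Iterable, Optional

# Tokens that can appear in the User-Agent header of real requests.
# Google-Extended and Applebot-Extended are omitted on purpose: they are
# robots.txt control tokens, and the fetching is done by Googlebot/Applebot.
AI_UA_TOKENS = [
    "GPTBot",
    "OAI-SearchBot",
    "PerplexityBot",
    "Perplexity-User",
    "ClaudeBot",
    "anthropic-ai",
    "FacebookBot",
]


def classify_user_agent(user_agent: str) -> Optional[str]:
    """Return the first AI crawler token found in a User-Agent string, else None."""
    lowered = user_agent.lower()
    for token in AI_UA_TOKENS:
        if token.lower() in lowered:
            return token
    return None


def count_ai_hits(user_agents: Iterable[str]) -> Counter:
    """Tally requests per AI crawler token; other agents are ignored."""
    hits: Counter = Counter()
    for ua in user_agents:
        token = classify_user_agent(ua)
        if token is not None:
            hits[token] += 1
    return hits


if __name__ == "__main__":
    # Illustrative strings only, not verbatim crawler user agents.
    sample = [
        "Mozilla/5.0 (compatible; GPTBot/1.0)",
        "Mozilla/5.0 (compatible; ClaudeBot/1.0)",
        "Mozilla/5.0 (Windows NT 10.0) Firefox/120.0",
    ]
    print(count_ai_hits(sample))
```

User-Agent headers can be spoofed, so counts from a sketch like this are an estimate unless requests are also checked against the IP ranges each vendor publishes.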

The thMenu robots.txt Template

Give each crawler its own User-agent block containing Allow: /. Don't forget the Sitemap line: it tells crawlers where to find your full URL list, and ChatGPT's search agent appears to prioritize sitemap-listed URLs for indexing.

Even if you keep a wildcard "User-agent: *" group, a crawler that finds a group naming it follows that group and ignores the wildcard one. Writing per-bot blocks therefore signals intent and keeps these crawlers allowed if you ever flip the default to Disallow.
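The template described above can be sketched as a small generator plus a self-check. This is an illustration under assumptions, not thMenu's actual file: build_robots_txt and is_allowed are hypothetical helper names, and https://example.com/sitemap.xml is a placeholder sitemap URL. The check uses Python's standard urllib.robotparser, whose matching is simpler than what production crawlers implement.

```python
from typing import List
from urllib.robotparser import RobotFileParser

# The user-agent tokens listed in the article.
AI_CRAWLER_TOKENS = [
    "GPTBot",
    "OAI-SearchBot",
    "PerplexityBot",
    "Perplexity-User",
    "ClaudeBot",
    "anthropic-ai",
    "Google-Extended",
    "Applebot-Extended",
    "FacebookBot",
]


def build_robots_txt(tokens: List[str], sitemap_url: str, default_allow: bool = True) -> str:
    """Emit one Allow group per token, a wildcard default group, and a Sitemap line."""
    lines: List[str] = []
    for token in tokens:
        lines += [f"User-agent: {token}", "Allow: /", ""]
    # Default group for every crawler that is not named above.
    default_rule = "Allow: /" if default_allow else "Disallow: /"
    lines += ["User-agent: *", default_rule, ""]
    lines.append(f"Sitemap: {sitemap_url}")
    return "\n".join(lines) + "\n"


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Check a URL against robots.txt text using the standard-library parser."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # Placeholder sitemap URL; default flipped to Disallow to show the effect.
    text = build_robots_txt(AI_CRAWLER_TOKENS, "https://example.com/sitemap.xml", default_allow=False)
    print(text)
    print(is_allowed(text, "GPTBot", "https://example.com/blog/post"))
    print(is_allowed(text, "SomeOtherBot", "https://example.com/blog/post"))
```

With default_allow=False, the named crawlers stay allowed while any unnamed bot falls through to the wildcard Disallow, which is the protection the per-bot blocks are meant to provide.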

Measured Outcomes

thMenu's blog has 387 articles. Three months after the robots.txt change, ChatGPT references jumped from 1,200 to 3,600 per month (measured via a PostHog pixel and a share-URL parameter). Perplexity citations rose 180%, Claude.ai mentions 220%.
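One way to attribute visits to AI answers, in the spirit of the pixel-plus-URL-parameter measurement mentioned above, is to label each visit by its tracking parameter or referrer. The sketch below is an assumption-laden illustration rather than thMenu's PostHog setup: the hostname table, the utm_source parameter name, and the ai_source function are all hypothetical choices.

```python
from typing import Optional
from urllib.parse import parse_qs, urlparse

# Hypothetical mapping from hostnames to an AI source label.
AI_HOSTS = {
    "chatgpt.com": "chatgpt",
    "chat.openai.com": "chatgpt",
    "perplexity.ai": "perplexity",
    "www.perplexity.ai": "perplexity",
    "claude.ai": "claude",
}


def ai_source(landing_url: str, referrer: str = "") -> Optional[str]:
    """Label a visit as coming from an AI answer, or return None.

    A tracking parameter on the landing URL is checked first (assumed to be
    named utm_source), then the hostname of the HTTP referrer.
    """
    query = parse_qs(urlparse(landing_url).query)
    tagged = query.get("utm_source", [""])[0].lower()
    if tagged in AI_HOSTS:
        return AI_HOSTS[tagged]

    host = (urlparse(referrer).hostname or "").lower()
    return AI_HOSTS.get(host)


if __name__ == "__main__":
    print(ai_source("https://example.com/blog/x?utm_source=chatgpt.com"))
    print(ai_source("https://example.com/blog/x", "https://www.perplexity.ai/search?q=qr+menu"))
    print(ai_source("https://example.com/blog/x", "https://www.google.com/"))
```

Referrers are often stripped by browsers and apps, so a tagged URL parameter tends to be the more reliable of the two signals.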

A key finding: traffic from AI answers has a 3.2% CTR, nearly double (about 1.8x) Google's 1.8% organic average. Users who read an AI answer and still click through are already deeply interested.
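The figures in this section mix absolute counts, percentage increases, and rates, so it helps to put them on one scale. Here is a short sketch of the arithmetic using only the numbers quoted above; the function names are hypothetical.

```python
def growth_multiplier(before: float, after: float) -> float:
    """How many times larger the 'after' value is than the 'before' value."""
    return after / before


def multiplier_from_percent_increase(percent: float) -> float:
    """Convert a percentage increase into a multiplier (a +100% rise is 2x)."""
    return 1 + percent / 100


chatgpt = growth_multiplier(1200, 3600)             # exactly 3.0
perplexity = multiplier_from_percent_increase(180)  # about 2.8
claude = multiplier_from_percent_increase(220)      # about 3.2
ctr_ratio = 3.2 / 1.8                               # about 1.78

if __name__ == "__main__":
    print(f"ChatGPT references: {chatgpt:.1f}x")
    print(f"Perplexity citations: {perplexity:.1f}x")
    print(f"Claude.ai mentions: {claude:.1f}x")
    print(f"AI vs. organic CTR: {ctr_ratio:.2f}x")
```

On this scale all three assistants land close to the 3x lift, while the CTR comparison works out to a bit under 2x.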

FAQ

Doesn't allowing GPTBot mean my content gets stolen? No. GPTBot feeds training, while pages fetched for ChatGPT's live search show up as source links in answers, lifting brand visibility. Being scraped is the price of entering the training set; in return, you keep earning referrals for as long as your pages are cited.

Are CCBot and AhrefsBot AI crawlers? CCBot belongs to Common Crawl, whose open dataset is a training source for many LLMs, so yes, allow it. AhrefsBot and SemrushBot serve SEO tools, not AI; blocking them saves bandwidth.

Is Schema.org markup required? It is not required for crawling, but we strongly recommend it. AI bots can read JSON-LD directly, and pages with Article, FAQPage, and BreadcrumbList schema get cited twice as often.
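For a page like this FAQ, the FAQPage markup mentioned above could be generated along the following lines. This is a minimal sketch: faq_jsonld and as_script_tag are hypothetical helpers, and only the Schema.org types FAQPage, Question, and Answer are taken from the vocabulary itself.

```python
import json
from typing import List, Tuple


def faq_jsonld(pairs: List[Tuple[str, str]]) -> str:
    """Build Schema.org FAQPage JSON-LD from (question, answer) pairs."""
    data = {
        "@context": "https://schema.org",
        "@type": "FAQPage",
        "mainEntity": [
            {
                "@type": "Question",
                "name": question,
                "acceptedAnswer": {"@type": "Answer", "text": answer},
            }
            for question, answer in pairs
        ],
    }
    return json.dumps(data, indent=2, ensure_ascii=False)


def as_script_tag(jsonld: str) -> str:
    """Wrap JSON-LD for embedding in HTML.

    "</" is escaped as "<\\/" (still valid JSON) so that answer text cannot
    close the script element early.
    """
    safe = jsonld.replace("</", "<\\/")
    return f'<script type="application/ld+json">\n{safe}\n</script>'


if __name__ == "__main__":
    pairs = [
        ("Are CCBot and AhrefsBot AI crawlers?",
         "CCBot feeds Common Crawl; AhrefsBot serves an SEO tool."),
    ]
    print(as_script_tag(faq_jsonld(pairs)))
```

The resulting script tag goes in the page's HTML; the question and answer text should match what visitors actually see on the page.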

Related articles

7 Smart Ways to Place QR Codes in Your Restaurant

How to Reduce Waiter Workload by 40% Without Firing Anyone

12 Concrete Benefits of QR Menus (Backed by Real Data)