Your sitemap is configured. Your Core Web Vitals score is green. Your product catalog is perfectly structured. And yet when a user asks ChatGPT for products you sell, your store doesn’t appear.

Most of the time, the reason is a single file: robots.txt.

Specifically, a robots.txt written for Google in 2019 and never updated for the AI crawlers that now determine your visibility in ChatGPT, Gemini, Claude, and Perplexity. In 2026, there are ten distinct AI bots across five platforms (OpenAI, Anthropic, Perplexity, Google, and Apple). Most Magento installations have explicit rules for none of them.

How to Fix robots.txt for ChatGPT and Gemini in Magento 2

Why robots.txt Is AEO Signal #1

The angeo/module-aeo-audit checks robots.txt first and marks it Critical because it is a gate. Every other AEO signal — llms.txt, Product schema, AI product feed — is irrelevant if the AI crawler cannot reach your store in the first place.

OpenAI states this without ambiguity:

“Sites that are opted out of OAI-SearchBot will not be shown in ChatGPT search answers.”

Not “may not appear.” Will not appear. If OAI-SearchBot is blocked — by an explicit Disallow or caught in a wildcard rule — your store is excluded from ChatGPT search answers regardless of everything else you do.


The Three Types of AI Bots — Why the Difference Matters

Before listing every bot, you need to understand what each one actually does. AI crawlers fall into three distinct categories with very different purposes — and conflating them causes the most common robots.txt misconfiguration.

| Type | What it does | Examples | Ecommerce recommendation |
|---|---|---|---|
| Search & indexing bots | Builds the live index used when users ask AI questions. Cites sources, links back to your store. Directly drives product discovery. | OAI-SearchBot, Claude-SearchBot, PerplexityBot, Google-Extended | Always allow |
| User-initiated fetchers | Fetches your page when a specific user asks AI to visit a URL directly. May cite your product page in the response. | ChatGPT-User, Claude-User, Perplexity-User | Always allow |
| Training crawlers | Collects content to train future AI models. No attribution, no direct traffic back to your store. | GPTBot, ClaudeBot, Applebot-Extended | Your choice |

The most common mistake: blocking GPTBot (training) while believing it removes you from ChatGPT search results. It does not. GPTBot and OAI-SearchBot are entirely separate bots with separate purposes. Blocking training crawlers has zero effect on AI search visibility — but blocking search crawlers makes you invisible immediately.


Every AI Bot That Matters for Magento in 2026

| Bot | Platform | Type | Impact if blocked |
|---|---|---|---|
| OAI-SearchBot | ChatGPT / OpenAI | Search index | Invisible in all ChatGPT search answers and product recommendations. Most critical. |
| GPTBot | ChatGPT / OpenAI | Training | Content excluded from future GPT training data. Does not affect current search visibility. |
| ChatGPT-User | ChatGPT / OpenAI | User-initiated | ChatGPT cannot fetch your pages when a user requests them directly. |
| Claude-SearchBot | Claude / Anthropic | Search index | Invisible in Claude’s real-time web search answers. |
| ClaudeBot | Claude / Anthropic | Training | Content excluded from future Claude training data. |
| Claude-User | Claude / Anthropic | User-initiated | Claude cannot fetch your pages when a user requests them directly. |
| PerplexityBot | Perplexity | Search index | Invisible in Perplexity answers and product recommendations. |
| Perplexity-User | Perplexity | User-initiated | Perplexity cannot fetch your pages for direct user requests. |
| Google-Extended | Gemini / Google | Control token (training + grounding) | Content not used for Gemini model training or grounding. Google documents it as a robots.txt token rather than a separate crawler, and states that it does not affect inclusion in Google Search. |
| Applebot-Extended | Apple Intelligence | Training | Content excluded from Apple Intelligence training data. |
| anthropic-ai | Anthropic | Deprecated | Legacy name for ClaudeBot. Keep rules for backwards compatibility. |

Anthropic expanded to three bots in early 2026. Sites that only reference ClaudeBot in robots.txt are now missing Claude-SearchBot (live search) and Claude-User (user-initiated fetching). If your robots.txt was last updated before 2026, this almost certainly applies to your store.


The Default Magento robots.txt Problem

Magento’s default robots.txt starts with a wildcard block:

User-agent: *
Disallow: /index.php/
Disallow: /*?
Disallow: /checkout/
Disallow: /app/
...

This wildcard group is the baseline for every bot that has no User-agent group of its own. If your deployment script, hosting provider, or a staging migration has added Disallow: / to it, every AI bot without an explicit group is blocked silently, with no error logged anywhere.

Check this right now. Open https://yourstore.com/robots.txt in a browser and look for Disallow: / on its own line. If it sits in the User-agent: * group, every AI bot that lacks its own User-agent group with Allow: / is blocked. This affects the majority of Magento stores checked by the AEO audit module.
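The manual check above can be scripted. The following is a minimal sketch, not part of Magento or the audit module: `bots_without_group` is a hypothetical helper name, and the bot list is simply copied from the table in this article. It reads a robots.txt from stdin and prints every listed bot that has no User-agent line of its own, meaning that bot falls back to the wildcard group.

```shell
#!/usr/bin/env bash
# Hypothetical helper: list AI bots that have no User-agent group of their own.
# Reads robots.txt content from stdin. Bots printed here fall back to "User-agent: *".
bots_without_group() {
  # Bot tokens taken from the table in this article; extend as needed.
  local bots="OAI-SearchBot GPTBot ChatGPT-User Claude-SearchBot ClaudeBot Claude-User PerplexityBot Perplexity-User Google-Extended Applebot-Extended"
  local content bot
  content=$(tr -d '\r')   # tolerate CRLF line endings
  for bot in $bots; do
    # Case-insensitive match of "User-agent: <bot>", allowing a trailing comment
    if ! printf '%s\n' "$content" |
        grep -qiE "^[[:space:]]*user-agent:[[:space:]]*${bot}[[:space:]]*(#.*)?$"; then
      printf '%s\n' "$bot"
    fi
  done
}
```

Assuming the file is publicly reachable, a typical call would be `curl -s https://yourstore.com/robots.txt | bots_without_group`. Empty output means every listed bot has an explicit group.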


Where Magento Stores robots.txt — Two Scenarios

Before editing, identify which method your store uses to serve the file. Editing the wrong one has no effect.

Scenario A: Magento Admin (most common)

Magento can serve robots.txt dynamically from the database. Check whether this is active:

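# Check if Magento manages robots.txt (custom instructions stored in config)
# Add --scope=stores --scope-code=<code> to check a specific store view
bin/magento config:show design/search_engine_robots/custom_instructions

If it returns a value, Magento holds custom robots.txt instructions in its configuration. Edit them via: Content → Design → Configuration → [Store view] → Edit → Search Engine Robots → Edit custom instruction of robots.txt.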

Scenario B: Static file in pub/

If Magento Admin changes don’t show up at yourstore.com/robots.txt, a physical file is taking precedence. Check for it:

# Check if a static file exists and is being served
ls -la /var/www/html/pub/robots.txt
curl -I https://yourstore.com/robots.txt
# If no X-Magento headers appear, the file is served statically

Edit pub/robots.txt directly, or remove it to let Magento’s Admin configuration take over.

Multi-store installations: Each store view can have its own robots.txt in Magento Admin. If you run multiple stores on different domains or subdomains, configure each one separately under Content → Design → Configuration → [select store view]. Do not assume one configuration covers all stores.


The Complete robots.txt Configuration for Magento 2

This is the recommended configuration for Magento ecommerce stores in 2026. It explicitly allows all AI search and indexing bots, gives you a clear choice on training crawlers, and keeps Magento-specific paths protected.

# ============================================================
# AI SEARCH & INDEXING BOTS — Allow (critical for visibility)
# These build the index ChatGPT, Claude, Gemini, and Perplexity
# use to answer product discovery questions.
# Blocking these makes your store invisible in AI search answers.
# ============================================================
User-agent: OAI-SearchBot
Allow: /

User-agent: Claude-SearchBot
Allow: /

User-agent: PerplexityBot
Allow: /

User-agent: Google-Extended
Allow: /

# ============================================================
# USER-INITIATED FETCHERS — Allow
# Fetch your pages when a user directly asks AI about a URL.
# ============================================================
User-agent: ChatGPT-User
Allow: /

User-agent: Claude-User
Allow: /

User-agent: Perplexity-User
Allow: /

# ============================================================
# TRAINING CRAWLERS — your choice
# These collect content for model training. No direct traffic
# back, no attribution. Blocking them does NOT affect search
# visibility. Change Allow to Disallow if you prefer to opt out.
# ============================================================
User-agent: GPTBot
Allow: /

User-agent: ClaudeBot
Allow: /

User-agent: anthropic-ai
Allow: /

User-agent: Applebot-Extended
Allow: /

# ============================================================
# TRADITIONAL SEARCH ENGINES
# ============================================================
User-agent: Googlebot
Allow: /

User-agent: Bingbot
Allow: /

# ============================================================
# ALL OTHER BOTS — Standard Magento rules
# AI bots above are explicitly allowed before this wildcard.
# ============================================================
User-agent: *
Allow: /

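# Magento paths: blocked only for crawlers that use this wildcard group.
# A bot with its own group above ignores these lines; repeat them in that
# group if they should apply to it as well.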
Disallow: /admin/
Disallow: /adminhtml/
Disallow: /api/
Disallow: /rest/
Disallow: /graphql
Disallow: /cron.php
Disallow: /var/
Disallow: /lib/
Disallow: /dev/
Disallow: /index.php/
Disallow: /*?SID=
Disallow: /*?___store=
Disallow: /checkout/
Disallow: /customer/
Disallow: /wishlist/
Disallow: /review/

# ============================================================
# SITEMAPS — helps all crawlers discover your pages
# ============================================================
Sitemap: https://yourstore.com/sitemap.xml
# Note: llms.txt is not a sitemap format, so crawlers may ignore this line.
Sitemap: https://yourstore.com/llms.txt

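Order matters less than it seems. Under RFC 9309, a compliant crawler does not stop at the first User-agent block it meets. It selects the group whose User-agent line names it most specifically, wherever that group sits in the file, and falls back to User-agent: * only when no group names it. Two consequences follow. A bot with no group of its own inherits everything in the wildcard group, including a stray Disallow: /. A bot with its own group ignores the wildcard group entirely, so the Magento Disallow paths above do not apply to it unless you repeat them in its group. Listing the AI bot entries before the wildcard block remains a sensible convention, because not every parser is guaranteed to follow the RFC, but it is a safeguard rather than a requirement.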
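To see which rules a given crawler would actually obey, group selection as RFC 9309 describes it can be sketched in shell. This is a simplified illustration under stated assumptions: `rules_for_bot` is a hypothetical helper, it matches the crawler token exactly (case-insensitively), and it does not implement path-level longest-match evaluation. It prints the rules of the crawler's own group if one exists, otherwise the rules of the wildcard group.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the Allow/Disallow rules that apply to one crawler
# token, using RFC 9309 group selection (own group if present, else "*").
# Reads robots.txt content from stdin.
rules_for_bot() {
  local bot="$1"
  tr -d '\r' | awk -v bot="$bot" '
    BEGIN { want = tolower(bot) }
    { sub(/#.*/, ""); gsub(/^[ \t]+|[ \t]+$/, "") }   # strip comments and outer whitespace
    $0 == "" { next }
    {
      sep = index($0, ":"); if (sep == 0) next
      key = tolower(substr($0, 1, sep - 1)); val = substr($0, sep + 1)
      gsub(/^[ \t]+|[ \t]+$/, "", key); gsub(/^[ \t]+|[ \t]+$/, "", val)
    }
    key == "user-agent" {
      if (in_rules) { own = 0; star = 0; in_rules = 0 }   # a new group begins
      if (tolower(val) == want) { own = 1; found = 1 }
      if (val == "*") star = 1
      next
    }
    key == "allow" || key == "disallow" {
      in_rules = 1
      if (own)  specific = specific key ": " val "\n"
      if (star) fallback = fallback key ": " val "\n"
      next
    }
    END { printf "%s", (found ? specific : fallback) }
  '
}
```

For example, `curl -s https://yourstore.com/robots.txt | rules_for_bot OAI-SearchBot` shows whether that bot is governed by its own group or by the wildcard rules. The output is the same whichever group comes first in the file.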


How to Update robots.txt in Magento Admin

1. Open Design Configuration. Log in to Magento Admin. Navigate to Content → Design → Configuration.
2. Select your store view. Click Edit next to the store view you want to configure. For multi-store setups: repeat for each store view separately.
3. Open Search Engine Robots. Scroll to the Search Engine Robots section and expand it.
4. Paste the configuration. In “Edit custom instruction of robots.txt file”, paste the complete configuration from above. Replace yourstore.com with your actual domain in the Sitemap lines.
5. Save and flush cache. Click Save Configuration. Then run: bin/magento cache:flush
6. Verify the output. Open https://yourstore.com/robots.txt in a browser and confirm OAI-SearchBot, Claude-SearchBot, and PerplexityBot all have Allow: /. If the file hasn’t changed, Scenario B above applies: a static file is overriding the Admin configuration.

How to Update robots.txt via SSH

If Magento Admin is not managing the file, or if you prefer a direct file edit:

# SSH into your server
ssh user@yourserver.com

# Navigate to Magento pub directory
cd /var/www/html/pub

# Backup existing file
cp robots.txt robots.txt.backup.$(date +%Y%m%d)

# Edit the file
nano robots.txt
# Paste the configuration, save with Ctrl+O → Enter → Ctrl+X

# Verify it's live
curl -s https://yourstore.com/robots.txt | grep -E "OAI-SearchBot|Claude-SearchBot|PerplexityBot"

Four Mistakes That Block AI Bots in Magento

Mistake 1 — Disallow: / left on from a staging environment

Staging environments use Disallow: / to prevent Google indexing. This is frequently copied to production during deployments and never removed.

Where it comes from:

  • Magento Admin: Content → Design → Configuration → [Store view] → Search Engine Robots
  • Static file: Manually edited pub/robots.txt from a staging copy
  • Deployment scripts: CI/CD pipelines that sync the full staging filesystem to production
  • Hosting provider defaults: Managed hosts (Hypernode, Nexcess, Cloudways) sometimes apply restrictive defaults on new environments

# Quick check: if this returns output, you have a problem
curl -s https://yourstore.com/robots.txt | grep -n "^Disallow: /$"
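Because this mistake usually arrives through a deployment, the same check can run as a pipeline step before the file goes live. The sketch below is an assumption about how such a guard might look, not an existing Magento or CI feature; `check_no_blanket_disallow` is a hypothetical name. It flags a blanket Disallow: / in any group, so a store that deliberately blocks one specific bot this way would need to adapt it.

```shell
#!/usr/bin/env bash
# Hypothetical CI guard: return non-zero when a robots.txt file contains a
# blanket "Disallow: /", so a staging file cannot reach production unnoticed.
check_no_blanket_disallow() {
  local file="$1"
  # Case-insensitive, tolerant of CRLF endings, extra spaces, and trailing comments
  if tr -d '\r' < "$file" |
      grep -qiE '^[[:space:]]*disallow:[[:space:]]*/[[:space:]]*(#.*)?$'; then
    echo "robots.txt contains a blanket Disallow: /" >&2
    return 1
  fi
  return 0
}
```

In a deployment script this could be called as `check_no_blanket_disallow pub/robots.txt || exit 1`, where the path is whatever location your pipeline uses.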

Mistake 2 — Wildcard block placed before AI bot rules

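# RISKY: fine for an RFC 9309-compliant crawler, which uses its own group
# wherever it appears, but a top-to-bottom parser may stop at the wildcard
User-agent: *
Disallow: /

User-agent: OAI-SearchBot
Allow: /  # applies to OAI-SearchBot under RFC 9309, regardless of position

# SAFER: explicit AI rules before the wildcard
User-agent: OAI-SearchBot
Allow: /

User-agent: *
Disallow: /checkout/

A compliant crawler obeys only the most specific group that names it, regardless of position, so the first layout does not block OAI-SearchBot by itself. The risk is twofold: a parser that does not follow RFC 9309 may read top to bottom and stop at the wildcard, and any AI bot without a group of its own is caught by the wildcard Disallow: /. Putting explicit AI bot groups first costs nothing and covers both cases.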

Mistake 3 — Using outdated Anthropic bot names

# Deprecated — these names no longer reflect Anthropic's bot infrastructure
User-agent: Claude-Web    # retired 2024
User-agent: Anthropic-AI  # retired 2024

# Current names (2026)
User-agent: ClaudeBot        # training
User-agent: Claude-SearchBot # live search index
User-agent: Claude-User      # user-initiated fetching

Stores that added Anthropic rules in 2024 and haven’t updated since are missing Claude-SearchBot entirely — the bot responsible for Claude’s real-time web search answers.

Mistake 4 — robots.txt served as a cached or static file

Some infrastructure configurations bypass Magento completely when serving robots.txt:

  • Nginx static rule: serves pub/robots.txt before the request reaches Magento
  • CDN cache: Cloudflare or Fastly cache the old file with a long TTL
  • Varnish: cached response served without hitting the application

Diagnosis:

# If Magento headers are absent, the file is served statically
curl -I https://yourstore.com/robots.txt
# Look for: X-Magento-Cache-Control, X-Magento-Tags, or similar

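Fix for Nginx: remove or rename pub/robots.txt if Magento should generate the file, then ensure a location block like this is present in your server config so the request falls through to Magento when no static file exists: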

location = /robots.txt {
    try_files $uri $uri/ /index.php$is_args$args;
}

Fix for Cloudflare: purge the cache for /robots.txt via the Cloudflare dashboard, or set a Cache Rule to bypass caching for that path.
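When diagnosing which layer serves the file, it helps to reduce the response headers to the ones that reveal caching. The filter below is a small sketch; `cache_headers` is a hypothetical name, and the header list (Age, Cache-Control, Cloudflare's CF-Cache-Status, the X-Cache header many CDNs and Varnish setups emit, and any X-Magento-* header) is an assumption about what your stack returns.

```shell
#!/usr/bin/env bash
# Hypothetical helper: keep only cache-related headers from `curl -sI` output.
# Reads raw response headers from stdin.
cache_headers() {
  tr -d '\r' |
    grep -iE '^(age|cache-control|cf-cache-status|x-cache|x-magento-[a-z-]+):'
}
```

A typical call would be `curl -sI https://yourstore.com/robots.txt | cache_headers`. A cache hit with a large Age value suggests the old file is still being served from the edge.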


How to Verify the Fix

Option 1 — Manual curl check

# Check all critical AI search bots have Allow: /
curl -s https://yourstore.com/robots.txt | grep -A1 -E \
  "OAI-SearchBot|Claude-SearchBot|PerplexityBot|Google-Extended"

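# Check server logs for AI crawler activity (confirms bots are visiting).
# Google-Extended is omitted: Google documents it as a robots.txt token with
# no user-agent string of its own, so it does not appear in access logs.
grep -Ei "OAI-SearchBot|Claude-SearchBot|PerplexityBot" \
  /var/log/nginx/access.log | tail -20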
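A per-bot count is often easier to read than raw log lines. The following sketch assumes the crawler token appears somewhere in each access-log line, as it does in the common combined log format; `count_ai_hits` is a hypothetical helper, and it loads its input into memory, so feed it a bounded slice of the log.

```shell
#!/usr/bin/env bash
# Hypothetical helper: count access-log lines per AI crawler token.
# Reads log lines from stdin and prints "<bot> <count>" for each bot.
count_ai_hits() {
  local input bot
  input=$(cat)
  for bot in OAI-SearchBot ChatGPT-User GPTBot Claude-SearchBot Claude-User \
             ClaudeBot PerplexityBot Perplexity-User; do
    # grep -c prints 0 when nothing matches; -F treats the token literally
    printf '%s %s\n' "$bot" "$(printf '%s\n' "$input" | grep -ciF -- "$bot")"
  done
}
```

For instance, `tail -n 50000 /var/log/nginx/access.log | count_ai_hits` summarizes recent activity. A zero next to a search bot after several days is a hint to recheck robots.txt and any firewall or CDN bot rules.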

Option 2 — AEO audit module (checks all bots + order validation)

# Install once, run any time
composer require angeo/module-aeo-audit
bin/magento setup:upgrade
bin/magento angeo:aeo:audit

✓ PASS  robots.txt — AI Bot Access
        OAI-SearchBot ✓  Claude-SearchBot ✓  ChatGPT-User ✓
        ClaudeBot ✓  Claude-User ✓
        PerplexityBot ✓  Google-Extended ✓

The audit checks not just whether each bot is listed, but whether the order of rules is correct — validating that AI bot entries appear before any wildcard Disallow directives.

Checking multiple stores

# Run per store view for multi-store installations
bin/magento angeo:aeo:audit --store=de
bin/magento angeo:aeo:audit --store=fr
bin/magento angeo:aeo:audit --store=en

After robots.txt — What’s Next

robots.txt is the access layer — it gets bots into your store. Once AI crawlers can reach your pages, the signals that determine whether you actually appear in AI product answers are:

  • llms.txt — a curated, machine-readable index of your store structure and top products, formatted for LLMs. Install with angeo/module-llms-txt.
  • Product JSON-LD schema — structured markup that tells AI exactly what your products cost, what they are, and where to buy them. Checked by angeo/module-aeo-audit.
  • AI product feed — structured CSV/JSON feed required for ChatGPT Shopping eligibility. Generate with angeo/module-openai-product-feed.
  • FAQPage schema — increases citation probability in conversational AI answers for category and how-to queries.

See all eight AEO signals and your complete score in one command:

bin/magento angeo:aeo:audit


