URL: /rules/ax/ai-crawlers

---
title: "AI Crawler Access"
description: "Reports which AI-agent crawlers robots.txt allows or blocks"
---

Reports which AI-agent crawlers (GPTBot, ClaudeBot, PerplexityBot, Google-Extended, …) the `robots.txt` at your site root allows or blocks. AI assistants and answer engines reach your content through these named user-agents, so blocking one keeps your pages out of the corresponding tool.

| | |
|---|---|
| **Rule ID** | `ax/ai-crawlers` |
| **Category** | [Agent Experience](/rules/ax) |
| **Scope** | Site-wide |
| **Severity** | info |
| **Weight** | 1/10 |

<Note>This rule is a **recommendation** — it never penalizes your score. Blocking AI crawlers is a legitimate choice; the rule just makes the current policy visible.</Note>

## What it checks

For each well-known AI crawler, the rule reads your parsed `robots.txt` and reports whether the user-agent is **allowed** or **fully blocked** (a `Disallow: /` rule with no `Allow: /` rule re-permitting access). A group that names the user-agent takes precedence over the wildcard `*` group, which matches how crawlers themselves resolve `robots.txt`.

Crawlers covered include:

- **OpenAI** — `GPTBot` (training), `OAI-SearchBot` (search), `ChatGPT-User` (live fetch)
- **Anthropic** — `ClaudeBot` (training), `Claude-User` / `Claude-SearchBot`, `anthropic-ai`
- **Google** — `Google-Extended` (Gemini/Vertex training)
- **Common Crawl** — `CCBot`
- **Perplexity** — `PerplexityBot`, `Perplexity-User`
- **Others** — `Applebot-Extended`, `Bytespider`, `Amazonbot`, `Meta-ExternalAgent`, `cohere-ai`, `DuckAssistBot`, `MistralAI-User`, `AI2Bot`, `Diffbot`, `YouBot`
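
The resolution described above can be sketched in a few lines of Python. This is an illustrative approximation under stated assumptions, not the rule's actual implementation: the helper names `parse_groups` and `is_fully_blocked` are hypothetical, and only the `User-agent`, `Allow`, and `Disallow` fields are considered.

```python
def parse_groups(robots_txt: str) -> dict[str, list[tuple[str, str]]]:
    """Map each lowercased user-agent token to its (directive, path) rules."""
    groups: dict[str, list[tuple[str, str]]] = {}
    current_agents: list[str] = []
    in_rules = False  # True once the current group has seen a rule line
    for raw in robots_txt.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if ":" not in line:
            continue
        field, value = (part.strip() for part in line.split(":", 1))
        field = field.lower()
        if field == "user-agent":
            if in_rules:  # a User-agent line after rules starts a new group
                current_agents = []
                in_rules = False
            current_agents.append(value.lower())
            groups.setdefault(value.lower(), [])
        elif field in ("allow", "disallow"):
            in_rules = True
            for agent in current_agents:
                groups[agent].append((field, value))
    return groups


def is_fully_blocked(groups: dict[str, list[tuple[str, str]]], agent: str) -> bool:
    """Hypothetical check: `Disallow: /` with no re-permitting `Allow: /`."""
    # A group that names the agent wins over the wildcard group.
    rules = groups.get(agent.lower(), groups.get("*", []))
    return ("disallow", "/") in rules and ("allow", "/") not in rules


ROBOTS_TXT = """
User-agent: *
Disallow: /

User-agent: ChatGPT-User
Allow: /
"""

groups = parse_groups(ROBOTS_TXT)
for agent in ("GPTBot", "ChatGPT-User"):
    status = "fully blocked" if is_fully_blocked(groups, agent) else "allowed"
    print(f"{agent}: {status}")
```

In this sample file `GPTBot` has no group of its own, so it inherits the wildcard block, while `ChatGPT-User` is governed only by its own group.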

## Solution

If you want AI visibility, make sure your `robots.txt` does not apply `Disallow: /` to these user-agents. To opt out of model training while staying answerable in assistants, block the training bots (`GPTBot`, `ClaudeBot`, `Google-Extended`, `CCBot`) but keep the live-fetch and search agents (`ChatGPT-User`, `Claude-User`, `OAI-SearchBot`, `PerplexityBot`) allowed.

```txt robots.txt
# Block training, allow live answer engines
User-agent: GPTBot
Disallow: /

User-agent: Google-Extended
Disallow: /

User-agent: ChatGPT-User
Allow: /

User-agent: PerplexityBot
Allow: /
```
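
Before deploying a policy like this, it can help to check it locally. The sketch below uses Python's standard-library `urllib.robotparser` to test root-path access for each agent in the example above. The `root_access` helper is a hypothetical name, and the stdlib parser's matching may differ in detail from the rule's own logic, so treat the result as a sanity check rather than a prediction of the report.

```python
from urllib.robotparser import RobotFileParser

ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: Google-Extended
Disallow: /

User-agent: ChatGPT-User
Allow: /

User-agent: PerplexityBot
Allow: /
"""


def root_access(robots_txt: str, agents: list[str]) -> dict[str, bool]:
    """Return whether each agent may fetch "/" under the given robots.txt."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())  # parse in memory, no network fetch
    return {agent: parser.can_fetch(agent, "/") for agent in agents}


policy = root_access(
    ROBOTS_TXT,
    ["GPTBot", "Google-Extended", "ChatGPT-User", "PerplexityBot"],
)
for agent, allowed in policy.items():
    print(f"{agent}: {'allowed' if allowed else 'blocked'}")
```

This only tests the root path, so it would not detect a file that blocks individual sections of the site.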

## Enable / Disable

### Disable this rule

```toml squirrel.toml
[rules]
disable = ["ax/ai-crawlers"]
```

### Enable only this rule

```toml squirrel.toml
[rules]
enable = ["ax/ai-crawlers"]
disable = ["*"]
```
