/agents-traffic
Who reads g9n.io, and how
A transparency page about the AI agents and LLM crawlers we expect to see, the organisation behind each, and our policy stance on them. Read alongside /agents-policy (the long-form rules) and /for-ai (the machine-readable company brief).
In one line
g9n.io welcomes AI agents. Every major public crawler in the table below is allowed across the public surface. The only paths we block are write-side APIs (/api/contact, /api/revalidate) and admin routes; those blocks exist because the paths are operationally sensitive, not because we are anti-AI.
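As an illustration only, the rule above could be expressed as a small path check. This is a sketch, not the site's actual middleware or robots.txt: the `/admin` prefix and the name `isBlockedForAgents` are assumptions, since this page does not spell out the admin paths.

```typescript
// Hypothetical blocked prefixes. "/api/contact" and "/api/revalidate" come from
// this page; "/admin" is an ASSUMED prefix standing in for "admin routes".
const BLOCKED_PREFIXES: readonly string[] = [
  "/api/contact",
  "/api/revalidate",
  "/admin",
];

// Returns true when the path is one of the blocked prefixes or sits below one.
// Matching on "prefix + /" avoids blocking look-alikes such as "/administrator".
function isBlockedForAgents(pathname: string): boolean {
  return BLOCKED_PREFIXES.some(
    (prefix) => pathname === prefix || pathname.startsWith(prefix + "/"),
  );
}
```

Everything that is not under one of these prefixes, including the read-only /api endpoints, stays open in this sketch.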
Recognised agents
Our middleware tags requests by User-Agent and writes a structured log entry per request, including bot identification, so an agent's reads are visible to the team even when no human is watching. The list below is the canonical set we recognise; bots not on this list fall under the default * rule in robots.txt.
| User-Agent | Org | Purpose | g9n's stance |
|---|---|---|---|
| GPTBot | OpenAI | Training data collection for ChatGPT models | allow |
| ChatGPT-User | OpenAI | In-context browsing by ChatGPT when a user asks a question | allow |
| OAI-SearchBot | OpenAI | SearchGPT: indexing for the answer engine | allow |
| ClaudeBot | Anthropic | Training data for Claude models | allow |
| Claude-Web | Anthropic | Claude.ai in-context browsing for user-asked tasks | allow |
| anthropic-ai | Anthropic | Legacy / general Anthropic bot identifier | allow |
| PerplexityBot | Perplexity | Indexing for the Perplexity answer engine | allow |
| Perplexity-User | Perplexity | In-context browsing on behalf of a Perplexity user | allow |
| Google-Extended | Google | Bard / Gemini / Vertex training opt-in (separate from Googlebot) | allow |
| Applebot-Extended | Apple | Apple Intelligence / Siri training opt-in | allow |
| CCBot | Common Crawl Foundation | Common Crawl, a public training corpus used by many AI labs | allow |
| cohere-ai | Cohere | Training data for Cohere foundation models | allow |
| Bytespider | ByteDance | TikTok / Doubao training data | allow |
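To make the tagging step concrete, here is a minimal sketch of how a request's User-Agent header could be matched against the table and turned into a structured log entry. It is not our production middleware: the names `identifyAgent`, `buildLogEntry`, and the `AgentLogEntry` fields are hypothetical, and the case-insensitive substring match is an assumed strategy.

```typescript
// Tokens and organisations copied from the table above.
// Note: Google-Extended and Applebot-Extended are documented by their operators
// as robots.txt control tokens, so they may never appear in a request header.
const RECOGNISED_AGENTS: ReadonlyArray<readonly [string, string]> = [
  ["GPTBot", "OpenAI"],
  ["ChatGPT-User", "OpenAI"],
  ["OAI-SearchBot", "OpenAI"],
  ["ClaudeBot", "Anthropic"],
  ["Claude-Web", "Anthropic"],
  ["anthropic-ai", "Anthropic"],
  ["PerplexityBot", "Perplexity"],
  ["Perplexity-User", "Perplexity"],
  ["Google-Extended", "Google"],
  ["Applebot-Extended", "Apple"],
  ["CCBot", "Common Crawl Foundation"],
  ["cohere-ai", "Cohere"],
  ["Bytespider", "ByteDance"],
];

// Hypothetical log shape; the real entry format is not published on this page.
interface AgentLogEntry {
  timestamp: string;
  path: string;
  agent: string | null;
  org: string | null;
}

// Case-insensitive substring match of each known token against the header.
function identifyAgent(userAgent: string): { agent: string; org: string } | null {
  const ua = userAgent.toLowerCase();
  for (const [token, org] of RECOGNISED_AGENTS) {
    if (ua.includes(token.toLowerCase())) return { agent: token, org };
  }
  return null; // unrecognised: handled by the default * rule
}

// Builds one structured entry per request; `now` is injectable for testing.
function buildLogEntry(userAgent: string, path: string, now: Date = new Date()): AgentLogEntry {
  const match = identifyAgent(userAgent);
  return {
    timestamp: now.toISOString(),
    path,
    agent: match ? match.agent : null,
    org: match ? match.org : null,
  };
}
```

A User-Agent header is self-reported, so a match here identifies what a client claims to be, not who it verifiably is.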
Live traffic dashboard
Coming soon
We tag every request by bot identity in our edge middleware. The raw stream is already structured and queryable; the public counters and charts on this page will be wired up next. Until then, the per-request log is internal-only.
If you operate an AI crawler and would like to verify how your bot is treated, email info [at] g9n.io and we'll share the data for your User-Agent.
Stable endpoints for agents
- GET /api/company-context: canonical JSON snapshot (append ?lang=en for English)
- GET /api/agents-data: index + all machine-readable markdown bundled
- GET /api/agents-data/<file>.md: individual file
- GET /llms.txt: LLM crawler manifest
- GET /llms-full.txt: aggregated dump of all /agents-data
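For agent authors, a minimal client sketch for the endpoints listed above. It assumes the origin is `https://g9n.io` and that the JSON snapshot can be treated as an opaque value; the helper names and the example file name are hypothetical, and the response schema is not described on this page.

```typescript
// Assumed origin; this page names the site as g9n.io.
const BASE_URL = "https://g9n.io";

// Builds the company-context URL, optionally adding ?lang=<code>.
function companyContextUrl(lang?: string): string {
  const url = new URL("/api/company-context", BASE_URL);
  if (lang) url.searchParams.set("lang", lang);
  return url.toString();
}

// Builds the URL for one markdown file under /api/agents-data/<file>.md.
// The file name is caller-supplied; no real file names are assumed here.
function agentsDataFileUrl(file: string): string {
  return new URL(`/api/agents-data/${encodeURIComponent(file)}.md`, BASE_URL).toString();
}

// Fetches the JSON snapshot. The return type is `unknown` because the
// schema is not published on this page.
async function fetchCompanyContext(lang?: string): Promise<unknown> {
  const res = await fetch(companyContextUrl(lang), {
    headers: { Accept: "application/json" },
  });
  if (!res.ok) throw new Error(`Request failed with HTTP ${res.status}`);
  return res.json();
}
```

Calling `fetchCompanyContext("en")` would request the English snapshot; the URL helpers are kept separate so they can be checked without touching the network.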
Operator contact
If your bot isn't on this list and you'd like it to be — or if you need an allow-list adjustment — please reach info [at] g9n.io. We update this page when crawler operators publicly announce a new User-Agent.
Last updated: 2026-05-26 · Policy: /agents-policy