Build a robots.txt file visually, with templates for WordPress, Next.js, blocking AI crawlers, and more.
The robots.txt file is a plain text file placed at the root of your website (e.g. https://yourdomain.com/robots.txt) that instructs search engine crawlers which parts of your site to crawl and which to skip. It is part of the Robots Exclusion Protocol, a voluntary standard that well-behaved bots follow.
A correctly configured robots.txt helps your SEO by directing Google's crawl budget toward your important pages and away from thin, duplicate, or sensitive content. A misconfigured robots.txt can accidentally block crawlers from your entire site, for example with a stray `Disallow: /` under `User-agent: *`. This is a catastrophic SEO error that is more common than you might think.
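A quick way to catch the "whole site blocked" mistake is to test the file before deploying it. The sketch below uses Python's standard `urllib.robotparser`; the helper name `blocks_whole_site` and the default user agent are illustrative assumptions, not part of any tool described here.

```python
from urllib.robotparser import RobotFileParser


def blocks_whole_site(robots_txt: str, user_agent: str = "Googlebot") -> bool:
    """Return True if `user_agent` may not fetch the site root ("/").

    `blocks_whole_site` is a hypothetical helper for illustration only.
    """
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network request is made.
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch(user_agent, "/")


if __name__ == "__main__":
    risky = "User-agent: *\nDisallow: /\n"
    safe = "User-agent: *\nDisallow: /admin/\n"
    print("risky file blocks the site:", blocks_whole_site(risky))
    print("safe file blocks the site:", blocks_whole_site(safe))
```

A check like this could run in a deploy script or CI job, so that a blocking rule never reaches production unnoticed.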
    # This is a comment
    User-agent: *              # Applies to all crawlers
    Disallow: /admin/          # Block this path
    Disallow: /api/            # Block API endpoints
    Allow: /api/public/        # But allow this specific path

    User-agent: Googlebot      # Rule only for Google
    Allow: /                   # Google can crawl everything

    User-agent: GPTBot         # Block OpenAI's crawler
    Disallow: /                # From everything

    Sitemap: https://example.com/sitemap.xml   # Your sitemap
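To see how the example rules above apply to different crawlers, you can load them into Python's standard `urllib.robotparser` and ask which URLs each user agent may fetch. This is a minimal sketch: `ExampleBot` is a made-up crawler name standing in for "any other bot", and `is_allowed` is a hypothetical helper.

```python
from urllib.robotparser import RobotFileParser

# The same rules as the example, with the inline comments stripped.
ROBOTS_TXT = """\
User-agent: *
Disallow: /admin/
Disallow: /api/
Allow: /api/public/

User-agent: Googlebot
Allow: /

User-agent: GPTBot
Disallow: /

Sitemap: https://example.com/sitemap.xml
"""


def is_allowed(user_agent: str, url: str) -> bool:
    """Check one URL against ROBOTS_TXT for the given user agent."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    checks = [
        ("ExampleBot", "https://example.com/blog/post"),
        ("ExampleBot", "https://example.com/admin/settings"),
        ("Googlebot", "https://example.com/admin/settings"),
        ("GPTBot", "https://example.com/blog/post"),
    ]
    for agent, url in checks:
        print(f"{agent:12} {url:40} allowed={is_allowed(agent, url)}")
```

Note that a crawler with its own group (here Googlebot and GPTBot) follows only that group and ignores the `User-agent: *` rules. Also be aware that `urllib.robotparser` evaluates rules in file order, while Google uses the most specific (longest) matching rule, so the two can disagree on paths such as `/api/public/`.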
Since 2023, many website owners have begun blocking AI training crawlers from scraping their content. The major AI crawlers and their user-agent strings:
| Crawler | Company | Purpose | Block? |
|---|---|---|---|
| GPTBot | OpenAI | ChatGPT training data | Your choice |
| Google-Extended | Google | Bard/Gemini training | Your choice |
| CCBot | Common Crawl | Open dataset (used by many AI) | Your choice |
| anthropic-ai | Anthropic | Claude training data | Your choice |
| Googlebot | Google | Search indexing | No, kills SEO |
| Bingbot | Microsoft | Bing search | No, kills Bing traffic |
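If you decide to block the AI training crawlers in the table while leaving search crawlers alone, the resulting file is one `Disallow: /` group per crawler. The following sketch generates such a file; the function name `build_robots_txt` and the `AI_CRAWLERS` list are assumptions for illustration, and the user-agent tokens are simply those listed above.

```python
from typing import Iterable, Optional

# User-agent tokens taken from the table above; adjust to your own policy.
AI_CRAWLERS = ["GPTBot", "Google-Extended", "CCBot", "anthropic-ai"]


def build_robots_txt(blocked_agents: Iterable[str],
                     sitemap_url: Optional[str] = None) -> str:
    """Build a robots.txt that blocks `blocked_agents` and allows the rest."""
    lines = []
    for agent in blocked_agents:
        # One group per crawler, blocking the whole site for that crawler.
        lines += [f"User-agent: {agent}", "Disallow: /", ""]
    # Every other crawler (including Googlebot and Bingbot) stays allowed.
    lines += ["User-agent: *", "Allow: /"]
    if sitemap_url:
        lines += ["", f"Sitemap: {sitemap_url}"]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(build_robots_txt(AI_CRAWLERS, "https://example.com/sitemap.xml"))
```

Keep in mind that robots.txt is voluntary: this only affects crawlers that choose to honor it.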