Robots.txt Generator

Build a robots.txt file visually, with templates for WordPress, Next.js, blocking AI crawlers, and more.

What Is robots.txt and Why Does It Matter for SEO?

The robots.txt file is a plain text file placed at the root of your website (e.g. https://yourdomain.com/robots.txt) that tells search engine crawlers which parts of your site they may crawl and which to skip. It is part of the Robots Exclusion Protocol, a voluntary standard that well-behaved bots follow.
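Because the file always lives at the root of the host, you can work out its address from any page URL. Here is a minimal Python sketch of that idea; the function name robots_txt_url is just an illustrative choice, not part of any library.

```python
from urllib.parse import urlsplit, urlunsplit


def robots_txt_url(page_url: str) -> str:
    """Return the robots.txt URL that governs the given page URL."""
    parts = urlsplit(page_url)
    # Keep the scheme and host (including any port), swap in /robots.txt,
    # and drop the query string and fragment.
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


print(robots_txt_url("https://yourdomain.com/blog/post?id=7"))
# prints https://yourdomain.com/robots.txt
```

Each subdomain and port counts as its own host, so blog.yourdomain.com needs its own robots.txt.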

A correctly configured robots.txt helps your SEO by directing Google's crawl budget toward your important pages and away from thin, duplicate, or low-value content. A misconfigured robots.txt can accidentally block your entire site from being crawled, a catastrophic SEO error that is more common than you might think.
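One way to catch the "whole site blocked" mistake before deploying is a quick automated check. The sketch below uses Python's standard urllib.robotparser; the helper name blocks_whole_site and the two sample files are made up for illustration.

```python
from urllib.robotparser import RobotFileParser


def blocks_whole_site(robots_txt: str, agent: str = "Googlebot") -> bool:
    """Return True if the given crawler may not fetch the homepage."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch(agent, "/")


# Two hypothetical files: the first blocks everything, the second only /admin/.
risky = "User-agent: *\nDisallow: /\n"
safe = "User-agent: *\nDisallow: /admin/\n"

print(blocks_whole_site(risky))  # True
print(blocks_whole_site(safe))   # False
```

A check like this only tests the homepage, so treat it as a smoke test rather than a full audit.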

robots.txt Syntax Reference

# This is a comment
User-agent: *              # Applies to all crawlers
Disallow: /admin/          # Block this path
Disallow: /api/            # Block API endpoints
Allow: /api/public/        # But allow this specific path

User-agent: Googlebot      # Rule only for Google
Allow: /                   # Google can crawl everything

User-agent: GPTBot         # Block OpenAI's crawler
Disallow: /                # From everything

Sitemap: https://example.com/sitemap.xml   # Your sitemap
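In the example above, /api/public/ stays crawlable even though /api/ is blocked. That is because crawlers following the current standard (RFC 9309) apply the most specific matching rule, meaning the longest path, and prefer Allow when two rules tie. Here is a simplified Python sketch of that precedence logic. It assumes plain path prefixes only and ignores the * and $ wildcards; is_allowed is an illustrative name.

```python
def is_allowed(rules: list[tuple[str, str]], path: str) -> bool:
    """Decide whether `path` may be crawled under one user-agent group.

    `rules` holds (directive, path_prefix) pairs such as ("Disallow", "/api/").
    """
    best_len = -1
    allowed = True  # no matching rule means the path may be crawled
    for directive, prefix in rules:
        # An empty prefix matches nothing; skip rules that do not apply.
        if not prefix or not path.startswith(prefix):
            continue
        is_allow = directive.lower() == "allow"
        # The longer prefix wins; on equal length, Allow wins.
        if len(prefix) > best_len or (len(prefix) == best_len and is_allow):
            best_len = len(prefix)
            allowed = is_allow
    return allowed


# The "User-agent: *" group from the syntax reference.
star_group = [
    ("Disallow", "/admin/"),
    ("Disallow", "/api/"),
    ("Allow", "/api/public/"),
]

print(is_allowed(star_group, "/api/public/docs"))  # True
print(is_allowed(star_group, "/api/private"))      # False
```

Not every parser works this way. Python's built-in urllib.robotparser, for instance, checks rules in file order and stops at the first match, so it is safest to list specific Allow lines before the broader Disallow they carve out.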

Blocking AI Crawlers — The 2024 Trend

Since 2023, many website owners have begun blocking AI training crawlers from scraping their content. The major AI crawlers and their user-agent strings:

Crawler	Company	Purpose	Block?
GPTBot	OpenAI	ChatGPT training data	Your choice
Google-Extended	Google	Gemini (formerly Bard) training	Your choice
CCBot	Common Crawl	Open dataset (used by many AI)	Your choice
anthropic-ai	Anthropic	Claude training data (Anthropic's newer crawler identifies as ClaudeBot)	Your choice
Googlebot	Google	Search indexing	No — kills SEO
Bingbot	Microsoft	Bing search	No — kills Bing traffic
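If you decide to block the AI crawlers, one group with several User-agent lines and a single Disallow: / is enough. The Python sketch below builds such a group and checks that search crawlers stay unaffected. The token list is copied from the first column of the table, and block_group is an illustrative helper name.

```python
from urllib.robotparser import RobotFileParser

# User-agent tokens taken from the table above; adjust to your own policy.
AI_CRAWLERS = ["GPTBot", "Google-Extended", "CCBot", "anthropic-ai"]


def block_group(agents: list[str]) -> str:
    """Build one robots.txt group that blocks every listed crawler."""
    lines = [f"User-agent: {agent}" for agent in agents]
    lines.append("Disallow: /")
    return "\n".join(lines) + "\n"


# Blocked AI crawlers first, then a catch-all group that allows everyone else.
robots_txt = block_group(AI_CRAWLERS) + "\nUser-agent: *\nAllow: /\n"
print(robots_txt)

# Sanity check with the standard-library parser.
parser = RobotFileParser()
parser.parse(robots_txt.splitlines())
print(parser.can_fetch("GPTBot", "/articles/"))     # False
print(parser.can_fetch("Googlebot", "/articles/"))  # True
```

Keep the User-agent lines of a group together with no blank lines between them, since some parsers treat a blank line as the end of the group.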

Does robots.txt affect my Google rankings?

Not directly: robots.txt controls crawling, not ranking signals. But if you block important pages from crawling, Google cannot read their content, so they are very unlikely to rank for anything useful. A blocked URL can still show up in results without a description if other pages link to it. Never block pages you want to appear in search results.

What is the difference between robots.txt Disallow and the noindex meta tag?

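Disallow prevents crawling, so Google never reads the page. noindex is read by Google after crawling and tells it not to include the page in search results. Use noindex (not robots.txt Disallow) when you want a page kept out of search results, and leave that page crawlable so Google can actually see the tag. Use Disallow to save crawl budget on pages Google has no reason to fetch, keeping in mind that a disallowed URL can still be indexed without its content if other pages link to it.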
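A common mistake is combining the two: if a page carries noindex but is also disallowed, the crawler never fetches it and never sees the tag. The Python sketch below flags that conflict. The helper name hidden_noindex_pages, the sample robots.txt, and the page paths are all hypothetical.

```python
from urllib.robotparser import RobotFileParser


def hidden_noindex_pages(robots_txt: str, noindex_paths: list[str],
                         agent: str = "Googlebot") -> list[str]:
    """Return the noindex pages that the crawler is not allowed to fetch."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    # A disallowed page is never fetched, so its noindex tag is never read.
    return [path for path in noindex_paths if not parser.can_fetch(agent, path)]


sample_robots = "User-agent: *\nDisallow: /private/\n"
pages_with_noindex = ["/private/report.html", "/thank-you.html"]

print(hidden_noindex_pages(sample_robots, pages_with_noindex))
# prints ['/private/report.html']
```

For any page this reports, either remove the Disallow rule so the noindex tag can be read, or accept that the URL may still appear in results.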

Can malicious bots ignore my robots.txt?

Yes. robots.txt is a voluntary standard. Legitimate search engines (Google, Bing) respect it. Malicious scrapers, spam bots, and content thieves may ignore it entirely. Use server-level blocking (firewall rules, rate limiting) for actual security. robots.txt is an SEO and crawl budget tool, not a security measure.
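To give a feel for what server-level blocking means, here is a small Python WSGI middleware sketch that rejects requests whose User-agent header contains a blocked token. The token list, hello_app, and block_user_agents are hypothetical names for illustration. Since bad bots can fake their user-agent, treat this as a first filter next to firewall rules and rate limiting, not a replacement for them.

```python
# Hypothetical tokens; replace them with the bots you actually see in your logs.
BLOCKED_AGENT_TOKENS = ("badbot", "contentthief")


def block_user_agents(app, blocked=BLOCKED_AGENT_TOKENS):
    """Wrap a WSGI app and answer 403 to any blocked user-agent."""
    def middleware(environ, start_response):
        agent = environ.get("HTTP_USER_AGENT", "").lower()
        if any(token in agent for token in blocked):
            start_response("403 Forbidden", [("Content-Type", "text/plain")])
            return [b"Forbidden\n"]
        # Everyone else reaches the real application unchanged.
        return app(environ, start_response)
    return middleware


def hello_app(environ, start_response):
    """Stand-in for your real site."""
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"Hello\n"]


guarded = block_user_agents(hello_app)
```

Unlike robots.txt, this is enforced by your server: a blocked bot gets a 403 whether or not it chose to read your rules.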