Robots.txt Generator
Generate robots.txt files with custom Allow and Disallow rules for web crawlers
# Generated by freetool24.com
User-agent: *
Disallow: /admin/
Disallow: /private/
Allow: /
Sitemap: https://example.com/sitemap.xml
User-agent: *
Applies the rule to all bots and web crawlers globally.
Disallow: /
Prevents the targeted bot from crawling any page on the site.
Free Robots.txt Generator
A robots.txt file is a plain-text file at the root of your website that instructs crawlers — Googlebot, Bingbot, and others — which pages to crawl or skip. This robots.txt generator lets you build valid Allow, Disallow, User-agent, and Sitemap directives visually, then copy the file instantly.
How to create a robots.txt file
1. Set your Sitemap URL
Enter your sitemap address, such as https://yoursite.com/sitemap.xml. Search engines use it to discover your most important URLs.
2. Choose a User-agent
Use * to target all crawlers, or name a specific bot such as Googlebot, Bingbot, Googlebot-Image, or GPTBot.
3. Add Disallow rules
List the paths you want crawlers to skip, such as /admin/, /private/, /cart/, or internal search results.
4. Add Allow rules
Use Allow to reopen a specific sub-path inside a blocked folder, such as allowing /admin/help/ while blocking /admin/.
5. Copy and deploy
Copy the generated robots.txt file and upload it to the root of your domain: https://yourdomain.com/robots.txt.
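The five steps above can be sketched in a few lines of Python. This is only an illustration of the output format, not the generator's actual implementation: `RuleGroup` and `build_robots_txt` are hypothetical names, and writing Allow lines before Disallow lines is an assumption made so that parsers which apply the first matching rule still honor the reopened sub-path.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class RuleGroup:
    """One User-agent group with its paths (hypothetical helper, not a real API)."""
    user_agent: str = "*"
    disallow: List[str] = field(default_factory=list)
    allow: List[str] = field(default_factory=list)


def build_robots_txt(groups: List[RuleGroup], sitemap_url: Optional[str] = None) -> str:
    """Render rule groups and an optional sitemap URL as robots.txt text."""
    lines: List[str] = []
    for group in groups:
        lines.append(f"User-agent: {group.user_agent}")
        # Allow lines go first so first-match parsers see the exception
        # before the broader Disallow rule (an assumption of this sketch).
        for path in group.allow:
            lines.append(f"Allow: {path}")
        for path in group.disallow:
            lines.append(f"Disallow: {path}")
        if not group.allow and not group.disallow:
            # An empty Disallow value means "nothing is blocked".
            lines.append("Disallow:")
        lines.append("")  # blank line separates groups
    if sitemap_url:
        lines.append(f"Sitemap: {sitemap_url}")
    return "\n".join(lines).rstrip("\n") + "\n"


example = build_robots_txt(
    [RuleGroup("*", disallow=["/admin/", "/cart/"], allow=["/admin/help/"])],
    sitemap_url="https://yoursite.com/sitemap.xml",
)
print(example)
```

Saving the returned text as robots.txt at the root of the domain corresponds to step 5.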
Common robots.txt templates
Allow all crawlers
User-agent: *
Disallow:
Gives search engines full crawl access.
Block a private folder
User-agent: *
Disallow: /admin/
Keeps crawlers out of admin or staging sections.
Add sitemap discovery
Sitemap: https://example.com/sitemap.xml
Points crawlers to your canonical sitemap.
Target one bot
User-agent: Googlebot
Disallow: /tmp/
Applies the rule only to Googlebot.
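One way to sanity-check the four templates above is Python's standard-library `urllib.robotparser`. The sketch below parses each template from a string, so it makes no network request. The helper name `load` and the sample URLs are assumptions for illustration, `site_maps()` needs Python 3.8 or later, and real crawlers may interpret edge cases differently from this parser.

```python
from urllib.robotparser import RobotFileParser


def load(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text held in memory (hypothetical helper)."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


allow_all = load("User-agent: *\nDisallow:\n")
block_admin = load("User-agent: *\nDisallow: /admin/\n")
with_sitemap = load("Sitemap: https://example.com/sitemap.xml\n")
googlebot_only = load("User-agent: Googlebot\nDisallow: /tmp/\n")

# Allow all crawlers: an empty Disallow blocks nothing.
print(allow_all.can_fetch("*", "https://example.com/any/page"))             # True

# Block a private folder: only URLs under /admin/ are refused.
print(block_admin.can_fetch("*", "https://example.com/admin/users"))        # False
print(block_admin.can_fetch("*", "https://example.com/blog/"))              # True

# Add sitemap discovery: the Sitemap line is exposed as a list of URLs.
print(with_sitemap.site_maps())                                             # ['https://example.com/sitemap.xml']

# Target one bot: the rule applies to Googlebot and to no other crawler.
print(googlebot_only.can_fetch("Googlebot", "https://example.com/tmp/x"))   # False
print(googlebot_only.can_fetch("Bingbot", "https://example.com/tmp/x"))     # True
```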
Common mistakes to avoid
Blocking the whole site
Disallow: / blocks crawling of every URL on the site for the selected bot. Use it only for private or staging sites.
Using robots.txt for secrets
Robots.txt is publicly readable, so every path you list in it is visible to anyone who opens the file. Do not rely on it to hide private paths or sensitive endpoints; protect those with authentication instead.
Forgetting the root location
The file must live at /robots.txt on the host it controls. A robots file in a subfolder is ignored.
Confusing crawl and index
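Robots.txt controls crawling, not indexing. To keep a page out of search results, use noindex on that page and leave it crawlable, because a crawler that is blocked from fetching the page never sees the noindex instruction.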
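The first mistake in this list is easy to catch before deploying. The following is a minimal sketch of such a check, assuming a hypothetical helper named `find_full_site_blocks`. It only looks for a bare `Disallow: /` inside each User-agent group and does not account for Allow rules that reopen parts of the site.

```python
from typing import List


def find_full_site_blocks(robots_txt: str) -> List[str]:
    """Return the user-agents whose group contains a bare `Disallow: /`."""
    blocked: List[str] = []
    current_agents: List[str] = []
    in_rules = False  # True once the current group has seen an Allow/Disallow line
    for raw in robots_txt.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if ":" not in line:
            continue
        key, value = (part.strip() for part in line.split(":", 1))
        key = key.lower()
        if key == "user-agent":
            if in_rules:
                # A User-agent line after rules starts a new group.
                current_agents = []
                in_rules = False
            current_agents.append(value)
        elif key in ("allow", "disallow"):
            in_rules = True
            if key == "disallow" and value == "/":
                blocked.extend(a for a in current_agents if a not in blocked)
    return blocked


staging = "User-agent: *\nDisallow: /\n"
production = "User-agent: *\nDisallow: /admin/\nSitemap: https://example.com/sitemap.xml\n"
print(find_full_site_blocks(staging))     # ['*']
print(find_full_site_blocks(production))  # []
```

A non-empty result on a production file is a signal to review the rules before uploading.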
FAQ
Does robots.txt prevent indexing?
Robots.txt controls crawling, not indexing. A URL can still appear in search results if Google discovers it from links. Use a noindex meta tag to prevent indexing.
Where does robots.txt go?
At the root of your domain — https://yourdomain.com/robots.txt. Subdirectory placement does not work.
Should I include my sitemap in robots.txt?
Yes. A Sitemap directive helps search engines discover your canonical sitemap URL quickly.
Is robots.txt case-sensitive?
Yes. Disallow: /Admin/ and Disallow: /admin/ are treated as different paths by most crawlers.
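The case-sensitivity answer can be checked with a short sketch using Python's standard-library `urllib.robotparser`. The example.com URLs are placeholders, and this parser is only one implementation, so other crawlers are not guaranteed to behave identically.

```python
from urllib.robotparser import RobotFileParser

parser = RobotFileParser()
parser.parse(["User-agent: *", "Disallow: /admin/"])

# The rule was written in lowercase, so only the lowercase path matches it.
lower_allowed = parser.can_fetch("*", "https://example.com/admin/")
upper_allowed = parser.can_fetch("*", "https://example.com/Admin/")

print(lower_allowed)  # False: /admin/ is blocked
print(upper_allowed)  # True: /Admin/ is a different path and stays crawlable
```

If both spellings exist on your site, each one needs its own Disallow line.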