
How to fix: robots.txt reachable

Why this matters

robots.txt is the standard mechanism for telling crawlers what they can and can't fetch. If the file is missing (a 404), crawlers assume nothing is off limits and crawl everything they can find.

Background

A reachable /robots.txt tells crawlers which paths to skip and where the sitemap lives. Even an effectively empty, allow-all robots.txt is better than a 404: it signals intent and makes the file inspectable.
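To make the difference concrete, here is a minimal sketch of how a parser reads an allow-all file versus one that skips a path. It uses Python's standard-library urllib.robotparser; the example.com URLs and the /admin/ path are placeholders, not taken from any real site.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical allow-all file: an empty Disallow value blocks nothing.
ALLOW_ALL = """\
User-agent: *
Disallow:

Sitemap: https://example.com/sitemap.xml
"""

# Hypothetical file that asks crawlers to skip one path prefix.
SKIP_ADMIN = """\
User-agent: *
Disallow: /admin/
"""


def parse_robots(text: str) -> RobotFileParser:
    """Parse robots.txt content that is already in memory (no network access)."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


open_site = parse_robots(ALLOW_ALL)
restricted = parse_robots(SKIP_ADMIN)

# Allow-all: every path may be fetched, and the sitemap URL is discoverable.
print(open_site.can_fetch("*", "https://example.com/any/path"))
print(open_site.site_maps())

# Restricted: paths under /admin/ are off limits, everything else is allowed.
print(restricted.can_fetch("*", "https://example.com/admin/settings"))
print(restricted.can_fetch("*", "https://example.com/blog/"))
```

site_maps() requires Python 3.8 or newer. Real crawlers may apply matching rules that differ in detail from this parser, so treat the output as an approximation.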

References

RFC 9309 (Robots Exclusion Protocol) · Google Search Central

How to fix

Instructions for each stack we cover. Pick the one matching your server or framework.

nginx

Place a robots.txt file in the site's root directory (the path set by the root directive); nginx serves it as a static file.

apache

Same approach: place the file at DocumentRoot/robots.txt and Apache serves it as a static file.

cloudflare

With Pages or Workers, a Worker route for /robots.txt can return the robots.txt content as a plain-text response.

wordpress

SEO plugins such as Yoast and Rank Math can generate /robots.txt, or you can upload the file manually to the site root via FTP.

flask

Add an @app.route('/robots.txt') view that returns a plain-text response containing a User-agent: * line, your Allow / Disallow rules, and a Sitemap directive.
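A minimal sketch of such a route, assuming a standard Flask app object. The rules, the /admin/ path, and the example.com sitemap URL are placeholders to replace with your own.

```python
from flask import Flask, Response

app = Flask(__name__)

# Placeholder rules: allow everything except /admin/ and advertise a sitemap.
# Replace the path and the sitemap URL with values for your site.
ROBOTS_TXT = "\n".join(
    [
        "User-agent: *",
        "Allow: /",
        "Disallow: /admin/",
        "Sitemap: https://example.com/sitemap.xml",
        "",  # trailing newline
    ]
)


@app.route("/robots.txt")
def robots_txt() -> Response:
    """Serve the rules as text/plain so crawlers parse them as robots.txt."""
    return Response(ROBOTS_TXT, mimetype="text/plain")
```

If the rules never change, serving a static robots.txt file from the web server in front of Flask works just as well; the route is mainly useful when the content depends on configuration or environment.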

express

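Mount a static handler: app.use('/robots.txt', express.static(...)). Alternatively, put robots.txt in a directory that express.static already serves from the app root, and it becomes available at /robots.txt.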

rails

Place the file at public/robots.txt. Rails (or the web server in front of it) serves files in public/ as static assets.

Verify it's working

Run curl -sI https://your-site/robots.txt. The response status must be 200, not 404.
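If you'd rather script the check, for example in a deploy pipeline, the same test can be sketched with Python's standard library. The function name robots_status is our own, and https://your-site is the same placeholder used in the curl command.

```python
import urllib.error
import urllib.request


def robots_status(base_url: str, timeout: float = 10.0) -> int:
    """Return the HTTP status code for <base_url>/robots.txt.

    Sends a HEAD request, like curl -I. Unlike curl -I, urlopen follows
    redirects, so the value returned is the status of the final response.
    """
    url = base_url.rstrip("/") + "/robots.txt"
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # urlopen raises on 4xx/5xx responses; the code is still the answer.
        return err.code


# Example usage (placeholder host):
#   status = robots_status("https://your-site")
#   print("OK" if status == 200 else f"robots.txt returned {status}")
```

Network failures such as DNS errors or timeouts raise urllib.error.URLError instead of returning a status; in that case the file should be treated as unreachable too.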
