Nexora Web Design

What is Robots.txt?

A text file at /robots.txt that tells search engines which parts of your site to crawl and which to ignore.

Also known as: robots file

Robots.txt is a plain text file at the root of your domain (e.g., yoursite.com/robots.txt) that gives crawlers instructions. Most importantly, it can tell crawlers to stay out of specific directories, which is useful for admin panels, internal search result pages, faceted navigation, and other crawl-budget wasters. Compliance is voluntary: reputable search engine crawlers follow the rules, but nothing forces a bot to obey them.

A proper small-business robots.txt typically allows everything except /admin/, /api/, and any paths that serve duplicate content. It also lists the sitemap URL so crawlers can find the sitemap.
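As a rough illustration, the snippet below embeds a sample robots.txt along those lines and checks it with Python's standard-library parser. The domain yoursite.com, the sitemap path, and the test URLs are placeholders, not recommendations for any particular site. Note that `urllib.robotparser` does simple prefix matching and does not implement every extension that Google supports, so treat the results as an approximation.

```python
from urllib.robotparser import RobotFileParser

# A sample small-business robots.txt (placeholder domain and paths).
ROBOTS_TXT = """\
User-agent: *
Disallow: /admin/
Disallow: /api/

Sitemap: https://yoursite.com/sitemap.xml
"""

# Parse the rules from the string instead of fetching them over the network.
parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# Public pages stay crawlable; the admin and API paths do not.
print(parser.can_fetch("*", "https://yoursite.com/services/"))
print(parser.can_fetch("*", "https://yoursite.com/admin/login"))

# Sitemap URLs declared in the file (site_maps() needs Python 3.8+).
print(parser.site_maps())
```

To check a live file instead, `RobotFileParser` can fetch it with `set_url(...)` followed by `read()`.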

What robots.txt is NOT: a privacy mechanism. Blocking a URL in robots.txt doesn't keep it out of Google's index if other pages link to it; it only keeps Google from crawling it, so the bare URL can still show up in results. To keep a page out of the index, use a noindex meta tag (and don't block the page in robots.txt, since Google has to crawl it to see the noindex).
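The interaction between the two mechanisms can be sketched in a few lines of Python. The helper names `has_noindex` and `noindex_will_be_seen` are made up for this example, the URLs are placeholders, and the check only approximates how a real search engine evaluates a page.

```python
from html.parser import HTMLParser
from urllib.robotparser import RobotFileParser


class RobotsMetaParser(HTMLParser):
    """Collects directives from <meta name="robots"> style tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = {name: (value or "") for name, value in attrs}
        if attr.get("name", "").lower() in ("robots", "googlebot"):
            for directive in attr.get("content", "").split(","):
                directive = directive.strip().lower()
                if directive:
                    self.directives.add(directive)


def has_noindex(html):
    """True if the page carries a noindex robots meta tag."""
    meta = RobotsMetaParser()
    meta.feed(html)
    return "noindex" in meta.directives


def noindex_will_be_seen(robots_lines, url, html, agent="Googlebot"):
    """A noindex tag only helps if the crawler is allowed to fetch the page."""
    rules = RobotFileParser()
    rules.parse(robots_lines)
    return rules.can_fetch(agent, url) and has_noindex(html)


PAGE = '<html><head><meta name="robots" content="noindex, follow"></head></html>'
URL = "https://yoursite.com/old-page"

# Blocked in robots.txt: the crawler never reads the noindex tag.
print(noindex_will_be_seen(["User-agent: *", "Disallow: /old-page"], URL, PAGE))
# Not blocked: the crawler can fetch the page and see the tag.
print(noindex_will_be_seen(["User-agent: *", "Disallow: /admin/"], URL, PAGE))
```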

Why it matters

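An accidentally over-aggressive robots.txt, such as one containing a blanket Disallow: / rule, can stop search engines from crawling your entire site and wipe out its search visibility. Many small businesses have robots.txt files that block legitimate URLs without realizing it.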
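One way to catch this kind of mistake is to test a short list of pages you care about against the file. The sketch below does that with Python's standard library. The `blocked_urls` helper and the URL list are hypothetical, and Python's parser is simpler than Google's (no wildcard support, for example), so a clean result here is a sanity check rather than a guarantee.

```python
from urllib.robotparser import RobotFileParser


def blocked_urls(robots_txt, urls, agent="Googlebot"):
    """Return the URLs from `urls` that the given robots.txt text disallows."""
    rules = RobotFileParser()
    rules.parse(robots_txt.splitlines())
    return [url for url in urls if not rules.can_fetch(agent, url)]


# Placeholder list of pages that should always be crawlable.
IMPORTANT_PAGES = [
    "https://yoursite.com/",
    "https://yoursite.com/services/",
    "https://yoursite.com/contact/",
]

# A single overly broad rule blocks every page on the site.
OVERLY_BROAD = "User-agent: *\nDisallow: /\n"

# A narrower file blocks only the admin area.
NARROW = "User-agent: *\nDisallow: /admin/\n"

print(blocked_urls(OVERLY_BROAD, IMPORTANT_PAGES))
print(blocked_urls(NARROW, IMPORTANT_PAGES))
```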
