Robots.txt & Indexing

You don’t need every page crawled. But you do need the right ones.

What Is robots.txt?

Instructions for Crawlers

robots.txt is a plain text file at the root of your domain that tells search engine bots which URLs they can and can’t crawl.

Important: robots.txt controls crawling, not indexing. A disallowed URL can still appear in search results if other sites link to it; to keep a page out of the index entirely, use a noindex directive instead.
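As a quick illustration (the /private/ path is a hypothetical placeholder), this rule stops compliant bots from fetching anything under /private/, but it does not remove those URLs from Google's index:

# Stops crawling of /private/ -- does NOT deindex URLs already known to Google
User-agent: *
Disallow: /private/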

How It Works

Example:

User-agent: *
Disallow: /checkout/
Allow: /blog/

This tells every bot (User-agent: *) that it may crawl the whole site except URLs under /checkout/. The Allow: /blog/ line explicitly permits the blog, though anything not disallowed is crawlable by default.
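A fuller file might separate rules per bot and point crawlers at your sitemap. This is only a sketch; the paths, the bot name ExampleBot, and the sitemap URL are placeholders:

# Rules for all bots
User-agent: *
Disallow: /checkout/
Disallow: /cart/

# Stricter rules for one specific (hypothetical) bot
User-agent: ExampleBot
Disallow: /

# Non-group directive: where to find your sitemap
Sitemap: https://www.example.com/sitemap.xml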

Common robots.txt Mistakes

What NOT to Do

Checklist:

Don't block your whole site with Disallow: / unless you truly mean it (see the sketch after this list).
Don't block CSS or JavaScript files that Google needs to render your pages.
Don't use robots.txt to hide pages from search results; it stops crawling, not indexing.
Don't put the file anywhere except the root of your domain.
Don't forget that paths are case-sensitive: /Checkout/ and /checkout/ are different rules.
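The classic accident, sketched below: a single leftover rule from a staging setup that blocks every compliant crawler from the entire site.

# Blocks ALL compliant bots from the ENTIRE site -- often a staging
# leftover that was never removed before launch
User-agent: *
Disallow: /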


Best Practices

Crawl Smarter, Not Harder

Tips:

Reference your XML sitemap in robots.txt so crawlers find it quickly.
Keep the file small and readable; every rule should have a reason.
Test changes with Google Search Console's robots.txt report before shipping them.
Block crawl traps such as internal search and faceted-navigation URLs, not pages you want to rank.
Review the file after migrations or redesigns; stale rules quietly waste crawl budget.

A sensible baseline is sketched below.
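A minimal baseline that puts these tips together; the disallowed paths and sitemap URL are placeholders you'd swap for your own:

User-agent: *
# Keep bots out of crawl traps, not content
Disallow: /search/
Disallow: /checkout/

# Help crawlers find every URL you do want crawled
Sitemap: https://www.example.com/sitemap.xml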

Related Technical Pages

Build Around Your Stack

XML Sitemaps

Give search engines a complete, up-to-date list of the URLs you want crawled.

Indexing Basics

How pages get from crawled to indexed, and why some never make it.

Canonical Tags

Tell search engines which version of a duplicated page is the one that counts.

Control the Crawl

Don’t let Google waste crawl budget on URLs that don’t help you rank.