Robots.txt

Robots.txt is a text file in the root folder of a web domain, used to instruct search engine crawlers on how to crawl the pages of a website. In the context of SEO, robots.txt files are mainly used to dictate which pages crawlers can and cannot access.

Though the exact directives in a robots.txt file vary from one site to the next, they all follow the same basic format:

User-agent: X
Disallow: Y

The “User-agent:” field specifies which bot you’re instructing, while the “Disallow:” field lists the pages and sub-sections you don’t want that bot to access.

If, for example, you’re building a page on an ecommerce site that covers the details of a loyalty program, and the page is part of your sitemap but not yet ready to be seen, you may want to stop it from being crawled. In that case, your robots.txt file might look like this:

User-agent: Googlebot
Disallow: /rewards

Though it’s possible to give directives to some bots and not others, in most cases you’ll want to set the same rules for every bot. To do this, put an asterisk (*) in the “User-agent:” field, which sends the same instructions to any bot that visits your site, as in the sketch below.
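As a quick illustration (the “/cart/” and “/admin/” directories here are hypothetical, not taken from any real site), a wildcard rule set might look like this:

# Applies to every crawler that respects robots.txt
User-agent: *
Disallow: /cart/
Disallow: /admin/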

Robots.txt files can also help crawlers auto-discover your XML sitemap through the “Sitemap: [your sitemap URL]” directive, shown below. This lets crawlers from any search engine find and crawl your whole site without you having to manually submit your sitemap through each engine’s webmaster tools.
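For example, assuming a hypothetical site at www.example.com with its sitemap at /sitemap.xml, the directive would simply be the following line (the URL must be absolute):

Sitemap: https://www.example.com/sitemap.xml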

Remember that robots.txt files don’t need to be complex, and even larger sites often only have a couple of directives in their file. To maximise the chances of indexation and ranking, robots.txt should only be used to block sections that you never want to appear in search results, and to make it easier for crawlers to discover your sitemap.
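Putting those pieces together, a complete robots.txt for a small site might be nothing more than the following sketch, reusing the hypothetical paths and sitemap URL from the examples above:

User-agent: *
Disallow: /rewards
Disallow: /admin/

Sitemap: https://www.example.com/sitemap.xml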

Robots.txt FAQs

What is a robots.txt file used for?

Robots.txt files are used to give instructions to search engine crawlers that visit a website. They’re mainly used to block off content the site owner doesn’t want to show in search results, but can also be used to show crawlers the location of an XML sitemap.
