A robots.txt file is meant for crawlers and spiders. With robots.txt, webmasters can guide crawlers to specific areas of a website and keep them out of others by disallowing access to certain pages and folders. One can also specify a crawl delay, which limits how quickly a crawler requests pages from the site.
“The robots.txt file is placed in the root folder of the site.”
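Because the file always sits at the root, its address can be worked out from any page URL on the same site. The following is a minimal sketch in Python; the helper name robots_url and the example.com address are placeholders chosen for illustration, not part of any standard.

```python
from urllib.parse import urlsplit, urlunsplit

def robots_url(page_url):
    """Return the robots.txt address for the site that serves page_url."""
    parts = urlsplit(page_url)
    # Keep the scheme and host, replace the path, drop query and fragment.
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))

# example.com is a placeholder domain.
print(robots_url("http://www.example.com/blog/post.html?id=7"))
# prints http://www.example.com/robots.txt
```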
Webmasters can give crawlers three instructions, namely:
- Disallow: tells the robot which sections of the site it may not visit.
- Crawl-delay: sets how long the crawler should wait between requests. Not every crawler honors it.
- Sitemap: tells the crawler where to find the sitemap, which lists the pages of the site and their URLs.
The Sitemap instruction is very useful for search engine optimization.
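To see how a crawler might read these three instructions, here is a small sketch using Python's standard urllib.robotparser module. The robots.txt content, the ExampleBot name, and the example.com URLs are made up for illustration, and site_maps() assumes Python 3.8 or later.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content; the paths and URLs are placeholders.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
Crawl-delay: 10
Sitemap: http://www.example.com/sitemap.xml
"""

def read_directives(text, agent="ExampleBot"):
    """Parse robots.txt text and report what it says for one crawler."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return {
        "private_allowed": parser.can_fetch(agent, "/private/report.html"),
        "crawl_delay": parser.crawl_delay(agent),
        "sitemaps": parser.site_maps(),
    }

directives = read_directives(ROBOTS_TXT)
print(directives)
```

A real crawler would typically download the file with set_url() and read() instead of parsing a string, then wait crawl_delay seconds between requests.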
Example of robots.txt:
User-agent: *
Disallow: /javascript/
Disallow: /css/
Disallow: /images/
Sitemap: http://www.robotdevpaliwal.com/sitemap.xml
In this example, no robot, whatever its user-agent, may access the three folders javascript, css, and images.
The Sitemap instruction tells the robot the location of the sitemap file. The location should be given as a full URL.
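A file like this can also be generated rather than typed by hand. The sketch below builds the text from a list of folders and an optional sitemap address; build_robots_txt is a hypothetical helper written for this post, and the example.com URL is a placeholder.

```python
def build_robots_txt(disallowed_paths, sitemap_url=None, user_agent="*"):
    """Return robots.txt text with one Disallow line per path."""
    lines = [f"User-agent: {user_agent}"]
    lines += [f"Disallow: {path}" for path in disallowed_paths]
    if sitemap_url:
        # The sitemap location is written as a full URL.
        lines.append(f"Sitemap: {sitemap_url}")
    return "\n".join(lines) + "\n"

robots_text = build_robots_txt(
    ["/javascript/", "/css/", "/images/"],
    sitemap_url="http://www.example.com/sitemap.xml",  # placeholder URL
)
print(robots_text)
```

Each instruction goes on its own line, which is what crawlers expect when they read the file.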
Another example:
User-agent: *
Disallow: /
This does not allow any robot to crawl even a single page on the server.
Another example:
User-agent: xyzBot
Disallow: /
This disallows xyzBot from crawling the website. Other robots are not affected by this rule.
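The effect of a rule aimed at a single robot can be checked with Python's standard urllib.robotparser module. This is only a sketch: OtherBot is an invented name standing in for any crawler other than xyzBot.

```python
from urllib.robotparser import RobotFileParser

# The rules from the example above.
RULES = """\
User-agent: xyzBot
Disallow: /
"""

parser = RobotFileParser()
parser.parse(RULES.splitlines())

# xyzBot is blocked everywhere; a robot with another name is not.
xyz_allowed = parser.can_fetch("xyzBot", "/index.html")
other_allowed = parser.can_fetch("OtherBot", "/index.html")  # OtherBot is hypothetical
print(xyz_allowed, other_allowed)
# prints False True
```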
Another example:
User-agent: *
Disallow: /password.php
This disallows access to password.php in the root folder.
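A Disallow value is compared against the beginning of the requested path, which is why the rule above covers password.php in the root folder but not a file of the same name in a subfolder. The following simplified sketch shows that prefix check; is_disallowed is a hypothetical helper, and it deliberately ignores wildcards and Allow rules that some crawlers support.

```python
def is_disallowed(path, disallow_rules):
    """Return True if path starts with any non-empty Disallow value."""
    # An empty Disallow value blocks nothing, so it is skipped.
    return any(rule and path.startswith(rule) for rule in disallow_rules)

rules = ["/password.php"]

print(is_disallowed("/password.php", rules))          # prints True
print(is_disallowed("/password.php?reset=1", rules))  # prints True
print(is_disallowed("/admin/password.php", rules))    # prints False
```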



























































