What is robots.txt?
The robots.txt file is used by webmasters to give instructions to robots (crawlers) about how to crawl their site. These instructions include which directories may or may not be crawled, which robots are allowed and which are not, and the location of the sitemap.
Basic Configuration of robots.txt
To allow all bots to crawl your website, the robots.txt file will look like this:

User-agent: *
Allow: /

To disallow all robots from crawling your site, you can use the following code:

User-agent: *
Disallow: /

To disallow all bots except Google, you can use the following configuration:

User-agent: Googlebot
Allow: /

User-agent: *
Disallow: /

To disallow bots from crawling particular directories and pages, you can use the following code:

User-agent: *
Disallow: /admin/
Disallow: /css/
Disallow: /example.htm
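If you want to check how a crawler will interpret rules like these, Python's standard urllib.robotparser module can parse a robots.txt and answer "can this bot fetch this URL?" queries. Below is a minimal sketch that combines the Googlebot exception and the directory rules above; the bot name "SomeOtherBot" and the test URLs are only placeholders.

from urllib.robotparser import RobotFileParser

# Rules combining the Googlebot exception and the directory example above
rules = """\
User-agent: Googlebot
Allow: /

User-agent: *
Disallow: /admin/
Disallow: /css/
Disallow: /example.htm
"""

parser = RobotFileParser()
parser.parse(rules.splitlines())

# Googlebot matches its own group, so everything is allowed for it
print(parser.can_fetch("Googlebot", "http://skills2earn.com/admin/"))      # True
# Any other bot falls under the * group and is blocked from /admin/
print(parser.can_fetch("SomeOtherBot", "http://skills2earn.com/admin/"))   # False
print(parser.can_fetch("SomeOtherBot", "http://skills2earn.com/blog/"))    # True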
You can also add the link of your sitemap in the robots.txt file for auto-discovery using the following syntax:

Sitemap: http://skills2earn.com/sitemap.xml
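For auto-discovery to work, a crawler fetches robots.txt and reads any Sitemap lines it declares. A small sketch of that step in Python, assuming the file is reachable at the URL used above:

from urllib.request import urlopen

# Fetch robots.txt and collect any Sitemap directives it declares
with urlopen("http://skills2earn.com/robots.txt") as response:
    robots_txt = response.read().decode("utf-8")

sitemaps = [line.split(":", 1)[1].strip()
            for line in robots_txt.splitlines()
            if line.lower().startswith("sitemap:")]
print(sitemaps)   # e.g. ['http://skills2earn.com/sitemap.xml']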
Reference
www.sitemaps.org