[CentOS] favicon.ico and robots.txt

centos at 911networks.com centos at 911networks.com
Fri Aug 28 17:53:30 UTC 2009

On Fri, 28 Aug 2009 08:54:02 -0700
Taproot <webmaster at taproothosting.com> wrote:

> Robots.txt is a file that allows or denies robots from indexing or 
> crawling the site if they behave as they should.

That's a common misconception. Robots.txt does NOT allow or deny
anything. It only SUGGESTS what robots should or should not crawl.
It's up to the crawler to respect the robots.txt file.

The big ones like Google, Yahoo and Microsoft do follow the
instructions in the robots.txt file, but many others, especially the
ones harvesting emails, photos, etc., do not.
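
To illustrate the point, here is a minimal sketch of what a
well-behaved crawler does on its own side, using Python's standard
urllib.robotparser module. The robots.txt content, the "ExampleBot"
user agent, the example.com URLs and the is_allowed() helper are all
made up for illustration. Note that the check runs inside the crawler:
nothing on the server enforces it, and a harvester simply skips it.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, for illustration only.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""


def is_allowed(robots_txt, user_agent, url):
    """Return True if robots_txt suggests user_agent may fetch url.

    This is the voluntary check a polite crawler performs before
    requesting a page. The web server itself never runs it.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for path in ("/index.html", "/private/page.html"):
        url = "http://example.com" + path
        print(path, is_allowed(ROBOTS_TXT, "ExampleBot", url))
```

If you really need to keep something away from robots, you have to
enforce it on the server (authentication, access rules), not in
robots.txt.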

When the network has to work
