
http://www.robotstxt.org/orig.html says:

Disallow: /help disallows both /help.html and /help/index.html

Now, google.com/robots.txt lists:

Disallow: /search  
Allow: /search/about  

Running robotparser.py against Google's robots.txt returns False for both of the above paths.

Could somebody please explain to me what the use of Allow in Allow: /search/about is, given that it returns False because of the Disallow entry above it?

Romy

1 Answer


The module documentation for robotparser, and for its Python 3 counterpart urllib.robotparser, mentions that the module follows the original specification. That specification does not have an Allow directive; Allow is a non-standard extension. Some major crawlers support it, but you (obviously) don't have to support it to claim compliance.
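If you want to reproduce the result without fetching anything, here is a minimal sketch using Python 3's urllib.robotparser. The `User-agent: *` line, the `check` helper, and the example URL are assumptions added for illustration; only the two rules come from the question. As far as I can tell from CPython's implementation, rules are tried in file order and the first prefix match wins, so the earlier Disallow: /search decides the outcome before Allow: /search/about is ever consulted.

```python
from urllib.robotparser import RobotFileParser


def check(rules, url, agent="*"):
    """Parse robots.txt lines given inline and ask whether `agent` may fetch `url`.

    `check` is a hypothetical helper for this example, not part of the module.
    """
    parser = RobotFileParser()
    parser.parse(rules)  # parse() takes an iterable of lines; no network access
    return parser.can_fetch(agent, url)


# The two rules from the question, in the order Google lists them.
# The "User-agent: *" line is assumed so that the rules apply to any agent.
as_listed = [
    "User-agent: *",
    "Disallow: /search",
    "Allow: /search/about",
]

# Same rules with the more specific Allow line first.
allow_first = [
    "User-agent: *",
    "Allow: /search/about",
    "Disallow: /search",
]

url = "http://www.google.com/search/about"

result_as_listed = check(as_listed, url)
result_allow_first = check(allow_first, url)

print(result_as_listed)
print(result_allow_first)
```

Crawlers that implement the Allow extension typically pick the most specific (longest) matching rule rather than the first one, which is presumably why Google can list the rules in that order.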