Allow and Disallow in Robots.txt

Question

http://www.robotstxt.org/orig.html says:

Disallow: /help disallows both /help.html and /help/index.html

Now, google.com/robots.txt lists:

Disallow: /search  
Allow: /search/about

When I run robotparser.py against Google's robots.txt, it returns False for both of the paths above.

Could somebody please explain what the Allow in Allow: /search/about is for, given that it returns False because of the Disallow entry above it?

score 2 · Accepted Answer · answered Apr 11 '16 at 11:41

The documentation for the robotparser module and its Python 3 counterpart, urllib.robotparser, says that they follow the original specification. That specification has no Allow directive; Allow is a non-standard extension. Some major crawlers support it, but a parser does not have to support it to claim compliance with the original specification.
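To reproduce the result described in the question, here is a minimal sketch using Python 3's urllib.robotparser. The rules are the two lines quoted from Google's robots.txt; the `User-agent: *` line and the example.com host are assumptions added only so the snippet is self-contained and does not need network access.

```python
from urllib.robotparser import RobotFileParser

# Two-line excerpt from the question. The User-agent line is an assumption,
# added so the parser has a record to attach the rules to.
ROBOTS_TXT = """\
User-agent: *
Disallow: /search
Allow: /search/about
"""

parser = RobotFileParser()
# parse() takes an iterable of lines, so no HTTP request is made here.
parser.parse(ROBOTS_TXT.splitlines())

# example.com is a placeholder host; only the path is checked against the rules.
search_allowed = parser.can_fetch("*", "http://example.com/search")
about_allowed = parser.can_fetch("*", "http://example.com/search/about")

print(search_allowed)  # False
print(about_allowed)   # False
```

Both calls return False, which matches what the question reports: the Allow line does not rescue /search/about from the Disallow: /search entry that precedes it.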
