
I just ran into a robots.txt that looks like this:

User-agent: *
Disallow: /foobar

User-agent: badbot
Disallow: *

After disallowing only a few folders for all bots, does the more specific badbot record still apply?

Note: This question is merely for understanding the above ruleset. I know using robots.txt is not a proper security mechanism and I'm neither using nor advocating it.

  • If badbot does not check for robots.txt, you will have to find another way to block requests – Uriil Jul 08 '14 at 10:10
  • "I'm neither using nor advocating it." Oh, you really should. Of course this has nothing to do with security, but with managing the search index for your site. This becomes really important when doing some SEO. – feeela Jul 08 '14 at 10:28

1 Answer


Each bot only ever complies with at most a single record (block).

A block starts with one or more User-agent lines, typically followed by Disallow lines (at least one is required). Blocks are separated by blank lines.

A bot called "badbot" will look for a record with the line User-agent: badbot (or similar, as the bot "should be liberal in interpreting this field"). If no such record is found, it will look for a record with the line User-agent: *. If that doesn't exist either, the bot is allowed to do everything (= the default).
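To make that lookup order concrete, here is a minimal Python sketch of the selection logic. The function names parse_records and record_for are made up for illustration, and the "liberal" name matching is modelled as a simple substring heuristic, not what any particular crawler actually does:

def parse_records(robots_txt):
    """Split robots.txt into records (blocks separated by blank lines)."""
    records, agents, rules = [], [], []
    for line in robots_txt.splitlines() + [""]:  # trailing "" flushes the last block
        line = line.split("#", 1)[0].strip()     # drop comments and surrounding whitespace
        if not line:
            if agents:
                records.append((agents, rules))
            agents, rules = [], []
        elif ":" in line:
            field, value = (part.strip() for part in line.split(":", 1))
            if field.lower() == "user-agent":
                agents.append(value.lower())
            elif field.lower() == "disallow":
                rules.append(value)
    return records

def record_for(records, bot_name):
    """Return the one record that applies to bot_name, or None (= everything allowed)."""
    bot_name = bot_name.lower()
    for agents, rules in records:
        # the bot looks for its own name first ("liberal" substring match)
        if any(bot_name in a or a in bot_name for a in agents if a != "*"):
            return rules
    for agents, rules in records:
        if "*" in agents:  # fall back to the catch-all record
            return rules
    return None            # no record applies: the default is to allow everything

Run against the ruleset from the question:

records = parse_records("User-agent: *\nDisallow: /foobar\n\nUser-agent: badbot\nDisallow: *\n")
record_for(records, "BadBot")   # ['*']       -- only the second record applies
record_for(records, "somebot")  # ['/foobar'] -- falls back to the catch-all record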

So in your example, the bot called "badbot" will follow only the second record (you probably mean Disallow: / instead of Disallow: *), while all other bots only follow the first record.
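With that correction, the ruleset you probably wanted is:

User-agent: *
Disallow: /foobar

User-agent: badbot
Disallow: /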
