
I have a few questions regarding robots.txt:

  1. If I have the following line in robots.txt

    Disallow: /catalog/category/view/id/6

    will this block the url http://example.com/catalog/category/view/id/61 as well?

  2. If I have

    Disallow: /*education

    will this block the URL http://example.com/some/uri/education as well as http://example.com/some/uri/education/another/uri?

  3. What difference does it make whether I have / at the end of each rule?

  4. Is * necessary in Disallow: /disallowme* if I want to disallow all URLs that start with http://example.com/disallowme?

kidonchu

1 Answer


(Q1)

Disallow: /catalog/category/view/id/6

will block any URL whose path starts with /catalog/category/view/id/6. So yes, it will also block http://example.com/catalog/category/view/id/61.
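The prefix rule can be sketched in a few lines of Python. This is only an illustration of the matching described above, not the code of any real crawler; the helper name `is_blocked` is made up for this example, and it ignores details such as percent-encoding and `User-agent` groups.

```python
from urllib.parse import urlsplit


def is_blocked(url: str, disallow_path: str) -> bool:
    """Hypothetical helper: original-spec style prefix matching."""
    # An empty Disallow value blocks nothing.
    if not disallow_path:
        return False
    # Only the path part of the URL is compared against the rule.
    path = urlsplit(url).path or "/"
    return path.startswith(disallow_path)


rule = "/catalog/category/view/id/6"
print(is_blocked("http://example.com/catalog/category/view/id/6", rule))   # True
print(is_blocked("http://example.com/catalog/category/view/id/61", rule))  # True
print(is_blocked("http://example.com/catalog/category/view/id/7", rule))   # False
```

Because the comparison is a plain string prefix test, `/id/61`, `/id/600` and `/id/6/anything` are all caught by the same rule.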

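(Q3) A slash is just another character; there is nothing special about it. Because rules are matched as path prefixes, Disallow: /foo blocks /foo, /foobar and /foo/bar, while Disallow: /foo/ blocks only paths that start with /foo/.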

(Q2, Q4) The * character has no special meaning in the original robots.txt specification; it's just another character, like / and a. Some parsers (for example, Google's) use * for pattern matching. You'd have to check their documentation about it (each parser might implement this differently, as the original specification doesn't define it).

So parsers that follow the original specification will not block http://example.com/disallowme when following Disallow: /disallowme*. They would block, for example, http://example.com/disallowme*foo. As explained above, whatever you specify in Disallow is always a URL path prefix.
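To make the difference concrete, here is a rough Python sketch that compares the two interpretations. Both function names are invented for this example. `wildcard_match` is only an approximation of the pattern-matching extension some crawlers document (assuming `*` matches any run of characters and a trailing `$` anchors the end of the path); a real parser may behave differently, so check its documentation.

```python
import re


def literal_match(path: str, rule: str) -> bool:
    """Original-spec style: the rule is a plain path prefix, '*' is literal."""
    return bool(rule) and path.startswith(rule)


def wildcard_match(path: str, rule: str) -> bool:
    """Assumed extension: '*' matches any characters, trailing '$' anchors the end."""
    if not rule:
        return False
    anchored = rule.endswith("$")
    if anchored:
        rule = rule[:-1]
    # Escape everything except '*', which becomes '.*' in the regex.
    pattern = ".*".join(re.escape(part) for part in rule.split("*"))
    if anchored:
        pattern += "$"
    # re.match anchors at the start, so this is still a prefix match.
    return re.match(pattern, path) is not None


for rule, path in [
    ("/disallowme*", "/disallowme"),
    ("/disallowme*", "/disallowme*foo"),
    ("/*education", "/some/uri/education"),
    ("/*education", "/some/uri/education/another/uri"),
]:
    print(rule, path, literal_match(path, rule), wildcard_match(path, rule))
```

Under the literal reading only `/disallowme*foo` is blocked; under the wildcard reading all four paths are. Note that without the `*`, `Disallow: /disallowme` blocks every path starting with `/disallowme` under both readings, so the trailing `*` is not needed for that purpose.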

unor