robots.txt file is probably invalid

Question

This is my robots.txt. I want to allow only the base URL domain.com to be indexed and disallow all sub-URLs such as domain.com/foo and domain.com/bar.html.

User-agent: *
Disallow: /*/

Because I am not sure whether this is valid syntax, I tested it using Google Webmaster Tools. It shows me this message:

robots.txt file is probably invalid.

Is my file valid? Is there a better way to allow only the base URL for indexing?

Update: Google downloaded my robots.txt 4 hours ago. I think that's why it doesn't work yet. I will wait some time, and if the problem persists I will update my question again.

I read this: http://stackoverflow.com/questions/5206602/robots-txt-how-to-allow-access-only-to-domain-root-and-no-deeper but did not understand the answer. — danijar, Apr 26 '12 at 19:55
Here's another similar question that might help: http://stackoverflow.com/q/43427/669611 — magzalez, Apr 26 '12 at 20:40

magzalez · Accepted Answer · 2012-04-26T20:55:01.617


Here is a link to a validator. It might help you work through any errors in the file.

Robots.txt Checker

I checked on another validator, robots.txt Checker, and this is what I got for the second line:

Wildcard characters (like "*") are not allowed here.
The line below must be an allow, disallow, comment or a blank line statement.
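For crawlers that do honor wildcards (Google documents `*` and `$` as extensions), it may help to see what `Disallow: /*/` would actually match. The following is only a minimal sketch of that style of matching, not any crawler's real implementation; `pattern_to_regex` and `matches` are hypothetical helper names introduced here for illustration.

```python
import re


def pattern_to_regex(pattern: str) -> "re.Pattern[str]":
    """Translate a wildcard robots.txt path pattern into a regex.

    Hypothetical helper: '*' matches any run of characters, a trailing
    '$' anchors the end of the path, and everything else is literal.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape the literal pieces and join them with '.*' for each '*'.
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))


def matches(pattern: str, path: str) -> bool:
    """Return True if the rule pattern matches from the start of the path."""
    return pattern_to_regex(pattern).match(path) is not None


# Paths based on the examples in the question.
for path in ("/", "/foo", "/bar.html", "/foo/", "/foo/bar.html"):
    print(path, matches("/*/", path))
```

Under this reading, `/*/` only matches paths that contain a second slash, such as `/foo/` or `/foo/bar.html`. Top-level pages like `/foo` and `/bar.html` would not be matched, so the rule would not block them even where wildcards are supported.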

This might be what you're looking for:

User-Agent: *
Allow: /index.html
Disallow: /

This assumes your homepage is index.html.

If index.php is your homepage, you should be able to swap out index.html for index.php.

User-Agent: *
Allow: /index.php
Disallow: /

On my dynamic websites that run through index.php, going to mydomain.com/index.php still takes me to the homepage, so the above should work.
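If you want to check rules like these locally before waiting for Google to re-fetch the file, one option is Python's standard-library `urllib.robotparser`. This is a rough sketch under some assumptions: `domain.com` is the placeholder host from the question, `build_parser` is a hypothetical helper, and this parser uses simple prefix matching with the first matching rule winning, so its verdicts are not guaranteed to match Googlebot's.

```python
from urllib.robotparser import RobotFileParser

# The rules from the answer above (index.php variant).
RULES = """\
User-Agent: *
Allow: /index.php
Disallow: /
"""


def build_parser(rules: str) -> RobotFileParser:
    """Hypothetical helper: parse robots.txt text without fetching anything."""
    parser = RobotFileParser()
    parser.parse(rules.splitlines())
    return parser


parser = build_parser(RULES)

# domain.com is the placeholder host used in the question.
for url in (
    "http://domain.com/index.php",
    "http://domain.com/",
    "http://domain.com/foo",
    "http://domain.com/bar.html",
):
    print(url, parser.can_fetch("*", url))
```

With this parser, only `/index.php` is reported as fetchable. The bare root `/` falls under `Disallow: /`, so if your homepage is served at `domain.com/` rather than `domain.com/index.php`, it is worth confirming how the crawler you care about treats the root URL.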

edited Apr 26 '12 at 20:55

answered Apr 26 '12 at 20:16

magzalez

index.html isn't my homepage because all requests lead to index.php, which manages the content and layout. So I need to allow the base domain only. – danijar Apr 26 '12 at 20:46
Does going to yourdomain.com/index.php take you to the homepage? – magzalez Apr 26 '12 at 20:52
Yes, because .htaccess redirects it to yourdomain.com. – danijar Apr 26 '12 at 21:02
