Let's say I have a site at www.example.com and I decide I want a French version of the same site at the URL www.example.com/fr.

At first, though, I want to be the only one able to see www.example.com/fr and anything within it (I'd like to block both "regular" visitors and any bots).

Can I block everyone except my IP from just that folder/section? If so, is it done via htaccess, robots.txt, a combination of both, or some other way?

I know for visitors, I can add this to my htaccess:

order deny,allow
deny from all
allow from (my ip address)

But can I tweak that to say everyone can go to everything EXCEPT the "fr" folder?

And I know that for bots (e.g. Google), this robots.txt file would be used at the root of my main site if I wanted to keep bots from visiting:

User-agent: *
Disallow: /

So do I create another robots.txt in the "fr" folder with that in it? Or would it have to be done via the original robots.txt file in the main site root?

user3304303
  • Just put a .htaccess file with your "order deny,allow" section in the folder you want to protect. You can have different htaccess files with different rules in each folder. – M. Eriksson Dec 04 '17 at 21:26
  • Use your IP address and user agent. – Javid Karimov Dec 04 '17 at 21:31
  • You can only have one `robots.txt` file, and it needs to be in the top folder. All you need to do there is add an extra row with `Disallow: /fr`. That's not needed, however, if you have an .htaccess in the /fr folder, since no one except your IP would be able to read that file anyway. – M. Eriksson Dec 04 '17 at 21:36
  • Makes total sense, and I didn't realize you could have multiple htaccess, so thank you! Add an answer with this info and I'll mark as best if you'd like. – user3304303 Dec 04 '17 at 21:48
  • Added an answer. – M. Eriksson Dec 04 '17 at 21:56
  • The robots.txt file is a polite notice which is only respected by well behaved bots. mod_authz_host (the component in Apache addressed by your example htaccess file) policy is **enforced** by your server. – symcbean Dec 04 '17 at 22:00

1 Answer

You can have different .htaccess files in each folder, so just put a .htaccess in the /fr folder with the content:

order deny,allow
deny from all
allow from (your ip address)
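
The Order/Deny/Allow directives above are the Apache 2.2 syntax; on Apache 2.4 they still work only through the compatibility module mod_access_compat. As a minimal sketch, assuming the server runs Apache 2.4 and permits this directive in .htaccess files (AllowOverride), the equivalent /fr/.htaccess could look like the following. The address 203.0.113.10 is a placeholder from the documentation range, not a real value from the question.

```apache
# /fr/.htaccess on Apache 2.4+ (mod_authz_core / mod_authz_host)
# Only the listed address may access this folder and everything below it;
# every other client receives 403 Forbidden.
# 203.0.113.10 is a placeholder: replace it with your own public IP address.
Require ip 203.0.113.10
```

It is best to use one syntax or the other in a given file rather than mixing the old and new directives.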

Regarding robots.txt, you can only have one, and it needs to be in the web root (/). However, if you want to ask robots not to read a specific folder, all you need to do is add a new row:

Disallow: /fr

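robots.txt can contain many Disallow rows for different files/folders. Rules are matched as path prefixes, so /fr also covers paths such as /france; use /fr/ if you only want to match the folder.
Just remember, robots don't have to respect your robots.txt file. It's not a safe way to hide folders.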

Note: If you have the above .htaccess in the /fr folder, you don't need to add it to robots.txt as well, since robots won't be able to read that folder anyway.

M. Eriksson