5

I know that the robots.txt file is used to keep third-party web crawlers from indexing a site's content.

However, if the goal of this file is to delimit or protect a private area of the site, what is the sense of trying to hide the content with robots.txt when everything can be seen in the GitHub repository?

My question also extends to examples that use a custom domain.

Is there any motivation to use a robots.txt file on GitHub Pages? Yes or no? And why?

Alternative 1
For the content to be effectively hidden, one would need to pay for the website, that is, get a private repository.

unor
jonathasborges1
  • The motivation is the same as for any other web site: prevent robots from crawling a part of it. It has nothing to do with it being private or inaccessible: if it was private or inaccessible, robots would have no way to access it anyway. – JB Nizet Dec 22 '17 at 08:03

1 Answer

3

The intention of robots.txt is not to delimit private areas, because robots don't have access to those anyway. Instead, it's for when you have some garbage or miscellaneous content that you don't want to be indexed by search engines.

For example: I write Flash games for entertainment, and I use GitHub Pages to let the games check for updates. I have this file hosted on my GHP, whose entire content is

10579
2.2.3
https://github.com/iBug/SpaceRider/tree/master/SpaceRider%202

It contains three pieces of information: the internal number of the new version, the display name of the new version, and a download link. It is therefore useless when indexed by crawlers, so when I have a robots.txt, that's the kind of stuff I'd keep away from being indexed.
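A minimal robots.txt for that scenario could look like the following (the update-file path here is only an assumed example for illustration, not the actual location used above):

User-agent: *
Disallow: /SpaceRider/update.txt

This tells every crawler to skip that one file while leaving the rest of the site crawlable.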

iBug
  • So the robots.txt file serves only to hide the garbage in my repository? It does not serve to protect a restricted area of the site? – jonathasborges1 Dec 22 '17 at 08:21
  • 3
    @JonathasB.C. Even without `robots.txt`, crawlers **don't have access to** restricted areas. It tells crawlers to ignore certain areas that they **have access to**. – iBug Dec 22 '17 at 08:31