27

I want to add nofollow and noindex to my site whilst it's being built. The client has requested that I use these rules.

I am aware of

<meta name="robots" content="noindex,nofollow">

But I only have access to the robots.txt file.

Does anyone know the correct format I can use to apply noindex, nofollow rules via the robots.txt file?

Uwe Keim
MeltingDog

4 Answers

48

noindex and nofollow mean that you do not want any search engines like Google to crawl your website.

So, simply put the following code into your robots.txt file:

User-agent: *
Disallow: /

It means noindex and nofollow.

Kate Orlova
Ankit Gujarati
  • 5
    Crawling and indexing are not the same. For example, Google Search might not crawl a page (because it’s disallowed in robots.txt), but still index it. – unor Aug 24 '17 at 15:38
  • 4
    How exactly can google index content if it is not allowed to read it? – SeaBiscuit Aug 25 '18 at 08:50
  • 3
    google could have previously indexed the page, or it grabs a direct link somewhere else – Gambai Sep 25 '18 at 20:59
  • 1
    this answer is not correct. see the comments on this [answer](https://stackoverflow.com/a/39495598/11395898) – Ali Shefaee Aug 05 '22 at 15:13
  • I have that and google search console said it still indexed the page and that I need to add "no index" instead of "block". – Mike Jun 19 '23 at 15:20
7

There is a non-standard Noindex field, which Google (and likely no other consumer) supported as an experimental feature.

Following the robots.txt specification, you can't disallow indexing or the following of links with robots.txt.

For a site that is still in development, has not been indexed yet, and doesn’t get backlinks from pages which may be crawled, using robots.txt should be sufficient:

# no bot may crawl 
User-agent: *
Disallow: /
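
If you want to sanity-check what these rules do, here is a minimal sketch using Python's standard `urllib.robotparser`. The `example.com` URL and the bot names are placeholders chosen for illustration, and `is_crawl_allowed` is a hypothetical helper. Note that it only answers "may this bot crawl this URL?"; it says nothing about whether a URL can end up in an index.

```python
from urllib.robotparser import RobotFileParser

# The rules from above, as they would appear in robots.txt.
ROBOTS_TXT = """\
# no bot may crawl
User-agent: *
Disallow: /
"""


def is_crawl_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt content lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # Placeholder bot names and URL, for illustration only.
    for agent in ("Googlebot", "Bingbot", "SomeOtherBot"):
        allowed = is_crawl_allowed(ROBOTS_TXT, agent, "https://example.com/any/page.html")
        print(agent, "may crawl:", allowed)
```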

If pages from the site are already indexed, and/or if other pages which may be crawled link to it, you have to use noindex, which can be specified not only in the HTML but also as an HTTP header:

X-Robots-Tag: noindex, nofollow
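
As a rough sketch of how that header could be added, assuming you can change the application or server (which the asker cannot), here is a small WSGI wrapper in Python. `hello_app` and `with_robots_header` are hypothetical names, and `hello_app` just stands in for the real site. Keep in mind that a crawler has to be allowed to fetch the page in order to see the header at all.

```python
# Header name and value as given above.
ROBOTS_HEADER = ("X-Robots-Tag", "noindex, nofollow")


def hello_app(environ, start_response):
    """Stand-in for the real site (hypothetical)."""
    start_response("200 OK", [("Content-Type", "text/plain; charset=utf-8")])
    return [b"site under construction\n"]


def with_robots_header(app):
    """Wrap a WSGI app so every response carries the X-Robots-Tag header."""

    def wrapped(environ, start_response):
        def patched_start_response(status, headers, exc_info=None):
            # Drop any existing X-Robots-Tag so the header is not sent twice.
            headers = [(k, v) for k, v in headers if k.lower() != "x-robots-tag"]
            headers.append(ROBOTS_HEADER)
            return start_response(status, headers, exc_info)

        return app(environ, patched_start_response)

    return wrapped


if __name__ == "__main__":
    # Call the wrapped app directly (no network) and show the headers it sends.
    def show(status, headers, exc_info=None):
        print(status)
        for name, value in headers:
            print(f"{name}: {value}")

    with_robots_header(hello_app)({}, show)
```

The wrapped app can be served with any WSGI server, for example `wsgiref.simple_server.make_server` from the standard library.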
unor
-2
  • Noindex tells search engines not to include a page in search results, but they can still follow its links (and the page can still pass on PA and DA).
  • Nofollow tells bots not to follow the links. We can also combine noindex with follow on pages that we don't want indexed but whose links we do want followed.
  • 3
    While these are facts about things mentioned in the question, you haven't answered the question at all. – Quentin Jul 27 '18 at 11:32
  • I apologize for the nit-pickiness of SO. This is just how it is; things are usually expected to be a precise way. – CubicInfinity Mar 30 '22 at 16:59
-4

I just read this thread, and thought to add an idea.

In case one wants to keep a site that is under construction or development from being viewable by unauthorized users, I think this idea is safe, although a bit of IT proficiency is required.

There is a "hosts" file on any operating system that works as a manual repository of DNS entries, overriding an online DNS server.

In Windows, it is under C:\Windows\System32\drivers\etc\hosts, and the Linux distros I know (Android, too) have it under /etc/hosts. Maybe it's the same on OS X.

The idea is to add an entry like

xxx.xxx.xxx.xxx anyDomain.tld

to that file. It is important that the domain is set up on your server/provider but not yet published to the DNS servers.

What happens: while the domain is created in the server, it will respond to calls on that domain, but no one else (no browsers) on the internet will know the IP address of your site, apart from the computers whose hosts file contains the entry above.
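
To illustrate the entry format, here is a small Python sketch that reads hosts-style text and looks up a name. The IP address `203.0.113.10` and the `staging.example.com` names are made-up placeholders, `parse_hosts` and `lookup` are hypothetical helpers, and keeping the first entry when a name appears twice is an assumption of this sketch, not a statement about any particular resolver.

```python
from typing import Optional

# Example hosts-file content; the address and names are placeholders.
HOSTS_TEXT = """\
# local overrides for the site under development
127.0.0.1      localhost
203.0.113.10   staging.example.com www.staging.example.com  # not in public DNS yet
"""


def parse_hosts(text: str) -> dict[str, str]:
    """Map each hostname in hosts-style text to its IP address."""
    mapping: dict[str, str] = {}
    for raw_line in text.splitlines():
        # Everything after '#' is a comment.
        line = raw_line.split("#", 1)[0].strip()
        if not line:
            continue
        ip, *names = line.split()
        for name in names:
            # Assumption: keep the first entry if a name is listed twice.
            mapping.setdefault(name.lower(), ip)
    return mapping


def lookup(text: str, hostname: str) -> Optional[str]:
    """Return the IP listed for hostname, or None if there is no entry."""
    return parse_hosts(text).get(hostname.lower())


if __name__ == "__main__":
    for name in ("staging.example.com", "unlisted.example.com"):
        print(name, "->", lookup(HOSTS_TEXT, name))
```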

In this situation, you can add the entry on the machine of anyone who is interested in seeing your site (and has your authorization), and no one else will be able to see your site. No crawler will see it until you publish the DNS online.

I even use it for a private file server that my family share.

Here you can find a thorough explanation on how to edit the hosts file: https://www.howtogeek.com/howto/27350/beginner-geek-how-to-edit-your-hosts-file/

  • This is offtopic and does not answer the question. Sorry :) You might edit your answer. PS: Welcome! – s1x Jan 23 '20 at 20:34