2

I have an API at api.website.com that requires no authentication.

I am looking for a way to prevent Google from indexing my API.

Is there a way to do so?

I already have a Disallow rule in my robots.txt at api.website.com/robots.txt, but that only prevents Google from crawling the API, not from indexing it:

User-agent: *
Disallow: /
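
As a quick sanity check of what this rule does, here is a minimal sketch using Python's standard-library `urllib.robotparser`. The helper name `can_crawl` and the path `/v1/items` are made up for illustration. It only confirms that compliant crawlers are told not to fetch anything; it says nothing about whether a URL can still end up in the index.

```python
from urllib.robotparser import RobotFileParser

# The robots.txt content from the question.
ROBOTS_TXT = """\
User-agent: *
Disallow: /
"""


def can_crawl(robots_txt, user_agent, url):
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# /v1/items is a hypothetical endpoint, not one from the question.
allowed = can_crawl(ROBOTS_TXT, "Googlebot", "https://api.website.com/v1/items")
print(allowed)  # prints False: crawling is blocked for every user agent
```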

The usual approach would be to remove the Disallow rule and add a noindex meta tag, but this is an API, so there are no HTML pages to put a meta tag in.

Is there any other way to do that?

mauxtin
    You might have a better chance of getting help on https://webmasters.stackexchange.com/. See e.g. [How to prevent a PDF file from being indexed by search engines?](https://webmasters.stackexchange.com/a/49124) or [Why do Google search results include pages disallowed in robots.txt?](https://webmasters.stackexchange.com/a/24571). Though maybe see [htaccess header for specific domain?](https://stackoverflow.com/q/49250244/3744182) on stack overflow. – dbc Apr 02 '19 at 00:17

2 Answers

3

It seems there is a way to apply noindex to API responses: the X-Robots-Tag HTTP header.

See this answer, quoted below: https://webmasters.stackexchange.com/questions/24569/why-do-google-search-results-include-pages-disallowed-in-robots-txt/24571#24571

The solution recommended on both of those pages is to add a noindex meta tag to the pages you don't want indexed. (The X-Robots-Tag HTTP header should also work for non-HTML pages. I'm not sure if it works on redirects, though.) Paradoxically, this means that you have to allow Googlebot to crawl those pages (either by removing them from robots.txt entirely, or by adding a separate, more permissive set of rules for Googlebot), since otherwise it can't see the meta tag in the first place.
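
To make the X-Robots-Tag idea concrete, here is a minimal sketch of a WSGI middleware that adds the header to every response. It assumes the API is a Python WSGI application, which the question does not state; `api_app` is a hypothetical stand-in for the real API and `add_noindex` is a name chosen for illustration. The same header can usually be set in the web server or framework configuration instead.

```python
def api_app(environ, start_response):
    """Hypothetical stand-in for the real API: returns a small JSON body."""
    body = b'{"status": "ok"}'
    start_response("200 OK", [
        ("Content-Type", "application/json"),
        ("Content-Length", str(len(body))),
    ])
    return [body]


def add_noindex(app):
    """Wrap a WSGI app so every response carries X-Robots-Tag: noindex."""
    def wrapped(environ, start_response):
        def start_with_header(status, headers, exc_info=None):
            # Drop any existing X-Robots-Tag so the header is not duplicated.
            headers = [(k, v) for k, v in headers
                       if k.lower() != "x-robots-tag"]
            headers.append(("X-Robots-Tag", "noindex"))
            return start_response(status, headers, exc_info)
        return app(environ, start_with_header)
    return wrapped


# Serve this object with any WSGI server instead of the bare app.
application = add_noindex(api_app)
```

As the quoted answer points out, the header only helps if Googlebot is allowed to fetch the responses, so the blanket Disallow in robots.txt would have to be relaxed for it to be seen.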

mauxtin
0

It is strange that Google is ignoring your /robots.txt file. Try placing an index.html file in the root web directory and adding the following inside the <head>...</head> section of that page.

<meta name="robots" content="noindex, nofollow">
Matt Makris
    I don’t think OP is saying that Google ignores the robots.txt. OP is saying that robots.txt prevents crawling, but it doesn’t prevent indexing (which is correct). – unor Apr 03 '19 at 06:13