7

I am sharing the link below on social media. The image shows up when the link is shared on Facebook, but not on Twitter.

For Twitter, the card validator reports this warning:

WARN: The image URL https://scontent.xx.fbcdn.net/v/t31.0-8/19388529_1922333018037676_3741053750453855177_o.jpg?_nc_cat=0&oh=ba7394f2a6af68cb4b78961759a154f1&oe=5B6BC349 specified by the 'twitter:image' metatag may be restricted by the site's robots.txt file, which will prevent Twitter from fetching it.

I don't know what is causing this. Here is my robots.txt:

User-agent: *
Disallow: /translations
Disallow: /manage
Disallow: /ecommerce

Here is the link to replicate the issue: https://invoker.pvdemo.com/album?album=1422199821384334&name=gallery

unor
ujwal dhakal

2 Answers

5

Your robots.txt is only relevant for your URLs. For an image hosted at https://scontent.xx.fbcdn.net/, the relevant robots.txt is https://scontent.xx.fbcdn.net/robots.txt.

Currently, this robots.txt blocks everything:

User-agent: *
Disallow: /

As documented under URL Crawling & Caching, Twitter’s crawler (Twitterbot) respects the robots.txt:

If an image URL is blocked, no thumbnail or photo will be shown.
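To make the per-host point concrete, here is a minimal Python sketch that evaluates the two robots.txt files quoted in this thread with the standard library's urllib.robotparser. The helper name is_allowed, the constant names, and the image path example.jpg are illustrative assumptions, and the fbcdn rules are only the snapshot quoted above, which may have changed since. Python's parser is also not necessarily identical to how Twitterbot interprets the file.

```python
from urllib.robotparser import RobotFileParser

# robots.txt of the asker's own host, as quoted in the question.
SITE_ROBOTS = """\
User-agent: *
Disallow: /translations
Disallow: /manage
Disallow: /ecommerce
"""

# robots.txt of the image host, as quoted in this answer (a snapshot).
FBCDN_ROBOTS = """\
User-agent: *
Disallow: /
"""

# The page URL is from the question; the image path is a placeholder.
PAGE_URL = "https://invoker.pvdemo.com/album?album=1422199821384334&name=gallery"
IMAGE_URL = "https://scontent.xx.fbcdn.net/v/t31.0-8/example.jpg"


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt text lets user_agent fetch url."""
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network access is needed.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# The page is judged by the page host's rules, the image by the image host's.
print("page allowed: ", is_allowed(SITE_ROBOTS, "Twitterbot", PAGE_URL))
print("image allowed:", is_allowed(FBCDN_ROBOTS, "Twitterbot", IMAGE_URL))
```

With these inputs the page is allowed and the image is not, which matches the validator warning: the rule that matters is the one served by the host that serves the image.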

unor
  • But this one works: https://invoker.pvdemo.com/post?post=1422199391384377_1869549873315991 It also has the same robots.txt. – ujwal dhakal Apr 04 '18 at 05:04
  • @ujwaldhakal: The example from your question seems to work (now), too, right? – unor Apr 04 '18 at 05:07
  • But what is really going on behind the scenes? I am really confused. I don't know whether I am doing it right or not. – ujwal dhakal Apr 04 '18 at 05:09
  • @ujwaldhakal: No idea. Now suddenly the validator doesn’t work at all for those two pages ("Fetching the page failed because other errors"). Anyway, as Twitter’s documentation says they respect the robots.txt for the image, you might want to host them on a different host (that is allowed to be crawled), if you want to display the image in the Twitter Card. – unor Apr 04 '18 at 05:17
  • Thank you. I will report this issue to Twitter; I hope they will give me a valid reason. – ujwal dhakal Apr 04 '18 at 08:05
1

You can also configure your robots.txt with explicit rules for specific crawlers. An empty Disallow line allows that crawler to fetch everything:

User-agent: facebookexternalhit
Disallow:

User-agent: Twitterbot
Disallow:
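As a rough check of how such per-crawler groups behave, the following Python sketch combines the two groups above with the wildcard rules from the question and evaluates them with urllib.robotparser. Combining them into one file, the helper name allowed_for, the bot name SomeOtherBot, and the path /manage/settings are assumptions for illustration. Note that these rules only affect files served from your own host; they cannot unblock an image hosted elsewhere.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical combined robots.txt: the wildcard rules from the question
# plus the explicit crawler groups from this answer.
COMBINED_ROBOTS = """\
User-agent: *
Disallow: /translations
Disallow: /manage
Disallow: /ecommerce

User-agent: facebookexternalhit
Disallow:

User-agent: Twitterbot
Disallow:
"""


def allowed_for(user_agent: str, path: str) -> bool:
    """Return True if COMBINED_ROBOTS lets user_agent fetch the given path."""
    parser = RobotFileParser()
    parser.parse(COMBINED_ROBOTS.splitlines())
    return parser.can_fetch(user_agent, path)


# A crawler with its own group uses that group instead of the wildcard one.
for agent in ("Twitterbot", "facebookexternalhit/1.1", "SomeOtherBot"):
    print(agent, allowed_for(agent, "/manage/settings"))
```

In this sketch Twitterbot and facebookexternalhit may fetch /manage/settings because their groups have an empty Disallow, while any other crawler falls back to the wildcard group and is blocked there.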

Google has great docs about it here: https://developers.google.com/search/docs/advanced/robots/create-robots-txt

https://gist.github.com/peterdalle/302303fb67c2bb73a9a09df78c59ba1d

Marsellus