
I have a simple requests script that pulls from ESPN's RSS feed.

import requests
requests.get('https://www.espn.com/espn/rss/news')

It was working a few weeks ago, and now I'm getting a 403 error. It's definitely not a rate limit, since this is the first time I've run it in weeks. I've read that 403 means Forbidden and can indicate you need to log in, but if you simply put https://www.espn.com/espn/rss/news into a web browser, all the relevant info comes up. Any idea why requests suddenly can't grab it?
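
In case it helps with diagnosing this, here is how I'm inspecting the failing response in more detail (just a diagnostic sketch; the Server header and the start of the body usually indicate who is actually refusing the request):

import requests

res = requests.get('https://www.espn.com/espn/rss/news')
print(res.status_code)            # 403
print(res.headers.get('Server'))  # which server/CDN answered
print(res.text[:300])             # start of the error body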

kaysuez

1 Answer


It's because of your User-Agent header.
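
By default, requests identifies itself with a User-Agent like `python-requests/<version>`, and the site now appears to block that. You can check what requests sends out of the box:

import requests

# The default headers include "User-Agent: python-requests/<version>",
# which is what the server seems to be rejecting.
print(requests.utils.default_headers())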

This is how you can set a different User-Agent in requests:

import requests

res = requests.get('https://www.espn.com/espn/rss/news', headers={
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/116.0.0.0 Safari/537.36"
})
print(res)

Now it returns status code 200.
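
Once you get a 200, the body is ordinary RSS XML, so you can parse it with the standard library. A minimal sketch, assuming the feed is regular RSS 2.0 (the item/title element names come from the RSS format, not anything ESPN-specific):

import requests
import xml.etree.ElementTree as ET

headers = {
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/116.0.0.0 Safari/537.36"
}
res = requests.get('https://www.espn.com/espn/rss/news', headers=headers)
res.raise_for_status()  # fail loudly if the 403 ever comes back

root = ET.fromstring(res.content)
# RSS 2.0 nests entries under <channel><item>; iter() finds them at any depth
for item in root.iter('item'):
    print(item.findtext('title'))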

  • Looks like they put CloudFront in front of the API to block unwanted traffic. – ybl Aug 29 '23 at 03:45
  • Putting a reasonable, honest user agent is a basic sign of respect for the API you're using. If you're trying to use an API responsibly (and not doing something sketchy where you need to pretend to be someone else), then your user agent should be something that can identify *you*, like `kaysuez-app/0.1.0 (some.email.address@gmail.com)`, not just a fake Google Chrome string (see the sketch below). – Silvio Mayolo Aug 29 '23 at 03:52
  • @SilvioMayolo I don't see any reason why a public API like this would want to block the python-requests User-Agent; APIs are meant to be used from scripting languages, so I reckon they didn't mean to block requests from Python. Also, I'm not comfortable sharing my email with a public API, and I don't think anyone would be upset that I made up my User-Agent. If they cared about something like statistics on user devices, they would never have blocked the default python-requests User-Agent in the first place. – X-_-FARZA_ D-_-X Aug 29 '23 at 04:08
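
Following up on the identifying-User-Agent suggestion above: it can be set the same way as the browser string in the answer. The app name and contact address below are made-up placeholders, and, as the CloudFront comment suggests, a non-browser agent may still get blocked here, so treat this as a sketch of the suggested etiquette rather than a guaranteed fix:

import requests

# A User-Agent that identifies the client instead of impersonating a browser.
# The app name and contact address are made-up placeholders, not real values.
headers = {"User-Agent": "kaysuez-rss-reader/0.1.0 (contact: you@example.com)"}
res = requests.get('https://www.espn.com/espn/rss/news', headers=headers)
print(res.status_code)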