I'm trying to request the page https://health.usnews.com/best-hospitals/rankings/cancer with the Python (2.7) requests module, but I get a 403 response. The same request works fine on my local machine; it only fails on the server.

I requested the page by passing headers and cookies in the request, but still got a 403 response. I also tried a Session object, as suggested in "Python requests - 403 forbidden - despite setting `User-Agent` headers".

>>> requests.get('https://health.usnews.com/best-hospitals/rankings/cancer')
<Response [403]>
>>> requests.get('https://health.usnews.com/best-hospitals/rankings/cancer', headers=h)
<Response [403]>

How can we get the proper response from that page?

Thank you in advance!

silpa

1 Answer

A User-Agent header is needed when making the request:

import requests

url = 'https://health.usnews.com/best-hospitals/rankings/cancer'
headers = {'User-Agent':'Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:67.0) Gecko/20100101 Firefox/67.0'}

txt = requests.get(url, headers=headers).text
print(txt)

Prints:

<!doctype html>
<html class="no-js" lang="">
    <head>
... and so on.
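
If the request still returns 403 with the User-Agent set, it may help to look at what the server actually sent back. The following is only a sketch: `summarize` is a hypothetical helper, not part of requests, and it assumes the blocking layer identifies itself in the `Server` header or in the response body, which is not guaranteed.

```python
import requests

URL = 'https://health.usnews.com/best-hospitals/rankings/cancer'
HEADERS = {
    'User-Agent': 'Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:67.0) '
                  'Gecko/20100101 Firefox/67.0'
}


def summarize(response, body_chars=200):
    """Collect the parts of a response that may explain a 403.

    Hypothetical helper for illustration; works with any object that has
    status_code, headers and text attributes, such as requests.Response.
    """
    return {
        'status': response.status_code,
        # The Server header sometimes names the CDN or firewall that answered.
        'server': response.headers.get('Server'),
        # The first characters of the body often contain the block message.
        'body_start': response.text[:body_chars],
    }


def check(url=URL):
    """Fetch url with the headers above and return the summary dict."""
    return summarize(requests.get(url, headers=HEADERS, timeout=10))
```

Running `print(check())` on both the local machine and the server and comparing the two results could show whether the 403 depends on where the request comes from rather than on the headers.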
Andrej Kesely
  • Thank you, Andrej! Yes, I tried that as well. It works on my local machine, but when I try the same from a server it gives a 403 response. I forgot to mention this in the question and will update it. Any thoughts on this? – silpa Jul 10 '19 at 09:01
  • @silpa Are you using a proxy? Maybe your IP is blacklisted, so you may want to try a different IP. – Andrej Kesely Jul 10 '19 at 09:17
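
To test the blacklisted-IP idea from the last comment, the same request can be sent through a proxy using the `proxies` argument of `requests.get`. This is a minimal sketch, not a confirmed fix: `build_proxies` and `fetch_via_proxy` are hypothetical helper names, and the proxy address in the usage note is a placeholder to replace with a proxy you are allowed to use.

```python
import requests

URL = 'https://health.usnews.com/best-hospitals/rankings/cancer'
HEADERS = {
    'User-Agent': 'Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:67.0) '
                  'Gecko/20100101 Firefox/67.0'
}


def build_proxies(proxy_url):
    """Return the scheme-to-proxy dict that requests expects.

    Both http and https traffic are routed through the same proxy.
    """
    return {'http': proxy_url, 'https': proxy_url}


def fetch_via_proxy(url, proxy_url, timeout=10):
    """GET url through proxy_url, sending the browser-like User-Agent."""
    return requests.get(
        url,
        headers=HEADERS,
        proxies=build_proxies(proxy_url),
        timeout=timeout,
    )
```

For example, `fetch_via_proxy(URL, 'http://proxy.example.com:8080').status_code` (placeholder address) would show whether the page answers differently when the request leaves from another IP. If it returns 200 through the proxy but 403 directly, the server's IP is the likely cause.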