
I am running the code below with `requests` on Python 2.7 on OS X:

import requests
import sys

# website_variable holds the URL being fetched
html_response = requests.get(website_variable)
if html_response.status_code != 200:
    print "There was an error! Status code = " + str(html_response.status_code)
    print html_response.content
    sys.exit(1)

I get a 503 status code from the website I am trying to fetch, and the response body includes a message telling me to turn JavaScript on:

<noscript><h1 data-translate="turn_on_js" style="color:#bd2426;">Please turn JavaScript on and reload the page.</h1></noscript>

Am I going about gathering information from this website in the correct way? Do I need to turn on a setting or feature in the `requests` module?

Ian Wheeler
  • Have a look at [Web-scraping JavaScript page with Python](https://stackoverflow.com/questions/8049520/web-scraping-javascript-page-with-python) - you need more than just `requests` for JavaScript. – bastelflp Nov 12 '17 at 11:41
  • Thank you @bastelflp! It seems I instead need to use the dryscrape or selenium module to interact with JavaScript; from that thread it seems the `requests` module does not have this support. – Ian Wheeler Nov 13 '17 at 20:42
  • Try setting a different User-Agent header. Usually the website treats the request as a non-browser one by checking HTTP headers. The default user agent of `requests` is [`python-requests/2.25.0`](https://github.com/psf/requests/blob/589c4547338b592b1fb77c65663d8aa6fbb7e38b/requests/utils.py#L808-L826). – ilyazub Jan 19 '21 at 08:37
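For the User-Agent suggestion in the last comment, a minimal sketch with `requests`, assuming a placeholder URL and an example browser User-Agent string (any current value copied from a real browser would do):

import requests

website_variable = "https://example.com"  # placeholder URL, not the real site

# Example browser-like User-Agent string; an up-to-date value from a real browser works too
headers = {
    "User-Agent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_12) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/62.0.3202.94 Safari/537.36"
}

html_response = requests.get(website_variable, headers=headers)
print html_response.status_code

If the page genuinely requires JavaScript to render, `requests` alone cannot execute it; a minimal selenium sketch, assuming the `selenium` package and a matching browser driver (for example chromedriver) are installed:

from selenium import webdriver

driver = webdriver.Chrome()          # or webdriver.Firefox(), depending on the installed driver
driver.get(website_variable)         # the page's JavaScript runs inside the real browser
rendered_html = driver.page_source   # HTML after the scripts have executed
driver.quit()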

0 Answers