Requests.get not working with certain URLs

Question

I’m trying to make a GET request using Python's Requests library. I can run the quickstart example fine, but when I change the URL, the call does not return for a long time and then fails with a number of errors that reference lines deep in the Requests library code. I’ve been trying to google this, but it is beyond my beginner’s understanding. Is there some limitation on the syntax of URLs passed to requests.get()? Here is the code with a URL that is not working:

import requests

URL = 'https://www.landsofamerica.com/United-States/lakefront-property/'
r = requests.get(URL)
print(r.text)

"""
NOTE: This code is taken from https://requests.readthedocs.io/en/master/user/quickstart/#make-a-request

The example code in the docs *does* execute correctly with this example URL:
URL = 'https://api.github.com/events'

"""

The errors returned are quite long and I don’t know how to find the “most relevant parts” to ask for help, so I did not paste all of them here. Thanks.

score 1 · Answer 1 · answered Oct 18 '20 at 18:24

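That site is probably blocking requests that do not look like they come from a browser. By default, Requests identifies itself with a User-Agent of the form python-requests/x.y.z, which some sites refuse or simply leave hanging.

Pass a headers dictionary with a browser-like User-Agent to mimic a browser.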

import requests

URL = 'https://www.landsofamerica.com/United-States/lakefront-property'
headers = {'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/39.0.2171.95 Safari/537.36'}

r = requests.get(URL, headers=headers)
print(r.text)

Additional information: How to use Python requests to fake a browser visit?
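
Since the original call hung for a long time before failing, it may also help to set a timeout and catch the exception, so that the failure shows up as one short line instead of a long traceback. The following is only a sketch: the helper name fetch_text, the 10-second default and the optional session argument are my own assumptions and not part of Requests; requests.get, the timeout parameter, raise_for_status() and requests.exceptions.RequestException are standard Requests features.

```python
import requests

# Example browser-like User-Agent (same value as in the answer above).
BROWSER_HEADERS = {
    'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_1) '
                  'AppleWebKit/537.36 (KHTML, like Gecko) '
                  'Chrome/39.0.2171.95 Safari/537.36'
}


def fetch_text(url, timeout=10, session=None):
    """Hypothetical helper: return the response body, or None on failure."""
    # Use the given session if there is one, otherwise the requests module itself.
    http = session or requests
    try:
        # Without a timeout, Requests can wait indefinitely for a server
        # that accepts the connection but never answers.
        r = http.get(url, headers=BROWSER_HEADERS, timeout=timeout)
        # Turn HTTP error statuses (4xx/5xx) into exceptions.
        r.raise_for_status()
    except requests.exceptions.RequestException as exc:
        # Base class for Requests errors (timeouts, connection errors, HTTP errors).
        print(f"Request failed: {type(exc).__name__}: {exc}")
        return None
    return r.text


# Example call (performs a real network request):
# print(fetch_text('https://www.landsofamerica.com/United-States/lakefront-property/'))
```

If the site still refuses the request, the printed exception type (for example a timeout or a connection error) is usually the "most relevant part" to include when asking for help.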
