Unable to download a specific webpage in Python 3.x with requests

Question

The following code works for several other URLs but not for one specific URL, and I am not sure why or how to work around it. For money.usnews.com the request hangs, but for every other URL I tried, such as usatoday.com, it works.

import requests
from bs4 import BeautifulSoup

url = 'https://money.usnews.com'  # does NOT work for this URL but works for 'https://www.usatoday.com'
result = requests.get(url)
src = result.content
soup = BeautifulSoup(src, 'html.parser')
print(soup.prettify())

score 0 · Answer 1 · answered Feb 13 '21 at 03:54

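The website is most likely blocking the scraper: it accepts the connection but never sends a response, and because requests applies no timeout by default, the call waits indefinitely. Adding a timeout confirms this: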

result = requests.get('https://money.usnews.com', timeout=15)

The request then fails with:

requests.exceptions.ReadTimeout: HTTPSConnectionPool(host='money.usnews.com', port=443): Read timed out. (read timeout=15)
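
One thing worth trying is a browser-like User-Agent header together with the timeout, since some sites stall or reject requests that identify themselves as python-requests. This is only a sketch: the header values are example strings, fetch_html is a hypothetical helper name, and there is no guarantee that this particular site accepts the request even with these headers.

```python
import requests

# Assumption: the site stalls requests that carry the default
# python-requests User-Agent. The values below are only examples.
BROWSER_HEADERS = {
    "User-Agent": (
        "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 "
        "(KHTML, like Gecko) Chrome/120.0 Safari/537.36"
    ),
    "Accept-Language": "en-US,en;q=0.9",
}


def fetch_html(url, timeout=15, session=None):
    """Return the page HTML, or None if the request times out or fails."""
    if session is None:
        session = requests.Session()
    try:
        # Always pass a timeout so a silent server cannot hang the script.
        response = session.get(url, headers=BROWSER_HEADERS, timeout=timeout)
        response.raise_for_status()  # turn 4xx/5xx responses into exceptions
    except requests.exceptions.Timeout:
        print(f"Timed out after {timeout}s: {url}")
        return None
    except requests.exceptions.RequestException as exc:
        print(f"Request failed: {exc}")
        return None
    return response.text
```

Calling fetch_html('https://money.usnews.com') returns the HTML as a string, which can be passed to BeautifulSoup(html, 'html.parser') as in the question, or None if the site still does not answer. If it still times out, the site may be using bot detection that headers alone do not get past.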
