I can't get an HTML page with requests

Question

I would like to fetch an HTML page and read its content. I use requests (Python), and my code is very simple:

import requests
url = "http://www.romatoday.it"
r = requests.get(url)
print(r.text)

When I run this, I always get: Connection aborted.', error(110, 'Connection timed out') If I open the URL in a browser, everything works fine.

If I use requests with other URLs, everything is fine.

I think it is something particular to "http://www.romatoday.it", but I don't understand what the problem is. Can you help me, please?

Thanks @Abdulafaj. I'm not familiar with this kind of problem. Can you explain? Thanks again — RoverDar, Sep 06 '16 at 10:03
The problem isn't the comma (that was my editing mistake). The URL without the comma doesn't work either — RoverDar, Sep 06 '16 at 10:11
Can you try traceroute or pathping (if you're on Windows) to the URL? — Simon Hibbs, Sep 06 '16 at 10:14
It's also possible the web server is blocking requests based on the User-Agent header, which identifies the client application. Here's how to spoof it: http://stackoverflow.com/questions/10606133/sending-user-agent-using-requests-library-in-python — Simon Hibbs, Sep 06 '16 at 10:20
Here's another directly relevant Q/A on this issue that might be of help. http://stackoverflow.com/questions/27422956/python-requests-library-sometimes-fails-to-open-site-that-a-browser-can-open — Simon Hibbs, Sep 06 '16 at 10:24
I think I have tried to get the URL 15 or 20 times, and every time I get Connection aborted.', error(110, 'Connection timed out') — RoverDar, Sep 06 '16 at 10:30
Are you sleeping between requests? Also are you using a session or creating a new connection for each request? — Padraic Cunningham, Sep 06 '16 at 10:30
"Are you sleeping between requests?" Sorry, I don't understand. — RoverDar, Sep 06 '16 at 10:31
I have now upgraded requests (to 2.11.1), but I still have the problem. — RoverDar, Sep 06 '16 at 10:49
Yes, I'm in a Django environment. With the new requests version I get: HTTPConnectionPool(host='www.romatoday.it', port=80): Max retries exceeded with url: /eventi/ (Caused by NewConnectionError(': Failed to establish a new connection: [Errno 110] Connection timed out',)) — RoverDar, Sep 06 '16 at 11:01
Could it be a header settings problem? These are the headers in Chrome: Server: BlackStone Content-Type: text/html; charset=utf-8 Transfer-Encoding: chunked Connection: close Vary: Accept-Encoding X-Powered-By: DYNAMIC+ BlackStone (build: 40626; date: Sat, 06 Aug 2016 15:14:02 +0200; server: cn03-www1) Vary: Cookie ETag: W/"jTgH1uatCeiJCmWovJqQU5" Date: Tue, 06 Sep 2016 10:22:03 GMT Expires: Tue, 06 Sep 2016 10:40:37 GMT Cache-Control: public, max-age=1114, post-check=1114, pre-check=1114 X-XSS-Protection: 1 Content-Encoding: gzip Set-Cookie: __bs=cn03-www1|V86Yz|V86Yz; path=/; HttpOnly — RoverDar, Sep 06 '16 at 11:05
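
The suggestions in the comments above (a browser-like User-Agent, one shared session, and a pause between attempts) could be sketched roughly as follows. This is only an illustration, not a confirmed fix: the names `make_session`, `fetch` and `BROWSER_UA`, the User-Agent string, and the retry count, delay and timeout values are all assumptions made for the example.

```python
import time

import requests

# Hypothetical browser-like User-Agent string, chosen only for illustration.
BROWSER_UA = (
    "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/52.0 Safari/537.36"
)


def make_session(user_agent=BROWSER_UA):
    """Return a Session that sends the given User-Agent on every request."""
    session = requests.Session()
    session.headers.update({"User-Agent": user_agent})
    return session


def fetch(session, url, attempts=3, delay=2.0, timeout=10):
    """GET url and return its text, sleeping `delay` seconds between failed attempts."""
    last_error = None
    for attempt in range(attempts):
        try:
            response = session.get(url, timeout=timeout)
            response.raise_for_status()
            return response.text
        except requests.exceptions.RequestException as exc:
            last_error = exc
            # Pause before the next attempt, but not after the last one.
            if attempt < attempts - 1:
                time.sleep(delay)
    raise last_error
```

It would be called as `fetch(make_session(), "http://www.romatoday.it")`. Note that with `[Errno 110] Connection timed out` the connection is never established, so no headers are sent at all; a different User-Agent may therefore make no difference in this particular case.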

score 0 · Answer 1 · answered Sep 06 '16 at 10:02

Maybe the problem is that the comma here

>> url = "http://www.romatoday,it"

should be a dot

>> url = "http://www.romatoday.it"

I tried that and it worked for me

answered Sep 06 '16 at 10:02

Pani

Sorry, that was my mistake; I have edited it. The URL (without the comma) still doesn't work. Could it be a requests module version problem? – RoverDar Sep 06 '16 at 10:07

score -1 · Answer 2 · answered Sep 06 '16 at 10:28

Hmm. Have you tried packages other than 'requests'? The code below gives the same result as your code.

# Python 2 code: in Python 3, urlopen lives at urllib.request.urlopen.
import urllib

url = "http://www.romatoday.it"
r = urllib.urlopen(url)
print r.read()

(Screenshot captured after running your code.)

answered Sep 06 '16 at 10:28

Junsuk Park

requests works for me, so obviously requests is not the issue. Also, this would be better as a comment, as it does not answer the question. – Padraic Cunningham Sep 06 '16 at 10:29
With urllib the result is the same – RoverDar Sep 06 '16 at 10:50
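
Since requests and urllib both time out, one way to check whether the problem sits below the HTTP library is to test the plain TCP connection from the same machine. The sketch below is only a diagnostic idea under that assumption; the helper name `can_connect` and the 5-second timeout are made up for the example.

```python
import socket


def can_connect(host, port=80, timeout=5.0):
    """Return True if a TCP connection to (host, port) opens within `timeout` seconds."""
    try:
        conn = socket.create_connection((host, port), timeout=timeout)
    except OSError:
        # Covers timeouts, refused connections and DNS failures alike (Python 3).
        return False
    conn.close()
    return True
```

If `can_connect("www.romatoday.it")` returns False on the machine running the script while a browser on another machine can open the site, that would point to a network-level cause (routing, a firewall, or the server dropping connections from that address) rather than to requests itself.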
