pd.read_html works in Microsoft Edge but not in Chrome

Question

This simple code works in Microsoft Edge but not in Chrome (both using Jupyter):

import pandas as pd
url_Chelsea = "https://en.wikipedia.org/wiki/List_of_Chelsea_F.C._seasons"

df_Chelsea=pd.read_html(url_Chelsea)[2]
df_Chelsea

It fails with the following error message (only the end of the traceback is shown):

/opt/conda/lib/python3.6/site-packages/pandas/compat/__init__.py in raise_with_traceback(exc, traceback)
    338         if traceback == Ellipsis:
    339             _, _, traceback = sys.exc_info()
--> 340         raise exc.with_traceback(traceback)
    341 else:
    342     # this version of raise is a syntax error in Python 3

URLError: <urlopen error Tunnel connection failed: 403 Forbidden>

Have you tested with other links, such as other Wikipedia articles? Is it the same for all of them? – imxitiz, Aug 02 '21 at 01:46
Yes, I tried it for four London clubs (Chelsea, Tottenham, Arsenal and Fulham) and got the same problem, thanks – Alan, Aug 02 '21 at 01:50
Add this to your question: "I am getting this error message when implementing your answer @Xitiz: <>". You can [edit] it from here – imxitiz, Aug 02 '21 at 02:23
Just a few seconds. I am getting the message "It looks like your post is mostly code; please add some more details", so I cannot submit. Can I submit just the plain text? It will not be easy to read – Alan, Aug 02 '21 at 02:30

imxitiz · Accepted Answer · 2021-08-02T02:40:33.357

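Try this:

import io

import pandas as pd
import requests

url_Chelsea = "https://en.wikipedia.org/wiki/List_of_Chelsea_F.C._seasons"
# Replace the placeholder strings with your real proxy URLs
proxyDict = {
    'http': "add http proxy",
    'https': "add https proxy",
}

# Fetch the page through the proxy and keep the response
page = requests.get(url_Chelsea, proxies=proxyDict)

# Parse the downloaded HTML; [2] selects the third table on the page
df_Chelsea = pd.read_html(io.StringIO(page.text))[2]
print(df_Chelsea)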

For more information about proxies visit here
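
Since the traceback mentions a tunnel connection, it may help to check which proxy settings Python itself picks up before filling in proxyDict. The following is only a minimal sketch using the standard library; the helper names detected_proxies and to_requests_proxies are hypothetical and not part of pandas or requests.

```python
import urllib.request


def detected_proxies():
    """Return the proxy mapping urllib finds in the environment or system settings."""
    # getproxies() returns a dict such as {'https': 'http://host:port'}
    return urllib.request.getproxies()


def to_requests_proxies(detected):
    """Keep only the http/https entries, in the shape requests takes for proxies=."""
    return {
        scheme: proxy_url
        for scheme, proxy_url in detected.items()
        if scheme in ("http", "https")
    }


if __name__ == "__main__":
    # An empty dict means Python did not detect any proxy configuration
    print(to_requests_proxies(detected_proxies()))
```

If this prints proxy URLs, they can be used as the values of proxyDict. If it prints an empty dict while the 403 still appears, the proxy is probably configured somewhere this check does not see.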

edited Aug 02 '21 at 02:40

answered Aug 02 '21 at 02:15

Without more information I can't tell you exactly what will solve your problem, but try this and let me know. – imxitiz Aug 02 '21 at 02:15
Get a different error message. Last line: ProxyError: HTTPSConnectionPool(host='en.wikipedia.org', port=443): Max retries exceeded with url: /wiki/List_of_Chelsea_F.C._seasons (Caused by ProxyError('Cannot connect to proxy.', OSError('Tunnel connection failed: 403 Forbidden',))) – Alan Aug 02 '21 at 02:17
Complete error message is too long to put in a comment (600 characters max) – Alan Aug 02 '21 at 02:21
I have commented on the question; please follow that. :) – imxitiz Aug 02 '21 at 02:24
@Alan I hadn't noticed that you had edited the comment. You are using a proxy. Follow the last edit of the answer. – imxitiz Aug 02 '21 at 02:34
Running the last edit of the answer gives the same error message, which is now shown as added pictures in the edited post, please see – Alan Aug 02 '21 at 02:39
I edited it just a few seconds ago. I don't think you implemented that version. – imxitiz Aug 02 '21 at 02:40
I still get an error message, not the same. Last 3 lines: ProxyError: HTTPSConnectionPool(host='en.wikipedia.org', port=443): Max retries exceeded with url: /wiki/List_of_Chelsea_F.C._seasons (Caused by ProxyError('Cannot connect to proxy.', NewConnectionError(': Failed to establish a new connection: [Errno -2] Name or service not known',))) – Alan Aug 02 '21 at 02:46
I believe you replaced the `add http proxy` part with your actual proxy, right? – imxitiz Aug 02 '21 at 02:48
The proxyDict? Yes I did. For line df_Chelsea=pd.read_html(page)[2], shouldn't it be url instead of page? – Alan Aug 02 '21 at 02:50
You can, that is just the name. Is that error occurring in both browsers? – imxitiz Aug 02 '21 at 02:51
Ok, thanks for your time @Xitiz, I really appreciate it. I will just have to work with Edge when dealing with this pandas read_html function. Best – Alan Aug 02 '21 at 02:55
