
Here is the code to reproduce it:

import pandas as pd
url = 'https://info.gesundheitsministerium.gv.at/data/timeline-faelle-bundeslaender.csv'
df = pd.read_csv(url)

It fails with the following traceback:

URLError: <urlopen error [SSL: DH_KEY_TOO_SMALL] dh key too small (_ssl.c:1129)>

The URL itself is fine: the download works from the browser when the link is embedded in a markdown cell in Jupyter.

Any ideas to make this "just work"?

Update:

As suggested in the question linked by Florin C. below, this solves the issue when downloading via requests:

import requests
import urllib3
requests.packages.urllib3.util.ssl_.DEFAULT_CIPHERS = 'ALL:@SECLEVEL=1'

requests.get(url)

It would just be a matter of forcing Pandas to do the same, somehow.
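One way to get close to that without touching requests: the traceback is a URLError, so Pandas is going through urllib, and urllib accepts a custom SSL context. The following is only a sketch under assumptions. `make_relaxed_context` and `read_csv_relaxed` are hypothetical helper names, and it assumes the server's DH key is large enough to be accepted at OpenSSL security level 1. Certificate verification stays enabled.

```python
import io
import ssl
import urllib.request


def make_relaxed_context() -> ssl.SSLContext:
    """Default verifying context, but with OpenSSL security level 1."""
    ctx = ssl.create_default_context()
    # Assumption: level 1 is low enough for this server's DH parameters.
    ctx.set_ciphers("DEFAULT:@SECLEVEL=1")
    return ctx


def read_csv_relaxed(url: str, **read_csv_kwargs):
    """Hypothetical helper: download with the relaxed context, parse with pandas."""
    import pandas as pd  # local import keeps make_relaxed_context() usable on its own

    with urllib.request.urlopen(url, context=make_relaxed_context()) as resp:
        data = resp.read()
    return pd.read_csv(io.BytesIO(data), **read_csv_kwargs)
```

With the `url` from above this would be called as `df = read_csv_relaxed(url, sep=';')`. The relaxed context only applies to this one download, not to the whole process.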

My Environment:

Python implementation: CPython
Python version       : 3.9.7
IPython version      : 7.28.0

requests  : 2.25.1
seaborn   : 0.11.2
json      : 2.0.9
numpy     : 1.20.3
plotly    : 5.4.0
matplotlib: 3.5.0
lightgbm  : 3.3.1
pandas    : 1.3.4

Watermark: 2.2.0
fccoelho
  • Looks the same as this issue: https://stackoverflow.com/questions/38015537/python-requests-exceptions-sslerror-dh-key-too-small – Florin C. Dec 01 '21 at 09:09
  • That is for downloading with requests, the error is the same. But Pandas should be able to handle this transparently. – fccoelho Dec 01 '21 at 09:14
  • Worked for me in Jupyter running on Ubuntu 18. Would a workaround of downloading the csv first and then reading it with pandas be acceptable? Otherwise I'd suggest you check your OpenSSL version - apparently the website accepts weak DH keys, but newer OpenSSL does not. – Tõnis Piip Dec 01 '21 at 09:31
  • Could not reproduce the issue. – JacoSolari Dec 01 '21 at 09:38
  • It's funny because my browser can handle it normally (which is an easy workaround), and it should be using the same OS OpenSSL version as Pandas. So I think this is a Pandas issue about how it handles SSL. requests gives the same error. – fccoelho Dec 01 '21 at 09:47
  • If you're running this in a virtual environment, then it's also possible Python is referencing an older version of OpenSSL. Check the OS OpenSSL version and the one Python is using. – Tõnis Piip Dec 01 '21 at 09:53
  • I am not on a virtualenv – fccoelho Dec 01 '21 at 18:22
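For the OpenSSL version check suggested in the comments, a minimal sketch follows. The function names are made up for illustration, and the second one assumes an `openssl` binary may or may not be on the PATH. It prints the OpenSSL build that Python's `ssl` module is linked against next to the one the OS command line reports.

```python
import shutil
import ssl
import subprocess


def python_openssl_version() -> str:
    """OpenSSL version string of the library Python's ssl module uses."""
    return ssl.OPENSSL_VERSION


def system_openssl_version():
    """Output of `openssl version`, or None if no openssl binary is on the PATH."""
    exe = shutil.which("openssl")
    if exe is None:
        return None
    result = subprocess.run(
        [exe, "version"], capture_output=True, text=True, check=False
    )
    return result.stdout.strip() or None


print("Python links against:", python_openssl_version())
print("openssl on PATH     :", system_openssl_version())
```

If the two differ, Python (for example a conda build) is not using the OS library, which could explain why a browser on the same machine behaves differently.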

1 Answer


Here is a solution if you need to automate the process and don't want to have to download the csv first and then read from file.

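import io

import pandas as pd
import requests

url = 'https://info.gesundheitsministerium.gv.at/data/timeline-faelle-bundeslaender.csv'

# Lower OpenSSL's security level for requests so the server's small DH key is accepted
requests.packages.urllib3.util.ssl_.DEFAULT_CIPHERS = 'ALL:@SECLEVEL=1'

res = requests.get(url)
res.raise_for_status()
df = pd.read_csv(io.BytesIO(res.content), sep=';')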

Note that it may not be safe to change the default ciphers to SECLEVEL=1 at the OS level. This change is temporary, though: it only applies to the running Python process, so it should be OK.
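If changing the default for every request in the process is still too broad, the relaxed setting can be limited to one host by mounting a transport adapter on a session. This is a sketch, not a tested fix for this server. `RelaxedDHAdapter` and `make_relaxed_session` are hypothetical names, and it assumes a requests version that honours an `ssl_context` passed through `init_poolmanager`. It also does not depend on `DEFAULT_CIPHERS`, which, as far as I know, newer urllib3 releases no longer provide.

```python
import ssl

import requests
from requests.adapters import HTTPAdapter


class RelaxedDHAdapter(HTTPAdapter):
    """Hypothetical adapter: OpenSSL security level 1, certificates still verified."""

    def init_poolmanager(self, *args, **kwargs):
        ctx = ssl.create_default_context()
        ctx.set_ciphers("DEFAULT:@SECLEVEL=1")
        # Hand the relaxed context to urllib3's pool manager.
        kwargs["ssl_context"] = ctx
        return super().init_poolmanager(*args, **kwargs)


def make_relaxed_session(prefix: str) -> requests.Session:
    """Session that uses the relaxed adapter only for URLs starting with `prefix`."""
    session = requests.Session()
    session.mount(prefix, RelaxedDHAdapter())
    return session
```

Usage would look like `res = make_relaxed_session('https://info.gesundheitsministerium.gv.at/').get(url)`, followed by the same `pd.read_csv(io.BytesIO(res.content), sep=';')` as above. Every other host keeps the default security level.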

fccoelho