I am trying to download a dataset file via a URL, but I keep getting this error:

 File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/urllib/request.py", line 1320, in do_open
    raise URLError(err)
urllib.error.URLError: <urlopen error [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed (_ssl.c:749)>

Process finished with exit code 1

I am not sure what I am doing wrong, since I am following an online tutorial. Here is the file I am running; everything works until the script reaches the download step.

print("download will complete at about 423 MB")


import sys

if sys.version_info[0] >= 3:
    from urllib.request import urlretrieve
else:
    from urllib import urlretrieve

url = "https://www.cs.cmu.edu/~./enron/enron_mail_20150507.tgz"
urlretrieve(url, filename="../enron_mail_20150507.tgz")
print("download complete!")


print()
print("unzipping Enron dataset (this may take a while)")
import tarfile
import os
os.chdir("..")
tfile = tarfile.open("enron_mail_20150507.tgz", "r:gz")
tfile.extractall(".")
Masnad Nihit

1 Answer

Simplest solution to the problem you're trying to solve: use HTTP instead of HTTPS.

url = "http://www.cs.cmu.edu/~./enron/enron_mail_20150507.tgz"

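If you're more concerned about making a secure connection, then the post mentioned by @zwer is all you need. Just keep in mind that urlretrieve accepts a context keyword argument in the same way as urlopen. Note that this holds for Python 2's urllib.urlretrieve (2.7.9 and later); the Python 3 urllib.request.urlretrieve signature is (url, filename, reporthook, data) with no context parameter, so on Python 3 pass the context to urlopen instead.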
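
For Python 3, a minimal sketch of the secure route might look like the following. It is an illustration rather than the tutorial's own code: the `download` helper and its `cafile` parameter are hypothetical names introduced here, and it assumes the failure comes from Python not finding a usable CA bundle. It builds a verifying SSL context and hands it to urlopen, then streams the response to disk.

```python
import shutil
import ssl
from urllib.request import urlopen


def download(url, dest, cafile=None):
    """Stream `url` to the file `dest` with certificate verification on.

    `cafile` (hypothetical parameter) is an optional path to a PEM bundle
    of CA certificates; when it is None the default trust store is used.
    """
    # create_default_context() keeps hostname and certificate checks enabled.
    context = ssl.create_default_context(cafile=cafile)
    # urlopen accepts `context` as a keyword argument in Python 3.
    with urlopen(url, context=context) as response, open(dest, "wb") as out:
        # Copy in chunks so the 423 MB archive is never held in memory.
        shutil.copyfileobj(response, out)
    return dest


# Example call (commented out because it starts a large download):
# download("https://www.cs.cmu.edu/~./enron/enron_mail_20150507.tgz",
#          "../enron_mail_20150507.tgz")
```

If the default trust store is the problem, passing the path of a known-good CA bundle as `cafile` keeps verification enabled instead of switching it off.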

Marat