I have been trying to download the dataset from this page with a Python program.
The methods I have tried are requests and urllib.request.
I used a page as a reference for fixing the SSL error, but it didn't work.
My code is here:
import pandas as pd
import requests
import shutil
# 2017 School Quality Report
FileLink = 'https://data.cityofnewyork.us/api/views/cxrn-zyvb/files/35e2893e-75ed-4449-8e7e-d6360a3386a1?download=true&filename=2017_School_Quality_Report_DD.xlsx'
requests.packages.urllib3.disable_warnings()
response = requests.get(FileLink,verify='gd_bundle-g2-g1.crt', auth=('user', 'pass'),stream = True)
response.raw.decode_content = True
with open("2017_School_Quality_Report_DD.xlsx", 'wb') as f:
    shutil.copyfileobj(response.raw, f)
#import urllib.request
#urllib.request.urlretrieve(FileLink, '2017_School_Quality_Report_DD.xlsx')
# A DataFrame has no sheet_names attribute; pd.ExcelFile exposes the sheet list
data = pd.ExcelFile('2017_School_Quality_Report_DD.xlsx')
print(data.sheet_names)
Running it produces this error, which I don't know how to fix:
SSLError: HTTPSConnectionPool(host='data.cityofnewyork.us', port=443): Max retries exceeded with url: /api/views/cxrn-zyvb/files/35e2893e-75ed-4449-8e7e-d6360a3386a1?download=true&filename=2017_School_Quality_Report_DD.xlsx (Caused by SSLError(SSLError(1, '[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed (_ssl.c:777)'),))
Please let me know how I can solve this error, or show me how you would do this task. I am fairly new to Python. Thank you.
NOTE: I found a solution on this page that worked for me.
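For anyone hitting the same CERTIFICATE_VERIFY_FAILED error, here is a minimal sketch of one approach that may help. It is not necessarily the solution from the linked page. It assumes the failure comes from verifying against a hand-picked bundle (gd_bundle-g2-g1.crt) that does not contain the server's CA chain, so it verifies against Python's default trust store instead. The function name download_file and its opener parameter are hypothetical, added only so the function can be tried without a network connection. If the default trust store on your machine is itself incomplete, you would still need to pass a suitable PEM bundle as cafile.

```python
import shutil
import ssl
import urllib.request


def download_file(url, dest_path, cafile=None, opener=urllib.request.urlopen):
    """Stream url to dest_path with TLS certificate verification enabled.

    cafile: optional path to a PEM CA bundle; None uses the system defaults.
    opener: hypothetical injection point (defaults to urllib.request.urlopen).
    """
    # Build a verifying context; with cafile=None the default CA certs are loaded.
    context = ssl.create_default_context(cafile=cafile)
    with opener(url, context=context, timeout=30) as response:
        with open(dest_path, "wb") as f:
            # Copy in chunks so the whole file is never held in memory.
            shutil.copyfileobj(response, f)
    return dest_path
```

You could then call download_file(FileLink, '2017_School_Quality_Report_DD.xlsx') and open the result with pd.ExcelFile as before. Note that no credentials are sent here; the auth=('user', 'pass') argument in the question is left out on the assumption that the file is publicly downloadable.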