
I have been trying to download the dataset from this page with a Python program.

The methods I have tried are requests and urllib.request.

I used a page as a reference to solve the SSL error, but it didn't work.

Here is my code:

import pandas as pd
import requests
import shutil

# 2017 School Quality Report
FileLink = 'https://data.cityofnewyork.us/api/views/cxrnzyvb/files/35e2893e-75ed-4449-8e7e-d6360a3386a1?download=true&filename=2017_School_Quality_Report_DD.xlsx'

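# Suppress urllib3 warnings
requests.packages.urllib3.disable_warnings()

# verify= points at a CA bundle file; stream=True avoids loading the whole file into memory
response = requests.get(FileLink, verify='gd_bundle-g2-g1.crt', auth=('user', 'pass'), stream=True)
response.raw.decode_content = True
with open("2017_School_Quality_Report_DD.xlsx", 'wb') as f:
    shutil.copyfileobj(response.raw, f)

# Alternative attempt with urllib:
# import urllib.request
# urllib.request.urlretrieve(FileLink, '2017_School_Quality_Report_DD.xlsx')

# pd.ExcelFile exposes sheet_names; the DataFrame returned by pd.read_excel does not
data = pd.ExcelFile('2017_School_Quality_Report_DD.xlsx')
print(data.sheet_names)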

This produces the following error, which I don't know how to resolve:

SSLError: HTTPSConnectionPool(host='data.cityofnewyork.us', 
port=443): Max retries exceeded with url: /api/views/cxrn-
zyvb/files/35e2893e-75ed-4449-8e7e-d6360a3386a1?
download=true&filename=2017_School_Quality_Report_DD.xlsx (Caused by 
SSLError(SSLError(1, '[SSL: CERTIFICATE_VERIFY_FAILED] certificate 
verify failed (_ssl.c:777)'),))

Please let me know how I can solve this error, or show me how you would do this task. I am fairly new to Python. Thank you.
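
A `CERTIFICATE_VERIFY_FAILED` error generally means that Python could not chain the server's certificate to a certificate authority it trusts. One way to narrow this down is to check where the interpreter's OpenSSL looks for trusted CA certificates and whether that file exists. The following is a minimal sketch using only the standard library; `describe_ca_paths` is a hypothetical helper name chosen for illustration, not part of any library.

```python
import os
import ssl


def describe_ca_paths():
    """Report where this interpreter's OpenSSL looks for trusted CA certificates.

    Hypothetical helper for illustration only.
    """
    paths = ssl.get_default_verify_paths()
    return {
        # Resolved locations (None if the file or directory does not exist)
        "cafile": paths.cafile,
        "capath": paths.capath,
        # Compiled-in default locations, whether or not they exist
        "openssl_cafile": paths.openssl_cafile,
        "openssl_capath": paths.openssl_capath,
        "cafile_exists": os.path.exists(paths.openssl_cafile),
    }


if __name__ == "__main__":
    for key, value in describe_ca_paths().items():
        print(f"{key}: {value}")
```

If `cafile_exists` is false and `capath` is empty, the interpreter may have no CA bundle to verify against, which would be consistent with the error above.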

NOTE: I found a solution on this page that worked for me.

raffa
  • I think it does not work because you explicitly verify with your own certificate. Why do that if you disable warnings? I usually disable the warning like this: ```from requests.packages.urllib3.exceptions import InsecureRequestWarning; requests.packages.urllib3.disable_warnings(InsecureRequestWarning)``` – C. Yduqoli Jan 12 '18 at 01:57
  • @C.Yduqoli Hi, thanks for the reply! I have also tried `verify=False` (it didn't work for me). I had found this way of disabling the warning on the web too; I just tried your code and it still gives the error. I wonder whether the problem could be the link itself? – raffa Jan 12 '18 at 02:20
  • Do you have another url to test? For me, your url just yields a 404 error. ```>>> requests.get("https://data.cityofnewyork.us/api/views/cxrnzyvb/files/35e2893e-75ed-4449-8e7e-d6360a3386a1?download=true&filename=2017_School_Quality_Report_DD.xlsx") ``` – C. Yduqoli Jan 12 '18 at 02:33
  • @C.Yduqoli try this link? https://community.tableau.com/servlet/JiveServlet/downloadBody/1236-102-2-15278/Sample%20-%20Superstore.xls – raffa Jan 12 '18 at 03:09
  • @C.Yduqoli I just got it to work with `urlretrieve`. The solution turned out to be running `Install Certificates.command` in `Applications/Python 3.6`. I'm not sure about the `requests.get` part yet; it somehow retrieves a broken file. – raffa Jan 12 '18 at 03:19
  • I have no problem downloading this link using requests.get(). There must be something wrong with your setup. – C. Yduqoli Jan 12 '18 at 08:23
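
The comments above mention both a working `urlretrieve` download and a "broken file" from `requests.get`. As a rough sketch of how one might download with certificate verification left on and then check that the saved file is really a spreadsheet rather than, say, an HTML error page, consider the following. It uses only the standard library; `download` and `looks_like_excel` are hypothetical helper names, and the optional `cafile` argument is an assumption for cases where a specific CA bundle is needed. The signature check is only a heuristic: `.xlsx` files are ZIP archives, so any ZIP file would pass it.

```python
import shutil
import ssl
import urllib.request

# File signatures: .xlsx is a ZIP archive, legacy .xls is an OLE2 compound file
XLSX_MAGIC = b"PK\x03\x04"
XLS_MAGIC = b"\xd0\xcf\x11\xe0\xa1\xb1\x1a\xe1"


def download(url, dest, cafile=None):
    """Stream url to dest with certificate verification enabled.

    cafile optionally names a CA bundle file; by default the system's
    trusted certificates are used. urlopen raises urllib.error.HTTPError
    for responses such as 404, so an error page is not silently saved.
    """
    context = ssl.create_default_context(cafile=cafile)
    with urllib.request.urlopen(url, context=context) as response, \
            open(dest, "wb") as f:
        shutil.copyfileobj(response, f)
    return dest


def looks_like_excel(path):
    """Heuristic: does the file start with an .xlsx or .xls signature?"""
    with open(path, "rb") as f:
        header = f.read(8)
    return header.startswith(XLSX_MAGIC) or header.startswith(XLS_MAGIC)
```

A call such as `download(FileLink, "2017_School_Quality_Report_DD.xlsx")` followed by `looks_like_excel("2017_School_Quality_Report_DD.xlsx")` would indicate whether the saved file is worth passing to pandas at all.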
