I'm trying to extract data from an online PDF file. I tried to apply the code below to this URL, but I got a urlopen error. I also noticed that the URL has no .pdf extension. Any suggestions?
Error
Traceback (most recent call last):
  File "C:/Users/Danial/Desktop/pdf.py", line 7, in <module>
    op = urllib2.urlopen(Request(url)).read()
  File "C:\Python27\lib\urllib2.py", line 154, in urlopen
    return opener.open(url, data, timeout)
  File "C:\Python27\lib\urllib2.py", line 431, in open
    response = self._open(req, data)
  File "C:\Python27\lib\urllib2.py", line 449, in _open
    '_open', req)
  File "C:\Python27\lib\urllib2.py", line 409, in _call_chain
    result = func(*args)
  File "C:\Python27\lib\urllib2.py", line 1240, in https_open
    context=self._context)
  File "C:\Python27\lib\urllib2.py", line 1197, in do_open
    raise URLError(err)
URLError: <urlopen error [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed (_ssl.c:581)>
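The last line of the traceback points at certificate verification, not at the URL shape or the missing .pdf extension. Below is a minimal sketch of passing an explicit SSL context to urlopen, assuming the cause is that Python cannot find a CA bundle covering the server's certificate chain (that cause is a guess, not confirmed by the traceback). The helper names `build_context` and `fetch` and the `cafile` and `verify` parameters are hypothetical. It is written to run on Python 2.7.9+ and Python 3.

```python
import ssl

try:
    from urllib2 import urlopen, Request  # Python 2.7
except ImportError:
    from urllib.request import urlopen, Request  # Python 3


def build_context(cafile=None, verify=True):
    """Return an SSLContext suitable for passing to urlopen.

    cafile: optional path to a CA bundle file (hypothetical parameter);
            None means the system default certificates are used.
    verify: set to False to skip certificate checks (debugging only).
    """
    context = ssl.create_default_context(cafile=cafile)
    if not verify:
        # check_hostname has to be disabled before verify_mode is relaxed
        context.check_hostname = False
        context.verify_mode = ssl.CERT_NONE
    return context


def fetch(url, context):
    """Download url with the given SSL context and return the raw bytes."""
    response = urlopen(Request(url), context=context)
    try:
        return response.read()
    finally:
        response.close()
```

Something like `op = fetch(url, build_context(cafile='path/to/ca-bundle.pem'))` would keep verification on, where the bundle path is a placeholder. `build_context(verify=False)` would probably make the error go away as well, but it removes the protection HTTPS is meant to give, so it is only reasonable for a quick local test.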
Code
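import urllib2
from urllib2 import Request
from StringIO import StringIO
from pdfminer.pdfparser import PDFParser  # assumed: PDFParser comes from pdfminer

url = 'https://nycprop.nyc.gov/nycproperty/StatementSearch?bbl=3068690056&stmtDate=20180824&stmtType=SOA'
op = urllib2.urlopen(Request(url)).read()  # this is the line that raises URLError
memoryFile = StringIO(op)  # wrap the downloaded bytes in a file-like object
parser = PDFParser(memoryFile)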
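On the missing .pdf extension: the extension in a URL does not determine what the server sends back, so one way to check is to look at the downloaded bytes themselves. PDF files begin with the marker `%PDF-`. This is a small sketch with a hypothetical helper name, `looks_like_pdf`, and it only checks for the marker at the very start of the data.

```python
def looks_like_pdf(data):
    """Return True if data starts with the PDF header marker."""
    # A PDF file begins with the bytes "%PDF-" followed by a version number.
    return data[:5] == b"%PDF-"
```

Calling `looks_like_pdf(op)` before building the parser would show whether the response is a PDF or something else, such as an HTML error page.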