
I'm trying to extract data from a PDF file served online. I tried to apply the code below to this URL, but I get a urlopen error. I noticed that the URL does not end in a .pdf extension. Any suggestions?

Error

Traceback (most recent call last):
  File "C:/Users/Danial/Desktop/pdf.py", line 7, in <module>
    op = urllib2.urlopen(Request(url)).read()
  File "C:\Python27\lib\urllib2.py", line 154, in urlopen
    return opener.open(url, data, timeout)
  File "C:\Python27\lib\urllib2.py", line 431, in open
    response = self._open(req, data)
  File "C:\Python27\lib\urllib2.py", line 449, in _open
    '_open', req)
  File "C:\Python27\lib\urllib2.py", line 409, in _call_chain
    result = func(*args)
  File "C:\Python27\lib\urllib2.py", line 1240, in https_open
    context=self._context)
  File "C:\Python27\lib\urllib2.py", line 1197, in do_open
    raise URLError(err)
URLError: <urlopen error [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed (_ssl.c:581)>

Code

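import urllib2
from urllib2 import Request
from StringIO import StringIO

# Assumption: PDFParser is the pdfminer class; this import was missing from the snippet.
from pdfminer.pdfparser import PDFParser

url = 'https://nycprop.nyc.gov/nycproperty/StatementSearch?bbl=3068690056&stmtDate=20180824&stmtType=SOA'

# Download the response body (this is the call that raises URLError).
op = urllib2.urlopen(Request(url)).read()
# Wrap the raw bytes in a file-like object for the parser.
memoryFile = StringIO(op)

parser = PDFParser(memoryFile)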
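
On the missing .pdf extension: the traceback shows the failure happening in https_open during certificate verification, so it is raised before any content is downloaded and is probably unrelated to the extension. Once a response body is available, a rough way to check whether it is a PDF regardless of the URL is to look at its first bytes, since PDF files normally start with "%PDF-". A minimal sketch, assuming that convention holds for this server; looks_like_pdf and the sample payloads are hypothetical names and data used only for illustration:

```python
def looks_like_pdf(data):
    """Return True if the raw bytes start with the PDF header marker."""
    # PDF files normally begin with "%PDF-" followed by a version, e.g. "%PDF-1.4".
    # Assumption: the header sits at offset 0; some files carry leading junk,
    # which this simple check would reject.
    return data[:5] == b"%PDF-"


# Hypothetical payloads standing in for a downloaded response body.
pdf_bytes = b"%PDF-1.4\n1 0 obj\n"
html_bytes = b"<!DOCTYPE html><html><body>Error</body></html>"

print(looks_like_pdf(pdf_bytes))
print(looks_like_pdf(html_bytes))
```

In the code above this would be called as looks_like_pdf(op) before handing memoryFile to the parser, so an HTML error page is not mistaken for a PDF.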
