I'm trying to extract a table from a PDF using Python 3.6. [PyPDF2][1] seems to be failing and [pdfminer][2] is not compatible with 3.x, so I found a Python wrapper for [tabula][3]:

    import tabula

    # get_pdf_list() is my own helper that returns a list of PDF file paths
    file_list = get_pdf_list()
    text = tabula.read_pdf(file_list[0])
    print(text)
    tabula.convert_into(file_list[0], "test.json", output_format="json")

Both `read_pdf` and `convert_into` return empty results, and PyPDF2 had the same issue. No errors are raised when the code runs.
I'm starting to think it has to do with the format of my PDF. Does anyone have more experience with this? All I need is to extract a single value from a table in the PDF.
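
To narrow it down, one thing I could try is running the same file through tabula's different extraction modes and seeing whether any of them finds a table. This is only a rough sketch: `try_extraction_modes` and `is_empty` are hypothetical helper names of my own, and I'm assuming the `pages`, `lattice` and `stream` keyword arguments of `tabula.read_pdf` behave as described in the tabula-py docs. The `reader` parameter is there only so the loop can be exercised without tabula or Java installed.

```python
def is_empty(result):
    """Return True if an extraction result holds no table data."""
    if result is None:
        return True
    if isinstance(result, (list, tuple)):
        # Newer tabula-py versions return a list of DataFrames.
        return all(is_empty(item) for item in result)
    # pandas DataFrames expose an .empty attribute.
    empty = getattr(result, "empty", None)
    if empty is not None:
        return bool(empty)
    return len(result) == 0


def try_extraction_modes(path, reader=None):
    """Try several read_pdf option sets; return (label, result) for the
    first non-empty one, or (None, None) if every attempt is empty."""
    if reader is None:
        import tabula  # assumes tabula-py and a Java runtime are installed
        reader = tabula.read_pdf

    attempts = [
        ("default, all pages", {"pages": "all"}),
        ("lattice (ruled tables)", {"pages": "all", "lattice": True}),
        ("stream (whitespace-separated)", {"pages": "all", "stream": True}),
    ]
    for label, options in attempts:
        result = reader(path, **options)
        if not is_empty(result):
            return label, result
    return None, None
```

Calling `try_extraction_modes(file_list[0])` and getting `(None, None)` back would at least suggest the PDF has no extractable text layer (for example, a scanned image), rather than a problem with my options.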