I am trying to extract text from a PDF using pdfminer.six. I followed the code below, as mentioned here:
import pdfminer
import io

def extract_raw_text(pdf_filename):
    output = io.StringIO()
    laparams = pdfminer.layout.LAParams()
    with open(pdf_filename, "rb") as pdffile:
        pdfminer.high_level.extract_text_to_fp(pdffile, output, laparams=laparams)
    return output.getvalue()

print(extract_raw_text('simple1.pdf'))
But it produces this error:
Traceback (most recent call last):
  File "extract.py", line 13, in <module>
    print(extract_raw_text('simple1.pdf'))
  File "extract.py", line 6, in extract_raw_text
    laparams = pdfminer.layout.LAParams()
AttributeError: module 'pdfminer' has no attribute 'layout'
I simply want to extract the entire text from a PDF. Any help would be appreciated.