I am trying to extract text from a PDF using the pdfminer.six library (like here), which I have already installed in my virtual environment. Here is my code:
import pdfminer as miner
text = miner.high_level.extract_text('file.pdf')
print(text)
But when I run the code with python pdfreader.py, I get the following error:
Traceback (most recent call last):
  File ".\pdfreader.py", line 9, in <module>
    text = miner.high_level.extract_text('pdfBulletins/corona1.pdf')
AttributeError: module 'pdfminer' has no attribute 'high_level'
I suspect it has something to do with the Python path, because I installed pdfminer inside my virtual environment, but I see that this installed pdf2txt.py outside it, in my system Python installation. Is this behavior normal? I would expect that something installed inside my venv should not alter my system Python installation.
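To narrow down the path suspicion, here is a minimal diagnostic sketch that reports which interpreter is running and where a package would be imported from. The helper name describe_environment is hypothetical (it is not part of pdfminer.six), and the venv check assumes an environment created with the standard venv module or a recent virtualenv, where sys.prefix differs from sys.base_prefix.

```python
import importlib.util
import sys


def describe_environment(package_name="pdfminer"):
    """Return where the interpreter and the given package are loaded from.

    Hypothetical helper for debugging; not part of pdfminer.six.
    """
    # find_spec locates a top-level package without importing it;
    # it returns None when the package is not visible on sys.path.
    spec = importlib.util.find_spec(package_name)
    return {
        # Full path of the Python binary actually running this script.
        "executable": sys.executable,
        # Root of the active environment (the venv directory, if any).
        "prefix": sys.prefix,
        # Assumption: inside a venv these two prefixes differ.
        "in_venv": sys.prefix != sys.base_prefix,
        # File the package would be imported from, or None if not found.
        "package_location": spec.origin if spec else None,
    }


if __name__ == "__main__":
    for key, value in describe_environment().items():
        print(f"{key}: {value}")
```

Running this with the same command as the failing script (python pdfreader.py style, from the same shell) should show whether the venv interpreter is the one in use and whether pdfminer resolves to the venv's site-packages or to the system installation.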
I successfully extracted the text using the pdf2txt.py utility that comes with the pdfminer.six library (from the command line, using the system Python installation), but not from the code inside my venv project. My pdfminer.six version is 20201018.
What could be the problem with my code?