I'm working with PDFs and images in a virtualenv, where I installed pdfminer.six, PyPDF2, pdf2image and pytesseract with pip:

$ python3 -m venv testenv
$ source testenv/bin/activate
$ pip3 install pdfminer-six
$ pip3 install pypdf2
$ pip3 install pdf2image
$ pip3 install pytesseract

but then I ran into these import errors:

$ python3

>>> import pytesseract
Traceback (most recent call last):
  File "/home/userx/test_dir/testenv/lib/python3.7/site-packages/pytesseract/pytesseract.py", line 22, in <module>
    from PIL import Image
  File "/home/userx/test_dir/testenv/lib/python3.7/site-packages/PIL/Image.py", line 94, in <module>
    from . import _imaging as core
ImportError: cannot import name '_imaging' from 'PIL' (/home/userx/test_dir/testenv/lib/python3.7/site-packages/PIL/__init__.py)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/userx/test_dir/testenv/lib/python3.7/site-packages/pytesseract/__init__.py", line 1, in <module>
    from .pytesseract import (  # noqa: F401
  File "/home/userx/test_dir/testenv/lib/python3.7/site-packages/pytesseract/pytesseract.py", line 24, in <module>
    import Image
ModuleNotFoundError: No module named 'Image'

and

>>> from pdf2image import convert_from_path
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/userx/test_dir/testenv/lib/python3.7/site-packages/pdf2image/__init__.py", line 5, in <module>
    from .pdf2image import (
  File "/home/userx/test_dir/testenv/lib/python3.7/site-packages/pdf2image/pdf2image.py", line 14, in <module>
    from PIL import Image
  File "/home/chengx/pdf_text/testenv/lib/python3.7/site-packages/PIL/Image.py", line 94, in <module>
    from . import _imaging as core
ImportError: cannot import name '_imaging' from 'PIL' (/home/userx/test_dir/testenv/lib/python3.7/site-packages/PIL/__init__.py)

(from PyPDF2 import PdfFileReader works fine, though.)

I then installed Pillow via pip as well:

$ python3 -m pip install Pillow

in an attempt to fix the issue, but the import errors didn't go away. Why is _imaging still missing from PIL even though Pillow is installed? Thank you!

I'm on CentOS 8, and this is my current setup:

(testenv) [~/test_dir] python --version
Python 3.7.5+
(testenv) [~/test_dir] pip -V
pip 20.2.3 from /home/userx/test_dir/testenv/lib/python3.7/site-packages/pip (python 3.7)
(testenv) [~/test_dir] pip list
Package          Version
---------------- --------
cffi             1.14.3
chardet          3.0.4
cryptography     3.1.1
pdf2image        1.14.0
pdfminer.six     20200726
Pillow           7.2.0
pip              20.2.3
pycparser        2.20
PyPDF2           1.26.0
pytesseract      0.3.6
setuptools       41.2.0
six              1.15.0
sortedcontainers 2.2.2
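
In case it helps narrow this down, here is a minimal sketch of a check that can be run inside the virtualenv. It locates the PIL package without importing it and lists any compiled _imaging extension file sitting next to it. The helper name find_compiled_modules is my own (hypothetical, not part of Pillow), and I'm assuming the extension would show up as an _imaging.*.so file on Linux (or .pyd on Windows).

```python
import importlib.util
import sys
from pathlib import Path


def find_compiled_modules(package="PIL", stem="_imaging"):
    """Return compiled extension files named `stem` found inside `package`.

    Hypothetical helper for diagnosis only: it locates the package on
    sys.path without importing it, so it also works when the import fails.
    """
    spec = importlib.util.find_spec(package)
    if spec is None or not spec.submodule_search_locations:
        return []  # package not found on sys.path
    found = []
    for location in spec.submodule_search_locations:
        # Matches e.g. _imaging.cpython-37m-x86_64-linux-gnu.so
        for path in Path(location).glob(stem + ".*"):
            if path.suffix in {".so", ".pyd"}:
                found.append(path)
    return sorted(found)


if __name__ == "__main__":
    print("interpreter:", sys.executable)
    print("version:    ", sys.version.split()[0])
    matches = find_compiled_modules()
    if matches:
        for path in matches:
            print("found:", path)
    else:
        print("no compiled _imaging extension found next to PIL")
```

If nothing is found, or the file name carries a tag for a different Python version than the interpreter printed above, that would at least point to a mismatch between the installed Pillow build and the Python running in the virtualenv.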
xiaolong
