from langchain.document_loaders import UnstructuredPDFLoader, OnlinePDFLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.document_loaders import DirectoryLoader

import openai
loader = DirectoryLoader('source', glob='*.pdf')

data = loader.load()

With just this much code, I get the following error:

  File "C:\Users\vsvrp\anaconda3\envs\GPTtrail2\lib\site-packages\pptx\parts\image.py", line 13, in <module>
    import Image as PIL_Image
ModuleNotFoundError: No module named 'Image'

Process finished with exit code 1

I do not get this error if I load a single PDF directly instead:

loader = UnstructuredPDFLoader("DOStest.pdf")

I tried running `pip install Image`, but it is still not working. Any help would be greatly appreciated.

I am working with langchain and its document loaders for the first time, and as far as I understand, the DirectoryLoader class is supposed to work in this case.

James Z
Did you try `pip install Pillow`? The pptx package first tries to import from Pillow: https://github.com/scanny/python-pptx/blob/71d1ca0b2b3b9178d64cdab565e8503a25a54e0b/pptx/parts/image.py#L10-L13 – dcferreira Mar 30 '23 at 23:18
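
The import fallback this comment points to can be checked with a short script. This is only a minimal sketch, assuming the linked pptx code tries `from PIL import Image` first and falls back to a bare `import Image` when that fails, which would match the traceback in the question. The helper name `find_image_backend` is hypothetical and is not part of pptx or langchain.

```python
import importlib.util


def find_image_backend():
    """Return the first importable image module name, or None.

    Assumed lookup order (mirroring the fallback described above):
    Pillow's "PIL" package first, then a legacy top-level "Image" module.
    """
    for name in ("PIL", "Image"):
        # find_spec returns None when a top-level module cannot be found
        if importlib.util.find_spec(name) is not None:
            return name
    return None


backend = find_image_backend()
if backend is None:
    print("Neither PIL nor Image is importable in this environment.")
else:
    print(f"Found importable module: {backend}")
```

If this reports that neither module is importable, installing Pillow into the same environment (`pip install Pillow`) is the likely fix, since the failing `import Image` is only reached after the Pillow import has already failed.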

0 Answers