AFAIK, Google Colab runs on Ubuntu. Running uname -a shows the kernel and architecture,
which confirms the host is an x86_64 Linux machine.
If you build poppler with the install prefix /usr, the pdf* binaries are installed in /usr/bin,
which is on the PATH, so pdf2image can resolve them automatically.
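To check whether the binaries are resolvable before calling pdf2image, a lookup on the PATH is enough. This is only a minimal sketch: it assumes the tools pdf2image shells out to are pdftoppm, pdftocairo and pdfinfo, and the helper name find_poppler_tools is made up for this example.

```python
import shutil

# Poppler command-line tools that pdf2image calls (assumed list for this sketch).
POPPLER_TOOLS = ("pdftoppm", "pdftocairo", "pdfinfo")


def find_poppler_tools(tools=POPPLER_TOOLS):
    """Map each tool name to its full path on PATH, or None if it is not found."""
    return {name: shutil.which(name) for name in tools}


for name, path in find_poppler_tools().items():
    print(f"{name}: {path or 'NOT FOUND'}")
```

If any tool shows up as NOT FOUND, either poppler is not installed yet or you need to pass poppler_path explicitly.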
Discover the operating system name.
!uname -a;
Linux d9b9a62155f2 5.10.133+ #1 SMP Fri Aug 26 08:44:51 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux
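Note that uname -a reports the kernel, not the distribution. If you want the distribution name as well, reading /etc/os-release should work on Ubuntu. A small sketch of that, where parse_os_release is a hypothetical helper written for this answer:

```python
import platform
from pathlib import Path


def parse_os_release(text):
    """Parse KEY=value lines (os-release format) into a dict, stripping quotes."""
    info = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        info[key] = value.strip().strip('"').strip("'")
    return info


# Kernel name, kernel release and architecture (roughly what uname shows).
print(platform.system(), platform.release(), platform.machine())

os_release = Path("/etc/os-release")
if os_release.exists():
    info = parse_os_release(os_release.read_text())
    print(info.get("PRETTY_NAME", "unknown distribution"))
```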
!cat requirements.txt
pdf2image
Install the Python dependencies.
!pip install -r requirements.txt
Install the packages needed to build poppler.
!apt-get update
!apt-get install -y libnss3 libnss3-dev
!apt-get install -y libcairo2-dev libjpeg-dev libgif-dev
!apt-get install -y cmake libblkid-dev e2fslibs-dev libboost-all-dev libaudit-dev
Download and extract poppler source code.
!wget https://poppler.freedesktop.org/poppler-21.09.0.tar.xz;
!tar -xvf poppler-21.09.0.tar.xz;
Compile and install poppler.
!mkdir -p poppler-21.09.0/build && \
cd poppler-21.09.0/build && \
cmake -DCMAKE_BUILD_TYPE=Release \
-DCMAKE_INSTALL_PREFIX=/usr \
-DTESTDATADIR=$PWD/testfiles \
-DENABLE_UNSTABLE_API_ABI_HEADERS=ON .. && \
make && \
make install
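After make install finishes, it may be worth confirming that the freshly built binary is the one in /usr/bin. The sketch below runs the tool with -v and prints whatever it reports. The helper name tool_version is hypothetical, and the output streams are merged because the version banner may be written to stderr rather than stdout.

```python
import subprocess


def tool_version(binary, flag="-v"):
    """Run `binary flag` and return its combined output, or None if the binary is missing."""
    try:
        result = subprocess.run(
            [binary, flag],
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,  # merge stderr in case the banner is printed there
            stdin=subprocess.DEVNULL,
            text=True,
            timeout=30,
        )
    except FileNotFoundError:
        return None
    return result.stdout.strip()


print(tool_version("/usr/bin/pdftoppm") or "pdftoppm not found in /usr/bin")
```

With the build above, the reported version should mention 21.09.0.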
Work with the PDF file.
from pdf2image import convert_from_path
images = convert_from_path('sample.pdf', poppler_path='/usr/bin/')
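convert_from_path returns a list of PIL images, one per page, so a typical next step is writing them to disk. A minimal sketch, assuming you want numbered PNG files; save_pages and the "pages" directory are names chosen for this example only:

```python
from pathlib import Path


def save_pages(images, out_dir, stem="page"):
    """Save each image as <stem>-<n>.png in out_dir and return the written paths."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    paths = []
    for number, image in enumerate(images, start=1):
        path = out_dir / f"{stem}-{number:03d}.png"
        image.save(path, "PNG")  # PIL.Image.Image.save(fp, format)
        paths.append(path)
    return paths


# Usage in the notebook, after the conversion above:
# written = save_pages(images, "pages")
```

The zero-padded page number keeps the files in page order when they are listed alphabetically.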