Converting image-based PDF to image file (PNG/JPG) in Python

Question

I want to convert an image-based PDF to an image file (.png/.jpg) in Python, so that I can then extract tabular data from the image. I do not want to run the code from the command line.

I am currently using Python 3.7.1 and the PyCharm IDE.

I have tried the code provided on Stack Overflow, but it does not work: it runs, but it is unable to extract the image from an image-based PDF file. The code came from this question: Extracting images from pdf using Python

I also tried the code from dzone.com, linked below, but that does not work either: https://dzone.com/articles/exporting-data-from-pdfs-with-python

Below are the image-based PDF file links:

link1: https://www.molex.com/pdm_docs/sd/190390001_sd.pdf

link2: https://www.te.com/commerce/DocumentDelivery/DDEController?Action=showdoc&DocId=Customer+Drawing%7FDT04-12PX-C015%7F-%7Fpdf%7FEnglish%7FENG_CD_DT04-12PX-C015_-.pdf%7FDT04-12PA-C015

Please suggest a solution for this.

Does this answer your question? [Convert PDF to Image using Python](https://stackoverflow.com/questions/60701262/convert-pdf-to-image-using-python) — Joe, Apr 24 '20 at 05:29
https://stackoverflow.com/questions/46184239/extract-a-page-from-a-pdf-as-a-jpeg — Joe, Apr 24 '20 at 05:30
thank you joe, this link is very helpful to me, this is what i was searching for long time — Vishal, Apr 24 '20 at 07:25
If it is a solution to your question please close / delete it. — Joe, Apr 24 '20 at 09:03

score 4 · Accepted Answer · answered Apr 24 '20 at 04:12

The pdf2image library converts PDF pages to images. Looking at your PDFs, each page is just an image and nothing else, so you can convert each page to an image file.

Install

pip install pdf2image

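Once installed, you can use the following code to get the images.

from pdf2image import convert_from_path

# Render every page at 500 DPI; the result is a list of PIL images
pages = convert_from_path('pdf_file', 500)

# Save each page under its own name so later pages do not overwrite earlier ones
for i, page in enumerate(pages, start=1):
    page.save(f'out_{i}.jpg', 'JPEG')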
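
For multi-page PDFs, each page needs its own output file, and the question also asks about PNG. A minimal sketch of one way to handle both, assuming pdf2image and poppler are installed; the helper names `page_filename` and `pdf_to_images`, the `pages` output folder, and the naming scheme are illustrative and not part of pdf2image:

```python
from pathlib import Path


def page_filename(pdf_path, page_number, ext="png"):
    """Build a per-page name such as 'drawing_page_001.png' (illustrative scheme)."""
    return f"{Path(pdf_path).stem}_page_{page_number:03d}.{ext}"


def pdf_to_images(pdf_path, out_dir="pages", dpi=300, ext="png"):
    """Render every page of pdf_path and save one image file per page."""
    # Imported here so page_filename stays usable without pdf2image installed.
    from pdf2image import convert_from_path

    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)

    saved = []
    for number, page in enumerate(convert_from_path(pdf_path, dpi=dpi), start=1):
        target = out_dir / page_filename(pdf_path, number, ext)
        page.save(target)  # Pillow infers PNG or JPEG from the file extension
        saved.append(target)
    return saved
```

Calling `pdf_to_images("190390001_sd.pdf")` from a script or from PyCharm would write the page images into the `pages` folder; pass `ext="jpg"` for JPEG output.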

answered Apr 24 '20 at 04:12

Kuldeep Singh Sidhu

Hello Kuldeep, I am getting the above error while running the code. I have installed the pdf2image Python module, but I still get this error – Vishal Apr 24 '20 at 04:59
You will need `poppler`, check here: https://github.com/Belval/pdf2image – Kuldeep Singh Sidhu Apr 24 '20 at 05:17
thank you kuldeep, your code is now working fine after installing poppler – Vishal Apr 24 '20 at 07:22
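
The error itself is not shown in this thread, but pdf2image relies on poppler's command-line tools (such as `pdftoppm`), so a missing poppler install is a common cause of failures. A rough sketch of checking for it up front, assuming poppler is either on `PATH` or unpacked in a folder you pass in; the helper names `find_poppler` and `convert_with_poppler` are hypothetical, while `poppler_path` is an existing keyword argument of `convert_from_path`:

```python
import shutil


def find_poppler(poppler_path=None):
    """Return the full path to pdftoppm, or None if it cannot be found.

    With poppler_path=None the system PATH is searched; otherwise only
    the given folder is searched.
    """
    return shutil.which("pdftoppm", path=poppler_path)


def convert_with_poppler(pdf_path, poppler_path=None, dpi=300):
    """Convert a PDF to a list of PIL images, failing early if poppler is missing."""
    if find_poppler(poppler_path) is None:
        raise RuntimeError(
            "pdftoppm not found: install poppler or pass the folder "
            "containing its binaries as poppler_path"
        )
    # Imported here so the check above works even before pdf2image is installed.
    from pdf2image import convert_from_path

    return convert_from_path(pdf_path, dpi=dpi, poppler_path=poppler_path)
```

On Windows, where poppler is usually not on `PATH`, `poppler_path` would be the `bin` folder of the unpacked poppler download.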
