Here is what is happening...
With this line:
pages = convert_from_path(path_to_pdf, output_folder=path_to_output, poppler_path=poppler_path)
You are actually doing two things:
- writing .ppm files to the output folder, and
- loading the pages, which are PIL.PpmImagePlugin.PpmImageFile objects.
The actual conversion to JPEG only happens afterwards, with:
pages[i].save('page' + str(i) + '.jpg', 'JPEG')
This means that, to get the result you want, you just have to leave output_folder out of the convert_from_path call and supply the folder when saving instead, like so:
import os
from pdf2image import convert_from_path
pages = convert_from_path(path_to_pdf, poppler_path=poppler_path)
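for i in range(len(pages)):
    # Each page is still an in-memory PIL image at this point
    print(type(pages[i]))
    # Write the JPEG straight into the output folder
    pages[i].save(os.path.join(path_to_output, 'page' + str(i) + '.jpg'), format='JPEG')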
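If you do this in more than one place, the saving loop can be wrapped in a small helper. This is only a sketch: save_pages and its prefix parameter are names made up for illustration, not part of pdf2image. It assumes each page has a PIL-style save(path, format=...) method, and it creates the output folder if it does not exist yet.

```python
import os


def save_pages(pages, output_dir, prefix="page"):
    """Save each page as <prefix><index>.jpg in output_dir and return the paths.

    Hypothetical helper: `pages` is assumed to be a list of PIL images,
    such as the list returned by pdf2image.convert_from_path.
    """
    # Create the output folder if it is missing, so save() does not fail
    os.makedirs(output_dir, exist_ok=True)
    saved = []
    for i, page in enumerate(pages):
        path = os.path.join(output_dir, f"{prefix}{i}.jpg")
        # Explicit format, so the result is a JPEG regardless of the extension
        page.save(path, format="JPEG")
        saved.append(path)
    return saved
```

With the variables from the question, you would call it as save_pages(convert_from_path(path_to_pdf, poppler_path=poppler_path), path_to_output). Since output_folder is not passed to convert_from_path, no .ppm files are written to your output folder.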