We get PDF files delivered to us daily and we need to get the images out. For example, what I want to do is to get the image back out of this PDF file I have, with python. Most pdf files we get are multipage and we want to export each embedded image to separate files. Most have jpeg files in them, but his one does not.
Object 5 is embedded as a zlib compressed stream. I am pretty sure it is zlib compressed because it is marked as FlateDecode and the start of the stream is \x78\x9c which is typical for zlib. You can see (part of) the hex dump here
The question is, how do I 'deflate' it and save the resulting file.
Thank you for sharing your wisdom.