I found a way to convert a PDF file to JPG, or more precisely, to extract the image files embedded in a PDF. I managed to do that with the PyMuPDF library. This is its documentation:

https://pymupdf.readthedocs.io/en/latest/

I have seen this code:

Extract images from PDF without resampling, in python?

and this code:

https://www.thepythoncode.com/article/extract-pdf-images-in-python

I wrote the following code, which runs without any errors:

import fitz
import cv2
import numpy as np


doc = fitz.open("sample15.pdf")
#print(doc)

my_images = []
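for i in range(len(doc)):
    # each entry describes one image on page i; its first item is the xref
    for img in doc.getPageImageList(i):
        xref = img[0]

        # raw image bytes in their original format (e.g. JPEG)
        image_bytes = doc.extractImage(xref)["image"]

        # decode the bytes into an OpenCV (BGR) array
        nparr = np.frombuffer(image_bytes, np.uint8)
        img_np = cv2.imdecode(nparr, cv2.IMREAD_COLOR)
        my_images.append(img_np)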

As you can see, I do not call print anywhere, but my program prints this:

mupdf: expected object number #this is printed red
xref 9 image type jpeg
xref 12 image type jpeg
xref 15 image type jpeg
xref 18 image type jpeg
xref 21 image type jpeg
xref 24 image type jpeg

Why do I get this output, and how can I remove it? I guess it is coming from the library, but I do not know how to stop it.

taga

1 Answer

That output probably comes from one of the libraries you're using. You could look in their docs to figure out if there's a logging level option, or as a last-ditch "fix", use the contextlib.redirect_stdout (and .redirect_stderr) context managers to hide the output.
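
For illustration, here is a minimal sketch of the context-manager approach. The function noisy_extract is a hypothetical stand-in for the library call that prints on its own (it is not part of PyMuPDF), and the messages are simply copied from the question. The redirected text is collected in io.StringIO buffers so it can be inspected or thrown away.

```python
import contextlib
import io
import sys


def noisy_extract():
    # Hypothetical stand-in for a library call that prints on its own.
    print("xref 9 image type jpeg")
    print("mupdf: expected object number", file=sys.stderr)
    return "image-bytes"


captured_out = io.StringIO()
captured_err = io.StringIO()

# Everything written to sys.stdout / sys.stderr inside the block
# goes into the buffers instead of the console.
with contextlib.redirect_stdout(captured_out), contextlib.redirect_stderr(captured_err):
    result = noisy_extract()

# The return value is unaffected; the captured text can be ignored.
print(result)
```

Note that these context managers only swap Python's sys.stdout and sys.stderr objects. If a message is written directly by C code inside the library, it may bypass them and still appear, in which case a library-level switch or a file-descriptor-level redirect would be needed.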

AKX
  • I have updated the question a little bit. Can you show me how I can remove these automatic prints? I have looked at the documentation but I didn't find any solutions. – taga Oct 27 '20 at 12:59
  • Did you read the second half of my answer? Did you click through to the documentation? It has a direct example. – AKX Oct 27 '20 at 13:05