1

I have a Jupyter notebook running Python, and I want all the graphs that I create to be written to a single PDF file. Does anybody know how to do this?

Kind regards,

wokter
  • If you want the entire notebook (code, text and results) to be included, you can use `nbconvert`. If you really want only the images, you can use a document editor such as Microsoft Word: right-click the image, copy it, and paste it into Word (at least this works in Google Colab). – Dhana D. Aug 06 '21 at 08:42
  • I want only the graphs, but without copy-pasting them all, because it's a monthly process. – wokter Aug 06 '21 at 08:55
  • why not simply call `fig.savefig("filepath/filename.jpg")` after generating the figure? – raphael Aug 06 '21 at 09:08

2 Answers

1

If you are on Windows, you could use Excel with openpyxl to do this and then print the Excel workbook to PDF (manually, or via Python as a background process).

Save your images when you create them using plt.savefig() if you are using matplotlib.

To insert an image into an Excel workbook:

from openpyxl import Workbook
from openpyxl.drawing.image import Image as XLIMG  # needs Pillow installed

wb = Workbook()
ws = wb.active
img = XLIMG('example_image.jpg')
# 'A1' is the anchor cell; change it to place the image exactly where you want.
ws.add_image(img, 'A1')
# Include the .xlsx extension so Excel can open the file.
wb.save('workbook_test.xlsx')

What I often do is make an Excel template with titles and information, load my data or images into this template, print it to PDF using Python, and upload it to a server. The nice thing is that if the report needs changes, you often do not need to code anything; you just edit the Excel template.

To print an Excel workbook to PDF in the background, see: Print chosen worksheets in excel files to pdf in python

Ruben
  • yes, the graphs are made with matplotlib. I want just the graphs. – wokter Aug 06 '21 at 08:54
  • 1
    Updated my answer :) There are obviously other ways without using Excel, but I find that Excel actually gives me a lot of control by letting me choose the cell. What I often do is make an Excel template with titles and information, load my data or images into this template, print it to PDF using Python, and upload it to a server. The nice thing is that if the report needs changes, you often do not need to code anything; you just edit the Excel template. – Ruben Aug 06 '21 at 09:11
  • 1
    very cool way to do this via excel. Thanks!! – wokter Aug 06 '21 at 11:07
1

I could not find a way to convert all the plots to PDF directly, but the approach below works and is tested. First, you need to import matplotlib.

import matplotlib.pyplot as plt

Now add this line to every cell of the Jupyter notebook that plots a graph. The image name must be different in each cell; for example, you can save as plt1.png, plt2.png, etc.

plt.savefig('imagename.png')

Please note that imagename must be different for every cell; otherwise a later plot overwrites an earlier one.
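One way to keep the file names distinct without renaming them by hand is a small helper with a counter. This is only a minimal sketch: the helper name `save_numbered` and the `plots` output folder are assumptions made for illustration and are not part of the original answer. Zero-padded numbers keep the files in creation order when they are sorted by name.

```python
import itertools
from pathlib import Path

import matplotlib.pyplot as plt

OUT_DIR = Path("plots")        # hypothetical output folder
OUT_DIR.mkdir(exist_ok=True)
_counter = itertools.count(1)  # yields 1, 2, 3, ...


def save_numbered(fig, prefix="plt"):
    """Save fig as plots/plt001.png, plots/plt002.png, ... and return the path."""
    path = OUT_DIR / f"{prefix}{next(_counter):03d}.png"
    fig.savefig(path)
    return path


# Example: call the helper at the end of each plotting cell.
fig, ax = plt.subplots()
ax.plot([1, 2, 3], [1, 4, 9])
first_path = save_numbered(fig)
plt.close(fig)
```

If you use a separate folder like this, the glob pattern in the final cell would need to point at it (for example `plots/*.png`).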

Don't worry, you will not be left with the plot images afterwards. Now run this code in the last cell of the notebook.

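import glob
import os
from PIL import Image

# Collect the saved plots in a stable (alphabetical) order.
images = sorted(glob.glob("*.png"))
print(images)

imlist = []
for img in images:
    # Pillow cannot write RGBA images to PDF, so convert each one to RGB.
    with Image.open(img) as im:
        imlist.append(im.convert('RGB'))

# Write the first image and append the rest as additional pages.
imlist[0].save('plots.pdf', save_all=True, append_images=imlist[1:])

# Delete the intermediate PNG files one by one.
for img in images:
    os.remove(img)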

What is happening here: you save an image of each plot one by one, then this code combines all those images into a single PDF, and os.remove deletes the intermediate images. Note: you should not have other PNG files in the same folder, otherwise they will also be included in the PDF and deleted as well.
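If the intermediate PNG files are not needed at all, matplotlib also ships a multi-page PDF writer, `matplotlib.backends.backend_pdf.PdfPages`, which stores each figure as one page. The sketch below is an alternative to the approach above, not the method described in this answer; the example data and the file name `all_plots.pdf` are made up for illustration.

```python
import matplotlib.pyplot as plt
from matplotlib.backends.backend_pdf import PdfPages

# Made-up example data: one figure per series.
series = {
    "linear": [1, 2, 3, 4],
    "squares": [1, 4, 9, 16],
    "cubes": [1, 8, 27, 64],
}

with PdfPages("all_plots.pdf") as pdf:
    for title, values in series.items():
        fig, ax = plt.subplots()
        ax.plot(values)
        ax.set_title(title)
        pdf.savefig(fig)  # append this figure as a new page
        plt.close(fig)    # release the figure once it is written
    page_count = pdf.get_pagecount()
```

In a notebook where the figures are created across many cells, you could instead create the `PdfPages` object in an early cell without `with`, call `pdf.savefig(fig)` in each plotting cell, and call `pdf.close()` in the last cell.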

  • If `plt.save('imagename.png')` gives you an error, try using `plt.savefig('imagename.png')` instead – Hadi Mir Jun 01 '22 at 10:49