I have created a pdf that saves several plots created using Matplotlib.

I did the following to create the PDF:

from matplotlib.backends.backend_pdf import PdfPages
report = PdfPages('report.pdf')

After creating each plot, I call `report.savefig()`. However, I also want to output dataframes I generated into the PDF. Essentially, I want a report containing plots and queried dataframes all in one place. Is it possible to add a dataframe to the PDF created with PdfPages, and if so, how would I do so? If not, is there another approach that would allow the plots and dataframes to be in one place (without having to save individual components and piece them together)? Would love any suggestions and examples. Thanks!

Jane Sully

1 Answer

Just create a plot of the table, then save that. Given a dataframe such as:

import pandas as pd

df = pd.DataFrame()
df['Animal'] = ['Cow', 'Bear']
df['Weight'] = [250, 450]
df['Favorite'] = ['Grass', 'Honey']
df['Least Favorite'] = ['Meat', 'Leaves']

which looks like:

  Animal  Weight Favorite Least Favorite
0    Cow     250    Grass           Meat
1   Bear     450    Honey         Leaves

you can plot a table version of it like so:

import matplotlib.pyplot as plt

fig = plt.figure(figsize=(9,2))
ax = plt.subplot(111)
ax.axis('off')
ax.table(cellText=df.values, colLabels=df.columns, bbox=[0,0,1,1])

Output:

[Image: the dataframe rendered as a table]

You can style the table plot a little nicer by adding some background color to the cells:

import matplotlib.pyplot as plt
import numpy as np

fig = plt.figure(figsize=(9,2))
ax = plt.subplot(111)
ax.axis('off')
r, c = df.shape
# Header row in light gray, then one uncolored row per dataframe row
ax.table(cellText=np.vstack([df.columns, df.values]), cellColours=[['lightgray']*c] + [['none']*c]*r, bbox=[0,0,1,1])

Output:

[Image: the same table with a light gray header row]

See this ongoing thread (from which all these examples were taken) for more ideas/variants.
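To tie this back to the `PdfPages` object from the question, here is a minimal sketch of writing one ordinary plot and one table page into the same PDF. The helper names `table_figure` and `write_report` are hypothetical (they are not part of Matplotlib), the bar chart is only a stand-in for whatever plots you already have, and the `Agg` backend is selected on the assumption that no display is needed. The key point is that `savefig` takes the figure that holds the table.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend, no window needed
import matplotlib.pyplot as plt
import pandas as pd
from matplotlib.backends.backend_pdf import PdfPages


def table_figure(df, figsize=(9, 2)):
    """Hypothetical helper: draw a DataFrame as a table-only figure."""
    fig, ax = plt.subplots(figsize=figsize)
    ax.axis("off")  # hide the axes so only the table is visible
    ax.table(cellText=df.values, colLabels=df.columns, bbox=[0, 0, 1, 1])
    return fig


def write_report(path, df):
    """Hypothetical helper: write a plot page and a table page, return page count."""
    with PdfPages(path) as report:
        # Page 1: a regular plot (stand-in for your own plots)
        fig, ax = plt.subplots()
        ax.bar(df["Animal"], df["Weight"])
        ax.set_ylabel("Weight")
        report.savefig(fig)
        plt.close(fig)

        # Page 2: the DataFrame drawn as a table
        fig = table_figure(df)
        report.savefig(fig)  # pass the figure, not the table object
        plt.close(fig)

        return report.get_pagecount()
```

With the `df` defined above, `write_report('report.pdf', df)` should produce a two-page PDF.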

Edit

It occurred to me that you might want to plot images and tables on the same figure. You can do so to get results like this:

[Image: a figure combining a plot with a table]

Here's a link to the tutorial that image came from, which has some example code to help get you started.
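In case the tutorial link is unavailable, here is a rough sketch of one way to put a plot and a table on the same figure: two stacked axes, with the lower one switched off and used only to host the table. This is not the tutorial's code. The function name `plot_with_table`, the bar chart, and the 3:1 height ratio are all assumptions chosen for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend, no window needed
import matplotlib.pyplot as plt
import pandas as pd


def plot_with_table(df):
    """Hypothetical helper: bar chart on top, DataFrame table underneath."""
    fig, (ax_plot, ax_table) = plt.subplots(
        2, 1, figsize=(7.5, 6),
        gridspec_kw={"height_ratios": [3, 1]},  # give the plot most of the space
    )

    # Upper axes: an ordinary plot
    ax_plot.bar(df["Animal"], df["Weight"])
    ax_plot.set_ylabel("Weight")

    # Lower axes: hidden frame, used only as a container for the table
    ax_table.axis("off")
    table = ax_table.table(
        cellText=df.values, colLabels=df.columns, bbox=[0, 0, 1, 1]
    )
    return fig, table
```

The returned figure can then be saved like any other, for example with `report.savefig(fig)` on an open `PdfPages` object.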

tel
  • What happens when df size is large say 2000 – sahasrara62 Dec 18 '18 at 07:50
  • Then you'll have to be careful with your sizes, and possibly break the table out over several pages. You can size a figure to fill a sheet of regular 8.5 x 11 paper (or presumably one standard pdf page) by creating it like so: `plt.figure(figsize=(7.5, 10))`. That gives you everything but a 1 inch margin (which should be reasonable) to fill up as much as you can. – tel Dec 18 '18 at 07:54
  • I guess saving that large data in PDF handle to itself maybe. – sahasrara62 Dec 18 '18 at 08:00
  • Thanks for your help! What does the `ax.subplot(111)` do? Also, is it possible to make the resolution better? Mine is very blurry. – Jane Sully Dec 18 '18 at 08:01
  • `ax = plt.subplot(111)` is just one of the many ways to create a new `Axes` object for plotting. You could replace it with `ax = fig.gca()` and there'd be no difference in this case. – tel Dec 18 '18 at 08:13
  • Not sure why you're getting blurry output. My recommendation would be to try adjusting the `figsize` argument. You could also try directly setting the global image resolution option like so: `plt.rcParams['figure.dpi'] = 200`. But that shouldn't have much effect on PDF output in any case. – tel Dec 18 '18 at 08:13
  • So how do you include these plots to a pdf file? What is the command? Is it report.savefig(ax.table()) – Nguai al Jan 24 '19 at 19:42
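On the two open questions in the comments (large dataframes, and which command writes a table to the PDF), here is a hedged sketch that splits a dataframe into fixed-size chunks and saves each chunk as its own page. The function name `write_table_pages` and the default of 40 rows per page are assumptions for illustration, and the `figsize=(7.5, 10)` comes from the comment above. Note that `savefig` receives the figure, so `report.savefig(fig)` rather than `report.savefig(ax.table())`.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend, no window needed
import matplotlib.pyplot as plt
import pandas as pd
from matplotlib.backends.backend_pdf import PdfPages


def write_table_pages(path, df, rows_per_page=40):
    """Hypothetical helper: one table page per chunk of rows, returns page count."""
    with PdfPages(path) as report:
        for start in range(0, len(df), rows_per_page):
            chunk = df.iloc[start:start + rows_per_page]

            fig, ax = plt.subplots(figsize=(7.5, 10))
            ax.axis("off")

            # Scale the table height to the number of rows (plus the header),
            # so a short final page keeps the same row height as full pages.
            height = (len(chunk) + 1) / (rows_per_page + 1)
            ax.table(
                cellText=chunk.values,
                colLabels=chunk.columns,
                bbox=[0, 1 - height, 1, height],
            )

            report.savefig(fig)  # save the figure that contains the table
            plt.close(fig)       # free memory before building the next page

        return report.get_pagecount()
```

For a 2000-row dataframe this would give 50 pages at the default setting; `rows_per_page` is the knob to adjust if the text comes out too small or too large.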