113

I constructed a pandas DataFrame of results, which acts as a table. It has MultiIndexed columns, and each row represents a name, i.e. index=['name1','name2',...] when creating the DataFrame. I would like to display this table and save it as a PNG (or any graphic format, really). At the moment, the closest I can get is converting it to HTML, but I would like a PNG. Similar questions have been asked, such as How to save the Pandas dataframe/series data as a figure?

However, the marked solution converts the dataframe into a line plot (not a table), and the other solution relies on PySide, which I would like to stay away from simply because I cannot pip install it on Linux. I would like this code to be easily portable. I really was expecting table-to-PNG creation to be easy with Python. All help is appreciated.
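
For illustration, a frame of the kind described could be built as follows. This is only a minimal sketch: the column labels and the values are invented for the example, and the last line shows the HTML conversion mentioned above.

```python
import pandas as pd

# Two-level column header: 'A' and 'B' each span the sub-columns '1' and '2'
columns = pd.MultiIndex.from_product([["A", "B"], ["1", "2"]])

df = pd.DataFrame(
    [[0.1, 0.2, 0.3, 0.4], [0.5, 0.6, 0.7, 0.8]],  # made-up values
    index=["name1", "name2"],
    columns=columns,
)

# The closest result so far: an HTML table rather than an image
html = df.to_html()
```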

Shatnerz
  • 2,353
  • 3
  • 27
  • 43
  • 2
    One thing you could do is export it to text and save it as an image: http://stackoverflow.com/questions/17856242/convert-string-to-image-in-python You could also use webkit2png to convert html to a png: http://stackoverflow.com/questions/5633828/html-to-image-in-python Also this: http://stackoverflow.com/questions/26678467/export-a-pandas-dataframe-as-a-table-image and http://stackoverflow.com/questions/24574976/save-the-out-table-of-a-pandas-dataframe-as-a-figure – Charlie Haley Feb 25 '16 at 17:36
  • a duplicate for http://stackoverflow.com/questions/19726663/how-to-save-the-pandas-dataframe-series-data-as-a-figure/39358752#39358752 – volodymyr May 10 '17 at 16:23
  • 2
    Because no easy solution seems to exist for this problem, a fast way is to simply take a screenshot from the browser, e.g. [like this in Firefox](https://support.mozilla.org/en-US/kb/firefox-screenshots). – ImportanceOfBeingErnest Jul 15 '18 at 23:19
  • what about the latex table as a png (not latex string)? – Charlie Parker Nov 17 '21 at 17:37

13 Answers

104

Pandas allows you to plot tables using matplotlib (details here). Usually this plots the table directly onto a plot (with axes and everything) which is not what you want. However, these can be removed first:

import matplotlib.pyplot as plt
import pandas as pd
from pandas.plotting import table # EDIT: formerly pandas.tools.plotting, see deprecation warnings below

ax = plt.subplot(111, frame_on=False) # no visible frame
ax.xaxis.set_visible(False)  # hide the x axis
ax.yaxis.set_visible(False)  # hide the y axis

table(ax, df)  # where df is your data frame

plt.savefig('mytable.png')

The output might not be the prettiest but you can find additional arguments for the table() function here. Also thanks to this post for info on how to remove axes in matplotlib.
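
A self-contained variant of the snippet above might look like the sketch below. The function name save_table_png, the sample frame, and the output file name are assumptions made for illustration; the bbox_inches='tight' argument is the fix reported in the comments for tables that end up partly out of frame.

```python
from pathlib import Path

import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no display is needed
import matplotlib.pyplot as plt
import pandas as pd
from pandas.plotting import table


def save_table_png(df, path, dpi=150):
    """Draw df as a matplotlib table on a hidden axes and save it to path."""
    fig, ax = plt.subplots()
    ax.axis("off")  # hide the frame and both axes in one call
    table(ax, df, loc="center")
    # bbox_inches="tight" crops the saved canvas to what was actually drawn
    fig.savefig(path, dpi=dpi, bbox_inches="tight")
    plt.close(fig)
    return Path(path)


# Hypothetical sample data
df = pd.DataFrame({"A": [1.5, 2.5], "B": ["x", "y"]}, index=["name1", "name2"])
out = save_table_png(df, "mytable_tight.png")
```

The dpi value is arbitrary; raising it gives a sharper image at the cost of file size.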


EDIT:

Here is an (admittedly quite hacky) way of simulating multi-indexes when plotting using the method above. If you have a multi-index data frame called df that looks like:

first  second
bar    one       1.991802
       two       0.403415
baz    one      -1.024986
       two      -0.522366
foo    one       0.350297
       two      -0.444106
qux    one      -0.472536
       two       0.999393
dtype: float64

First reset the indexes so they become normal columns

df = df.reset_index() 
df
    first second       0
0   bar    one  1.991802
1   bar    two  0.403415
2   baz    one -1.024986
3   baz    two -0.522366
4   foo    one  0.350297
5   foo    two -0.444106
6   qux    one -0.472536
7   qux    two  0.999393

Remove all duplicates from the higher order multi-index columns by setting them to an empty string (in my example I only have duplicate indexes in "first"):

df.loc[df.duplicated('first') , 'first'] = '' # originally df.ix, see deprecation warnings below
df
  first second         0
0   bar    one  1.991802
1          two  0.403415
2   baz    one -1.024986
3          two -0.522366
4   foo    one  0.350297
5          two -0.444106
6   qux    one -0.472536
7          two  0.999393

Change the column names over your "indexes" to the empty string

new_cols = df.columns.values
new_cols[:2] = '',''  # since my index columns are the two left-most on the table
df.columns = new_cols 

Now call the table function but set all the row labels in the table to the empty string (this makes sure the actual indexes of your plot are not displayed):

table(ax, df, rowLabels=['']*df.shape[0], loc='center')

et voila:

enter image description here

Your not-so-pretty but totally functional multi-indexed table.
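
The reset-and-blank steps above can be wrapped in a small helper. This is a sketch rather than part of the original recipe: the name blank_repeated_index_labels is made up, and it assumes the index is sorted so that repeated outer labels sit next to each other.

```python
import pandas as pd


def blank_repeated_index_labels(obj):
    """Turn the index levels into columns and blank repeated outer labels."""
    n_levels = obj.index.nlevels
    flat = obj.reset_index()
    index_cols = list(flat.columns[:n_levels])

    # Work out every mask first, because blanking changes the values compared.
    # The innermost level is left untouched.
    masks = {
        col: flat.duplicated(subset=index_cols[: i + 1])
        for i, col in enumerate(index_cols[:-1])
    }
    for col, mask in masks.items():
        flat[col] = flat[col].astype(object)  # allow '' in non-string levels
        flat.loc[mask, col] = ""
    return flat


# Hypothetical data shaped like the example above
idx = pd.MultiIndex.from_product(
    [["bar", "baz"], ["one", "two"]], names=["first", "second"]
)
s = pd.Series([1.99, 0.40, -1.02, -0.52], index=idx)
flat = blank_repeated_index_labels(s)
```

The result can then be passed to table() with empty rowLabels, as shown above.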

EDIT: DEPRECATION WARNINGS

As pointed out in the comments, the import statement for table:

from pandas.tools.plotting import table

is now deprecated in newer versions of pandas in favour of:

from pandas.plotting import table 

EDIT: DEPRECATION WARNINGS 2

The ix indexer was deprecated and then removed entirely in pandas 1.0, so we should use the loc indexer instead. Replace:

df.ix[df.duplicated('first') , 'first'] = ''

with

df.loc[df.duplicated('first') , 'first'] = ''
bunji
  • 5,063
  • 1
  • 17
  • 36
  • 2
    This gets me much closer. Sadly I saw something like this before, but I just realized it wasn't working because I was using an outdated version of pandas. This seems to work well except for 2 things. 1) Part of the table appears to be always out of frame. I tried `table(ax, df, loc='center')` which helped but the indexes on the left were cut in half. If I use `plt.show()` this is fixed as soon as there is a resize of the window. 2) `table` does not seem to handle multiindex columns. My columns appear as ('A', '1'), ('A', '2') instead of 2 rows where 'A' is on top and spans over '1' and '2'. – Shatnerz Mar 01 '16 at 14:16
  • For help with out of frame data, check out the suggestions [here](http://stackoverflow.com/questions/17232683/creating-tables-in-matplotlib), especially the "Simple Way" in the answer by @FrancescoMontesano . For the multi-index problem, you might be out of luck. If I come across something I'll let you know. – bunji Mar 01 '16 at 14:36
  • 1
    To solve my first issue, `plt.savefig('test.png', bbox_inches='tight')` works. The multiindex thing isn't a huge issue. I'm just surprised no one has created an easy way to save tables as images. It makes me feel like there is some better way that I'm just missing entirely. I suppose I could try writing something for pandas when I have the time. – Shatnerz Mar 01 '16 at 15:06
  • @Shatners . I've added a possible fix for your multi-index issues. It's not the prettiest but it gets the job done. – bunji Mar 02 '16 at 01:04
  • 4
    We should notice **a FutureWarning** that `pandas.tools.plotting.table` is deprecated, import `pandas.plotting.table` instead. – Bowen Peng Apr 23 '19 at 13:04
  • @Mitchell. Please see if the additional information about the deprecation of the `ix` indexer solves your problem. If it doesn't, please provide some details regarding what specifically isn't working. This will help people who may be having the same issue. – bunji Feb 13 '20 at 18:01
  • Hi, could You please clean this answer from dependencies that are deprecated (in the first block of code). – B.Kocis Mar 30 '21 at 07:56
  • @B.Kocis Thanks for the suggestion. I would prefer to leave the history of the answer intact. Please see the EDIT sections to see details about deprecations. – bunji Apr 05 '21 at 20:11
  • I wish there was a way to export the jupyter output as png, because that looks much prettier. –  Oct 15 '21 at 14:56
  • what about the latex table as a png (not latex string)? – Charlie Parker Nov 17 '21 at 17:37
81

There is actually a Python library called dataframe_image. Just do a

pip install dataframe_image

Do the imports

import pandas as pd
import numpy as np
import dataframe_image as dfi
df = pd.DataFrame(np.random.randn(6, 6), columns=list('ABCDEF'))

and style your table if you want by:

df_styled = df.style.background_gradient() #adding a gradient based on values in cell

and finally:

dfi.export(df_styled,"mytable.png")
  • 8
    A simple, elegant answer! – Woden Oct 05 '20 at 17:26
  • 1
    I tried this and working fine if I add table_conversion = 'matplotlib' option to export, as I have problem with Chrome in WSL. If I remove matplotlib option, I got below error, I need to add no-sandbox option when calling Chrome but I can't find any documents how to add this. Failed to move to new namespace: PID namespaces supported, Network namespace supported, but failed: errno = Permission denied – GurhanCagin Nov 01 '20 at 10:09
  • This worked well for me running JupyterLab in Edge, though had to add `max_rows=-1, max_cols=-1` due to the large size of my styled dataframe – dreme May 20 '21 at 01:20
  • 2
    An image to show how it looks like would have been helpful. –  Oct 15 '21 at 14:54
  • what about the latex table as a png (not latex string)? – Charlie Parker Nov 17 '21 at 17:37
  • 15
    `OSError: Chrome executable not able to be found on your machine` Hmm... not sure I want to install Chrome on my Jupyter compute instance. – Att Righ Feb 09 '22 at 16:20
  • Hi guys, I'm not able to create the image with all the texts aligned centrally, could you help me? https://stackoverflow.com/questions/71199088/how-to-center-text-of-image-created-from-a-dataframe – Digital Farmer Feb 21 '22 at 01:58
  • 1
    Perfect, thx, but how to change the quality of the export... ? – Xomuama May 06 '22 at 11:04
  • @Xomuama have you tried changing `font_size` parameter? – Lê Quang Duy May 24 '22 at 07:41
  • Not working on my end. I am getting the following error: FileNotFoundError: [Errno 2] No such file or directory: '/tmp/tmprvk_t667/temp.png – Amir saleem Feb 20 '23 at 05:09
38

The best solution to your problem is probably to first export your dataframe to HTML and then convert it using an HTML-to-image tool. The final appearance could be tweaked via CSS.

Popular options for HTML-to-image rendering include:

  • WeasyPrint
  • wkhtmltopdf/wkhtmltoimage

Let us assume we have a dataframe named df. We can generate one with the following code:

import string
import numpy as np
import pandas as pd


np.random.seed(0)  # just to get reproducible results from `np.random`
rows, cols = 5, 10
labels = list(string.ascii_uppercase[:cols])
df = pd.DataFrame(np.random.randint(0, 100, size=(rows, cols)), columns=labels)
print(df)
#     A   B   C   D   E   F   G   H   I   J
# 0  44  47  64  67  67   9  83  21  36  87
# 1  70  88  88  12  58  65  39  87  46  88
# 2  81  37  25  77  72   9  20  80  69  79
# 3  47  64  82  99  88  49  29  19  19  14
# 4  39  32  65   9  57  32  31  74  23  35

Using WeasyPrint

This approach uses a pip-installable package, which will allow you to do everything using the Python ecosystem. One shortcoming of weasyprint is that it does not seem to provide a way of adapting the image size to its content. Anyway, removing some background from an image is relatively easy in Python / PIL, and it is implemented in the trim() function below (adapted from here). One also would need to make sure that the image will be large enough, and this can be done with CSS's @page size property.

The code follows:

import weasyprint as wsp
from PIL import Image, ImageChops  # `import PIL` alone does not load these submodules


def trim(source_filepath, target_filepath=None, background=None):
    # Crop away the uniform background surrounding the rendered table
    if not target_filepath:
        target_filepath = source_filepath
    img = Image.open(source_filepath)
    if background is None:
        background = img.getpixel((0, 0))  # assume the top-left pixel is background
    border = Image.new(img.mode, img.size, background)
    diff = ImageChops.difference(img, border)
    bbox = diff.getbbox()
    img = img.crop(bbox) if bbox else img
    img.save(target_filepath)


img_filepath = 'table1.png'
css = wsp.CSS(string='''
@page { size: 2048px 2048px; padding: 0px; margin: 0px; }
table, td, tr, th { border: 1px solid black; }
td, th { padding: 4px 8px; }
''')
html = wsp.HTML(string=df.to_html())
html.write_png(img_filepath, stylesheets=[css])  # write_png() was removed in WeasyPrint 53, so this needs an older release
trim(img_filepath)

table_weasyprint


Using wkhtmltopdf/wkhtmltoimage

This approach uses an external open-source tool, which must be installed before the image can be generated. There is also a Python package, pdfkit, that serves as a front-end to it (it does not exempt you from installing the core software yourself), but I will not use it.

wkhtmltoimage can simply be called using subprocess (or any similar means of running an external program from Python). The HTML file also needs to be written to disk first.

The code follows:

import subprocess


df.to_html('table2.html')
subprocess.call(  # argument list instead of shell=True, so no shell quoting is involved
    ['wkhtmltoimage', '-f', 'png', '--width', '0', 'table2.html', 'table2.png'])

table_wkhtmltoimage

Its appearance could be further tweaked with CSS, similarly to the other approach.
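
One way to apply such CSS is to embed a stylesheet in the HTML file before handing it to the converter. The sketch below is an illustration, not part of the answer's tested code: the helper name df_to_html_page, the CSS rules, and the file name table2_styled.html are all assumptions.

```python
import pandas as pd

# Example rules only; adjust to taste
DEFAULT_CSS = """
table { border-collapse: collapse; font-family: sans-serif; }
td, th { border: 1px solid black; padding: 4px 8px; text-align: right; }
"""


def df_to_html_page(df, css=DEFAULT_CSS):
    """Wrap df.to_html() in a minimal HTML page with an embedded stylesheet."""
    return (
        "<!DOCTYPE html>\n<html><head><meta charset='utf-8'>"
        f"<style>{css}</style></head>\n"
        f"<body>{df.to_html()}</body></html>"
    )


df = pd.DataFrame({"A": [1, 2], "B": [3, 4]})  # hypothetical data
page = df_to_html_page(df)
with open("table2_styled.html", "w", encoding="utf-8") as fh:
    fh.write(page)
```

The resulting file can then be converted with the same wkhtmltoimage call as above, using table2_styled.html as the input.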


norok2
  • 25,683
  • 4
  • 73
  • 99
25

Although I am not sure if this is the result you expect, you can save your DataFrame in png by plotting the DataFrame with Seaborn Heatmap with annotations on, like this:

http://stanford.edu/~mwaskom/software/seaborn/generated/seaborn.heatmap.html#seaborn.heatmap

Example of Seaborn heatmap with annotations on

It works right away with a Pandas Dataframe. You can look at this example: Efficiently ploting a table in csv format using Python

You might want to change the colormap so it displays a white background only.

Hope this helps.

Edit: Here is a snippet that does this:

import matplotlib.colors  # import the submodule used below explicitly
import seaborn as sns

def save_df_as_image(df, path):
    # Set background to white
    norm = matplotlib.colors.Normalize(-1,1)
    colors = [[norm(-1.0), "white"],
            [norm( 1.0), "white"]]
    cmap = matplotlib.colors.LinearSegmentedColormap.from_list("", colors)
    # Make plot
    plot = sns.heatmap(df, annot=True, cmap=cmap, cbar=False)
    fig = plot.get_figure()
    fig.savefig(path)
Jacob Stern
  • 3,758
  • 3
  • 32
  • 54
jcdoming
  • 351
  • 3
  • 10
  • 1
    I need to read through those links, but I am not looking to plot the data. I simply want an image of the table, much like what you would see with `df.to_html()`. Some columns consist of strings for names and such – Shatnerz Feb 25 '16 at 20:22
  • Decent suggestion -- I forgot about heatmaps when thinking about this question – zthomas.nc May 31 '17 at 19:42
11

The solution of @bunji works for me, but the default options don't always give a good result. I added some useful parameters to tweak the appearance of the table.

import pandas as pd
import matplotlib.pyplot as plt
from pandas.plotting import table
import numpy as np

dates = pd.date_range('20130101',periods=6)
df = pd.DataFrame(np.random.randn(6,4),index=dates,columns=list('ABCD'))

df.index = [item.strftime('%Y-%m-%d') for item in df.index] # Format date

fig, ax = plt.subplots(figsize=(12, 2)) # set size frame
ax.xaxis.set_visible(False)  # hide the x axis
ax.yaxis.set_visible(False)  # hide the y axis
ax.set_frame_on(False)  # no visible frame; comment this out to see the frame while adjusting the size
tabla = table(ax, df, loc='upper right', colWidths=[0.17]*len(df.columns))  # where df is your data frame
tabla.auto_set_font_size(False) # disable automatic font sizing so the size can be set manually
tabla.set_fontsize(12) # if you increase the font size, increase colWidths as well
tabla.scale(1.2, 1.2) # scale the table (width, height)
plt.savefig('table.png', transparent=True)

The result: Table

Noob Geek
  • 409
  • 6
  • 20
jrovegno
  • 699
  • 5
  • 11
9

I had the same requirement for a project I am doing, but none of the answers was elegant enough for my needs. Here is something that finally helped me, and might be useful for this case:

from bokeh.io import export_png  # export_png needs selenium and a browser driver installed
from bokeh.models import ColumnDataSource, DataTable, TableColumn

def save_df_as_image(df, path):
    source = ColumnDataSource(df)
    # ColumnDataSource stores an unnamed index under the name "index"
    df_columns = [df.index.name or "index"]
    df_columns.extend(df.columns.values)
    columns_for_table=[]
    for column in df_columns:
        columns_for_table.append(TableColumn(field=column, title=column))

    data_table = DataTable(source=source, columns=columns_for_table,height_policy="auto",width_policy="auto",index_position=None)
    export_png(data_table, filename = path)

enter image description here

raghavsikaria
  • 867
  • 17
  • 30
9

There is a Python library called df2img available at https://pypi.org/project/df2img/ (disclaimer: I'm the author). It's a wrapper/convenience function using plotly as backend.

You can find the documentation at https://df2img.dev.

import pandas as pd

import df2img

df = pd.DataFrame(
    data=dict(
        float_col=[1.4, float("NaN"), 250, 24.65],
        str_col=("string1", "string2", float("NaN"), "string4"),
    ),
    index=["row1", "row2", "row3", "row4"],
)

Saving a pd.DataFrame as a .png-file can be done fairly quickly. You can apply formatting, such as background colors or alternating the row colors for better readability.

fig = df2img.plot_dataframe(
    df,
    title=dict(
        font_color="darkred",
        font_family="Times New Roman",
        font_size=16,
        text="This is a title",
    ),
    tbl_header=dict(
        align="right",
        fill_color="blue",
        font_color="white",
        font_size=10,
        line_color="darkslategray",
    ),
    tbl_cells=dict(
        align="right",
        line_color="darkslategray",
    ),
    row_fill_color=("#ffffff", "#d7d8d6"),
    fig_size=(300, 160),
)

df2img.save_dataframe(fig=fig, filename="plot.png")

pd.DataFrame png file

Andi
  • 3,196
  • 2
  • 24
  • 44
4

If you're okay with the formatting as it appears when you call the DataFrame in your coding environment, then the absolute easiest way is to just use print screen and crop the image using basic image editing software.

Here's how it turned out for me using Jupyter Notebook, and Pinta Image Editor (Ubuntu freeware).

Tom Dixon
  • 96
  • 1
  • 9
  • This is honestly the most elegant solution. The styled tables look quite nice, and the to_html() solution, while very simple, doesn't maintain the style. – June Skeeter Mar 23 '18 at 11:21
  • 1
    pretty hard to automate a lot of this though – baxx Dec 09 '20 at 14:19
3

The following would need extensive customisation to format the table correctly, but the bones of it work:

import numpy as np
from PIL import Image, ImageDraw, ImageFont
import pandas as pd

df = pd.DataFrame({ 'A' : 1.,
                     'B' : pd.Series(1,index=list(range(4)),dtype='float32'),
                     'C' : np.array([3] * 4,dtype='int32'),
                     'D' : pd.Categorical(["test","train","test","train"]),
                     'E' : 'foo' })


class DrawTable():
    def __init__(self,_df):
        self.rows,self.cols = _df.shape
        img_size = (300,200)
        self.border = 50
        self.bg_col = (255,255,255)
        self.div_w = 1
        self.div_col = (128,128,128)
        self.head_w = 2
        self.head_col = (0,0,0)
        self.image = Image.new("RGBA", img_size,self.bg_col)
        self.draw = ImageDraw.Draw(self.image)
        self.draw_grid()
        self.populate(_df)
        self.image.show()
    def draw_grid(self):
        width,height = self.image.size
        row_step = (height-self.border*2)/(self.rows)
        col_step = (width-self.border*2)/(self.cols)
        for row in range(1,self.rows+1):
            self.draw.line((self.border-row_step//2,self.border+row_step*row,width-self.border,self.border+row_step*row),fill=self.div_col,width=self.div_w)
            for col in range(1,self.cols+1):
                self.draw.line((self.border+col_step*col,self.border-col_step//2,self.border+col_step*col,height-self.border),fill=self.div_col,width=self.div_w)
        self.draw.line((self.border-row_step//2,self.border,width-self.border,self.border),fill=self.head_col,width=self.head_w)
        self.draw.line((self.border,self.border-col_step//2,self.border,height-self.border),fill=self.head_col,width=self.head_w)
        self.row_step = row_step
        self.col_step = col_step
    def populate(self,_df2):
        font = ImageFont.load_default().font
        for row in range(self.rows):
            print(_df2.iloc[row,0])
            self.draw.text((self.border-self.row_step//2,self.border+self.row_step*row),str(_df2.index[row]),font=font,fill=(0,0,128))
            for col in range(self.cols):
                text = str(_df2.iloc[row,col])
                text_w, text_h = font.getsize(text)
                x_pos = self.border+self.col_step*(col+1)-text_w
                y_pos = self.border+self.row_step*row
                self.draw.text((x_pos,y_pos),text,font=font,fill=(0,0,128))
        for col in range(self.cols):
            text = str(_df2.columns[col])
            text_w, text_h = font.getsize(text)
            x_pos = self.border+self.col_step*(col+1)-text_w
            y_pos = self.border - self.row_step//2
            self.draw.text((x_pos,y_pos),text,font=font,fill=(0,0,128))
    def save(self,filename):
        try:
            self.image.save(filename,mode='RGBA')
            print(filename," Saved.")
        except (OSError, ValueError) as err:  # avoid a bare except that hides unrelated errors
            print("Error saving:",filename,err)




table1 = DrawTable(df)
table1.save('C:/Users/user/Pictures/table1.png')

The output looks like this:

enter image description here
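
Newer Pillow releases (10 and later) no longer provide font.getsize, so the class above may need adapting there. The sketch below is a shorter alternative written under that assumption, not a drop-in replacement: the name draw_table, the padding, and the row height are invented for the example, and text is measured with ImageDraw.textlength using Pillow's default font.

```python
import pandas as pd
from PIL import Image, ImageDraw


def draw_table(df, pad=6, row_h=18):
    """Render df, including its header and index, to a PIL image."""
    # Grid of strings: header row first, index labels in the first column
    header = [""] + [str(c) for c in df.columns]
    body = [[str(idx)] + [str(v) for v in row]
            for idx, row in zip(df.index, df.itertuples(index=False))]
    grid = [header] + body

    # A scratch canvas is only used to measure text widths
    scratch = ImageDraw.Draw(Image.new("RGB", (1, 1)))

    def text_width(text):
        return int(scratch.textlength(text)) if text else 0

    # Each column is as wide as its widest cell plus padding on both sides
    widths = [max(text_width(row[c]) for row in grid) + 2 * pad
              for c in range(len(header))]

    img = Image.new("RGB", (sum(widths) + 1, row_h * len(grid) + 1), "white")
    draw = ImageDraw.Draw(img)
    for r, cells in enumerate(grid):
        x, y = 0, r * row_h
        for width, text in zip(widths, cells):
            draw.rectangle((x, y, x + width, y + row_h), outline="gray")
            # Right-align the text inside its cell
            draw.text((x + width - pad - text_width(text), y + 3), text, fill="black")
            x += width
    return img


df = pd.DataFrame({"A": [1, 2], "B": ["x", "yy"]}, index=["r1", "r2"])  # hypothetical data
img = draw_table(df)
img.save("table1_sketch.png")
```

Because the image size is derived from the measured text, no fixed img_size has to be guessed.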

Colin Dickie
  • 910
  • 4
  • 9
3

As jcdoming suggested, use Seaborn heatmap():

import seaborn as sns
import matplotlib.pyplot as plt

fig = plt.figure(facecolor='w', edgecolor='k')
sns.heatmap(df.head(), annot=True, cmap='viridis', cbar=False)
plt.savefig('DataFrame.png')

DataFrame as a heat map

Alon Lavian
  • 1,149
  • 13
  • 14
3

The easiest and fastest way to convert a pandas DataFrame into a PNG image with the Anaconda Spyder IDE: just double-click the dataframe in the Variable Explorer, and the IDE table will appear, nicely packaged with automatic formatting and a color scheme. Then use a snipping tool to capture the table and save it as a PNG for use in your reports:

2020 Blue Chip Ratio

This saves me lots of time, and is still elegant and professional.

Carlo Carandang
  • 187
  • 1
  • 8
1

People who use Plotly for data visualization:

  • You can easily convert the dataframe to go.Table.

  • You can save the dataframe with columns names.

  • You can format the dataframe through go.Table.

  • You can save the dataframe as pdf, jpg, or png with different scales and high resolution.

     import plotly.express as px
     import plotly.graph_objects as go
    
     df = px.data.medals_long()
    
     fig = go.Figure(data=[
                         go.Table(
                            header=dict(values=list(df.columns),align='center'),
                            cells=dict(values=df.values.transpose(),
                                       fill_color = [["white","lightgrey"]*df.shape[0]],
                                       align='center'
                                      )
                                )
                           ])
     fig.write_image('image.png',scale=6)
    

Note: the image is saved in the current working directory, i.e. the directory the Python script is run from.

Output:

enter image description here

Hamzah
  • 8,175
  • 3
  • 19
  • 43
-1

I really like the way Jupyter notebooks format the DataFrame and this library exports it in the same format:

import dataframe_image as dfi
dfi.export(df, "df.png")

There is also a dpi argument in case you want to increase the quality of the image. I'd recommend 300 for OK quality, 600 for excellent, and 1200 for perfect; more than that is probably too much.

import dataframe_image as dfi
dfi.export(df, "df.png", dpi = 600)
Gustavo
  • 3
  • 3