149

I have two pandas dataframes and I would like to display them in a Jupyter notebook.

Doing something like:

display(df1)
display(df2)

This shows them one below the other.

I would like to have the second dataframe to the right of the first one. There is a similar question, but it looks like the person there is satisfied with either merging them into one dataframe or showing the difference between them.

This will not work for me. In my case the dataframes can represent completely different (non-comparable) elements, and their sizes can differ. Thus my main goal is to save space.

Community
Salvador Dali

13 Answers

196

I have ended up writing a function that can do this: [update: added titles based on suggestions (thnx @Antony_Hatchkins et al.)]

from IPython.display import display_html
from itertools import chain,cycle
def display_side_by_side(*args,titles=cycle([''])):
    html_str=''
    # once the given titles run out, the remaining tables get a line break as a placeholder title
    for df,title in zip(args, chain(titles,cycle(['<br>'])) ):
        html_str+='<th style="text-align:center"><td style="vertical-align:top">'
        html_str+=f'<h2 style="text-align: center;">{title}</h2>'
        # render each table inline so that the next one is placed to its right
        html_str+=df.to_html().replace('<table','<table style="display:inline"')
        html_str+='</td></th>'
    display_html(html_str,raw=True)
  

Example usage:

import numpy as np
import pandas as pd

df1 = pd.DataFrame(np.arange(12).reshape((3,4)),columns=['A','B','C','D',])
df2 = pd.DataFrame(np.arange(16).reshape((4,4)),columns=['A','B','C','D',])
display_side_by_side(df1,df2,df1, titles=['Foo','Foo Bar']) # the 3rd title is left empty


ntg
  • This is really great, thanks. How easy or otherwise would it be to add the data frame name above each output, do you think? – Ricky McMaster Aug 17 '17 at 13:06
  • Thanks for your answer, I've [added headers](https://stackoverflow.com/a/50908230/237105) to it in a manner similar to what you've described in your last comment. – Antony Hatchkins Jun 18 '18 at 11:37
  • Amazing answer. This is what I'm looking for as well. I'm still learning my way around it, so I want to know: 1) Why did you use `*args` instead of just `df`? Is it because you can have multiple input with `*args`? 2) Which part of your function makes the 2nd and subsequent df add to the right of the first one instead of below it? Is it the `'table style="display:inline"'` part? Thanks again – Bowen Liu Sep 14 '18 at 18:52
  • 1) `*args` allows me to take any number of dataframes. Check https://stackoverflow.com/questions/3394835/args-and-kwargs for details... 2) We convert to html and use its notation... (You have to know a bit of html, but try to do `print df1.to_html()` , then save the result to an .html file and load it to a browser...) We then directly change in HTML what is displayed... – ntg Sep 17 '18 at 08:53
  • 2
    Thanks for your great solution! If you want to style your dataframes before displaying them, the input will be `Styler`s, not `DataFrame`s. In this case, use `html_str+=df.render()` instead of `html_str+=df.to_html()`. – Martin Becker Dec 03 '19 at 19:51
  • This is better than the accepted one from my point of view because it doesn't affect other cells. – Max Wong Mar 25 '20 at 17:25
  • 2
    For some reason this does not work in JupyterLab 3.0.11. Maybe because JupyterLab uses a different client-side rendering engine from Jupyter Classic NB? I just tried running the exact same code above in Jupyter Classic NB, which I launched from within JupyterLab's v3.0.11 Help Menu, just to ensure all other variables are the same. It displays perfectly as shown above. I am running iPython 7.25.0 on Python v3.7.10. Interesting! I don't totally understand yet why JupyterLab fails to render the HTML. Anyone know why? – Rich Lysakowski PhD Jul 20 '21 at 05:41
  • 1
    @RichLysakowskiPhD I cannot say why, but this variation without titles works in JupyterLab (v3.1.11 tried): https://newbedev.com/jupyter-notebook-display-two-pandas-tables-side-by-side – Wayne Sep 15 '21 at 19:28
  • 1
    A tweak to center titles over their dataframe: html_str+=f' {title} ' – jgreve Jul 21 '22 at 16:14
  • The more up-to-date variant in this other answer works better for me! https://stackoverflow.com/a/68450201/6179774 – Christabella Irwanto May 15 '23 at 09:52
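
To make the explanation in the comments above concrete, here is a minimal sketch of what `to_html()` returns and what the inline-style replacement does to it. The small dataframe is made up for illustration, and only the opening `<table` tag is patched here, which is an assumption about what the replacement is meant to achieve.

```python
import pandas as pd

# Illustrative data, not taken from the question
df1 = pd.DataFrame({'A': [1, 2], 'B': [3, 4]})

# to_html() returns a plain string that contains a <table> element
html = df1.to_html()
print(html.splitlines()[0])

# Adding display:inline to the opening tag lets the next table
# sit to the right of this one instead of starting a new block
inline_html = html.replace('<table', '<table style="display:inline"', 1)
print(inline_html.splitlines()[0])
```

Saving `inline_html` to an .html file and opening it in a browser is a quick way to check the result outside the notebook.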
107

You could override the CSS of the output code. It uses flex-direction: column by default. Try changing it to row instead. Here's an example:

import pandas as pd
import numpy as np
from IPython.display import display, HTML

CSS = """
.output {
    flex-direction: row;
}
"""

HTML('<style>{}</style>'.format(CSS))


You could, of course, customize the CSS further as you wish.

If you wish to target only one cell's output, try using the :nth-child() selector. For example, this code will modify the CSS of the output of only the 5th cell in the notebook:

CSS = """
div.cell:nth-child(5) .output {
    flex-direction: row;
}
"""
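
The two CSS variants above differ only in their selector, so they can be produced by one small helper. This is just a sketch: `row_layout_css` is a hypothetical name, and the `.output` and `div.cell` class names are the ones used in this answer, which may not match every Jupyter front end.

```python
def row_layout_css(cell_number=None):
    """Return a <style> element that lays cell output out in a row.

    cell_number is the 1-based position of the cell to target;
    None targets the output of every cell.
    """
    if cell_number is None:
        selector = '.output'
    else:
        selector = f'div.cell:nth-child({cell_number}) .output'
    return f'<style>{selector} {{ flex-direction: row; }}</style>'

style_all = row_layout_css()
style_fifth = row_layout_css(5)
print(style_fifth)
```

In a notebook the returned string would be passed to `display(HTML(...))` from `IPython.display`.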
zarak
  • What if I want to give both of them separate title? Tried doing it, was not able to do it – Neeraj Komuravalli Jan 24 '17 at 14:56
  • 7
    This solution affects all the cells, How I can do this for one cell only? – jrovegno Jan 28 '17 at 01:55
  • @NeerajKomuravalli It would probably be best to ask this as a new question. I'm not sure of an easy way to do this off the top of my head. – zarak Jan 28 '17 at 13:50
  • 3
    @jrovegno I updated my answer to include the information you requested. – zarak Jan 28 '17 at 13:52
  • Thanx (+1)! What if I want only the current cell? – ntg Jul 04 '17 at 13:13
  • 2
    @ntg You need to ensure that the line `HTML('<style>{}</style>'.format(CSS))` is the last line in the cell (and don't forget to use the nth-child selector). However, this may cause issues with the formatting, so your solution is better. (+1) – zarak Jul 06 '17 at 16:13
  • 2
    @zarak Thanx for the kind words :) In your solution, you can have `display(HTML('<style>{}</style>'.format(CSS)))` instead of `HTML('<style>{}</style>'.format(CSS))`. Then it can be at any place. I still had the problem with the nth cell though (meaning, if i copy paste, n might change) – ntg Jul 07 '17 at 08:23
  • 4
    `HTML('')` for simplicity sake – Thomas Matthew Jun 20 '18 at 22:25
  • How do you do this (merge two DataFrames) with density plots? – Shounak Ray Apr 04 '19 at 07:37
72

Starting from pandas 0.17.1, the visualization of DataFrames can be modified directly with pandas styling methods.

To display two DataFrames side by side, use set_table_attributes with the argument "style='display:inline'", as suggested in ntg's answer. This returns two Styler objects. To display the aligned dataframes, just pass their joined HTML representation to the display_html function from IPython.

With this method it is also easier to add other styling options. Here's how to add a caption, as requested here:

import numpy as np
import pandas as pd   
from IPython.display import display_html 

df1 = pd.DataFrame(np.arange(12).reshape((3,4)),columns=['A','B','C','D',])
df2 = pd.DataFrame(np.arange(16).reshape((4,4)),columns=['A','B','C','D',])

df1_styler = df1.style.set_table_attributes("style='display:inline'").set_caption('Caption table 1')
df2_styler = df2.style.set_table_attributes("style='display:inline'").set_caption('Caption table 2')

display_html(df1_styler._repr_html_()+df2_styler._repr_html_(), raw=True)


gibbone
  • 1
    Hadn't noticed, that seems quite nice and can probably be helpful in more situations for added e.g. colour etc. (+1) – ntg Mar 16 '21 at 22:43
  • 5
    @gibbone is there a way to specify spacing between the tables? – a11 Aug 15 '21 at 20:15
36

Combining the approaches of gibbone (to set styles and captions) and stevi (adding space), I made my own version of the function, which outputs pandas dataframes as tables side by side:

from IPython.display import display, HTML

def display_side_by_side(dfs:list, captions:list):
    """Display tables side by side to save vertical space
    Input:
        dfs: list of pandas.DataFrame
        captions: list of table captions
    """
    output = ""
    # pair each caption with its dataframe (zip keeps duplicates, unlike a dict keyed by caption)
    for caption, df in zip(captions, dfs):
        output += df.style.set_table_attributes("style='display:inline'").set_caption(caption)._repr_html_()
        output += "\xa0\xa0\xa0"
    display(HTML(output))

Usage:

display_side_by_side([df1, df2, df3], ['caption1', 'caption2', 'caption3'])


Anton Golubev
14

My solution just builds a table in HTML without any CSS hacks and outputs it:

import pandas as pd
from IPython.display import display,HTML

def multi_column_df_display(list_dfs, cols=3):
    html_table = "<table style='width:100%; border:0px'>{content}</table>"
    html_row = "<tr style='border:0px'>{content}</tr>"
    html_cell = "<td style='width:{width}%;vertical-align:top;border:0px'>{{content}}</td>"
    html_cell = html_cell.format(width=100/cols)

    cells = [ html_cell.format(content=df.to_html()) for df in list_dfs ]
    cells += ((-len(list_dfs)) % cols) * [html_cell.format(content="")] # pad the last row only
    rows = [ html_row.format(content="".join(cells[i:i+cols])) for i in range(0,len(cells),cols)]
    display(HTML(html_table.format(content="".join(rows))))

list_dfs = []
list_dfs.append( pd.DataFrame(2*[{"x":"hello"}]) )
list_dfs.append( pd.DataFrame(2*[{"x":"world"}]) )
multi_column_df_display(2*list_dfs)
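
The padding step in the function above fills the last row with empty cells so that every row has `cols` entries. A minimal sketch of that chunk-and-pad logic on plain strings, with `chunk_with_padding` as a hypothetical helper name:

```python
def chunk_with_padding(items, cols, filler=''):
    """Split items into rows of length cols, padding the last row with filler."""
    # Number of cells missing from the last row (0 when it is already full)
    pad = (-len(items)) % cols
    padded = list(items) + [filler] * pad
    return [padded[i:i + cols] for i in range(0, len(padded), cols)]

print(chunk_with_padding(['a', 'b', 'c', 'd'], 3))  # [['a', 'b', 'c'], ['d', '', '']]
print(chunk_with_padding(['a', 'b', 'c'], 3))       # [['a', 'b', 'c']]
```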


David Medenjak
14

Here's another variation of the display_side_by_side() function introduced by @Anton Golubev, which combines the approaches of gibbone (to set styles and captions) and stevi (adding space). I added an extra argument to change the spacing between tables at run time.

from IPython.display import display, HTML

def display_side_by_side(dfs:list, captions:list, tablespacing=5):
    """Display tables side by side to save vertical space
    Input:
        dfs: list of pandas.DataFrame
        captions: list of table captions
    """
    output = ""
    for (caption, df) in zip(captions, dfs):
        output += df.style.set_table_attributes("style='display:inline'").set_caption(caption)._repr_html_()
        output += tablespacing * "\xa0"
    display(HTML(output))
    
display_side_by_side([df1, df2, df3], ['caption1', 'caption2', 'caption3'])

The tablespacing argument (default 5) sets the number of non-breaking spaces inserted between tables, i.e. the horizontal spacing between them.

Aristide
Rich Lysakowski PhD
  • Very convenient, thanks. – Aristide Nov 08 '21 at 08:19
  • 1
    Big fan of this one, any idea why the tables refuse to be top aligned in vscode? They look great if they have the same number of rows, but end up center vertical aligned if there are different number of rows. – pwb2103 Oct 30 '22 at 00:53
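
On the spacing and top-alignment questions raised in the comments, one possible alternative is to wrap the tables in a CSS flex container instead of relying on inline tables and non-breaking spaces. This is only a sketch, not one of the answers above: `side_by_side_html` and `gap_px` are hypothetical names, and it has not been checked in VS Code.

```python
import pandas as pd

def side_by_side_html(dfs, captions, gap_px=20):
    """Build one HTML string with the tables laid out in a flex row."""
    blocks = []
    for df, caption in zip(dfs, captions):
        blocks.append(
            f'<div><p style="text-align:center"><b>{caption}</b></p>'
            f'{df.to_html()}</div>'
        )
    # align-items:flex-start keeps tables of different heights top-aligned;
    # gap sets the horizontal space between them
    return (f'<div style="display:flex; gap:{gap_px}px; align-items:flex-start">'
            + ''.join(blocks) + '</div>')

# Illustrative data with different numbers of rows
df1 = pd.DataFrame({'A': [1, 2, 3]})
df2 = pd.DataFrame({'B': [4, 5]})
html = side_by_side_html([df1, df2], ['caption1', 'caption2'])
```

In a notebook the string would be shown with `display_html(html, raw=True)` from `IPython.display`.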
12

Here is Jake Vanderplas' solution I came across just the other day:

import numpy as np
import pandas as pd

class display(object):
    """Display HTML representation of multiple objects"""
    template = """<div style="float: left; padding: 10px;">
    <p style='font-family:"Courier New", Courier, monospace'>{0}</p>{1}
    </div>"""

    def __init__(self, *args):
        self.args = args

    def _repr_html_(self):
        return '\n'.join(self.template.format(a, eval(a)._repr_html_())
                         for a in self.args)

    def __repr__(self):
        return '\n\n'.join(a + '\n' + repr(eval(a))
                           for a in self.args)

Credit: https://github.com/jakevdp/PythonDataScienceHandbook/blob/master/notebooks/03.08-Aggregation-and-Grouping.ipynb

Private
  • 1
    could you please explain this answer. Jake VanderPlas has not explained it on his website. This is the only solution that prints dataset name on the top. – Gaurav Singhal May 16 '18 at 06:58
  • What do you want to know? – Private May 28 '18 at 19:37
  • May be a description of all the functions/how do they work, how they are called an so on... so that newbie python programmers can understand it properly. – Gaurav Singhal May 29 '18 at 11:21
  • When using Python in an interactive way and it wants to display the end result of a line you entered. it calls the `__repr__()` method, and writes out the string returned. To still work with that, this `display` object has a `__repr__()` method that simply outputs the `repr()` of each of its objects, separated by newlines. To support rendering results in HTML, Jupyter Notebook has a similar method, `_repr_html_()`, that it will prefer and call first (if available). This object defines that method too, with a snippet of HTML that will display the `_repr_html_()` of each object side by side. – Christian Hudon Jan 18 '22 at 17:39
  • +1 This was the only solution that worked for me, but I have slightly changed it to use `**kwargs` instead of using `*args` and `eval`uating the inputs. https://gist.github.com/net-raider/c9986ffa84cbfa106f91be3987953c83 – Net_Raider Oct 14 '22 at 09:40
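
The comments above describe the `__repr__` / `_repr_html_` protocol and suggest passing objects by keyword instead of evaluating their names. A minimal sketch of that idea follows; `display_named` is a hypothetical name (it is not the code from the linked gist), and `FakeFrame` is a stand-in for a dataframe so that the example runs without pandas.

```python
class display_named:
    """Show several objects side by side, each labelled by its keyword name."""
    template = ('<div style="float: left; padding: 10px;">'
                '<p style="font-family:monospace">{0}</p>{1}</div>')

    def __init__(self, **named_objects):
        self.named_objects = named_objects

    def _repr_html_(self):
        # Jupyter prefers this method when it exists
        return '\n'.join(self.template.format(name, obj._repr_html_())
                         for name, obj in self.named_objects.items())

    def __repr__(self):
        # Plain-text fallback used by a normal Python prompt
        return '\n\n'.join(name + '\n' + repr(obj)
                           for name, obj in self.named_objects.items())


class FakeFrame:
    """Stand-in for a DataFrame: anything with _repr_html_ works."""
    def _repr_html_(self):
        return '<table><tr><td>1</td></tr></table>'

    def __repr__(self):
        return 'FakeFrame()'


shown = display_named(left=FakeFrame(), right=FakeFrame())
print(repr(shown))
```

With real data the call would look like `display_named(df1=df1, df2=df2)` as the last expression of a cell.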
12

This adds (optional) headers, index and Series support to @ntg's answer:

import pandas as pd
from IPython.display import display_html

def mydisplay(dfs, names=[], index=False):
    def to_df(x):
        if isinstance(x, pd.Series):
            return pd.DataFrame(x)
        else:
            return x
    html_str = ''
    if names:
        html_str += ('<tr>' + 
                     ''.join(f'<td style="text-align:center">{name}</td>' for name in names) + 
                     '</tr>')
    html_str += ('<tr>' + 
                 ''.join(f'<td style="vertical-align:top"> {to_df(df).to_html(index=index)}</td>' 
                         for df in dfs) + 
                 '</tr>')
    html_str = f'<table>{html_str}</table>'
    html_str = html_str.replace('<table','<table style="display:inline"')
    display_html(html_str, raw=True)


Antony Hatchkins
  • This seems very useful, but gives me a problem. For `mydisplay((df1,df2))` only gives `df.to_html(index=False) df.to_html(index=False)` instead of the dataframe contents. Also, there is extra '}' sign at f'string'. –  Nov 23 '18 at 15:19
  • Somewhat unrelated but is it possible to modify your function so that the code for the cell output is hidden? – alpenmilch411 Dec 20 '18 at 00:04
  • 1
    @alpenmilch411 see "Hide Input" extension – Antony Hatchkins Dec 20 '18 at 03:59
  • Any idea how to add a 'max_rows' to this? – Tickon Jan 14 '19 at 14:48
  • This too loses multi indices, when multi indexed data frames are used. – Parthiban Rajendran Sep 21 '20 at 08:05
  • Your code doesn't actually run. I get an error "NameError: name 'bases' is not defined". The function is defined but how do you use it? Could you please improve your answer to explain how it works with a calling example, so your answer runs stand-alone? Thank you. – Rich Lysakowski PhD Jul 20 '21 at 06:10
  • @RichLysakowskiPhD The function `mydisplay` does works pretty well. `bases` is a dictionary where the key is the name of the dataframe, and the value is the dataframe itself. I thought the example is trivial enough to omit the explanation. – Antony Hatchkins Jul 28 '21 at 09:12
8

@zarak's code is pretty small, but it affects the layout of the whole notebook. The other options are a bit messy for me.

I've added some clear CSS to this answer that affects only the current cell's output. You can also add anything below or above the dataframes.

from ipywidgets import widgets, Layout
from IPython import display
import pandas as pd
import numpy as np

# sample data
df1 = pd.DataFrame(np.random.randn(8, 3))
df2 = pd.DataFrame(np.random.randn(8, 3))

# create output widgets
widget1 = widgets.Output()
widget2 = widgets.Output()

# render in output widgets
with widget1:
    display.display(df1.style.set_caption('First dataframe'))
    df1.info()
with widget2:
    display.display(df2.style.set_caption('Second dataframe'))
    df2.info()


# add some CSS styles to distribute free space
box_layout = Layout(display='flex',
                    flex_flow='row',
                    justify_content='space-around',
                    width='auto'
                   )
    
# create Horizontal Box container
hbox = widgets.HBox([widget1, widget2], layout=box_layout)

# render hbox
hbox


MSorro
  • 1
    This is great. I love the option to provide additional metadata about the dataframe. – Rich Lysakowski PhD Aug 19 '21 at 04:50
  • 1
    this is pure genius because it works with matplotlib objects as well: I'm using it to print the pandas table on the left and the plot on the right! – erickfis Jan 31 '22 at 19:42
  • I love this! This answer doesn't require making changes to the content, so you can just pass in your weirdest dataframes as-is – tnwei Aug 08 '23 at 03:18
4

I ended up using HBox:

import ipywidgets as ipyw

def get_html_table(target_df, title):
    df_style = target_df.style.set_table_attributes("style='border:2px solid;font-size:10px;margin:10px'").set_caption(title)
    return df_style._repr_html_()

df_2_html_table = get_html_table(df_2, 'Data from Google Sheet')
df_4_html_table = get_html_table(df_4, 'Data from Jira')
ipyw.HBox((ipyw.HTML(df_2_html_table),ipyw.HTML(df_4_html_table)))
Dinis Cruz
4

Gibbone's answer worked for me! If you want extra space between the tables, add "\xa0\xa0\xa0" between the two HTML strings in the last line of the code he proposed:

display_html(df1_styler._repr_html_()+"\xa0\xa0\xa0"+df2_styler._repr_html_(), raw=True)
crystal
4

I decided to add some extra functionality to Yasin's elegant answer, so that one can choose both the number of columns and rows; any extra dfs are then added at the bottom. Additionally, one can choose the order in which the grid is filled (simply set the fill keyword to 'cols' or 'rows' as needed).

import pandas as pd
from IPython.display import display,HTML

def grid_df_display(list_dfs, rows = 2, cols=3, fill = 'cols'):
    html_table = "<table style='width:100%; border:0px'>{content}</table>"
    html_row = "<tr style='border:0px'>{content}</tr>"
    html_cell = "<td style='width:{width}%;vertical-align:top;border:0px'>{{content}}</td>"
    html_cell = html_cell.format(width=100/cols)

    cells = [ html_cell.format(content=df.to_html()) for df in list_dfs[:rows*cols] ]
    cells += (rows*cols - len(cells)) * [html_cell.format(content="")] # pad up to a full grid

    if fill == 'rows': #fill in rows first (first row: 0,1,2,... col-1)
        grid = [ html_row.format(content="".join(cells[i:i+cols])) for i in range(0,rows*cols,cols)]

    if fill == 'cols': #fill columns first (first column: 0,1,2,..., rows-1)
        grid = [ html_row.format(content="".join(cells[i:rows*cols:rows])) for i in range(0,rows)]

    display(HTML(html_table.format(content="".join(grid))))

    # display any extra dfs below the grid
    for df in list_dfs[rows*cols:]:
        display(df)

list_dfs = []
list_dfs.extend((pd.DataFrame(2*[{"x":"hello"}]), 
             pd.DataFrame(2*[{"x":"world"}]), 
             pd.DataFrame(2*[{"x":"gdbye"}])))

grid_df_display(3*list_dfs)
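
The two fill orders come down to two different slicing patterns over the flat list of cells. A small sketch with integers in place of the HTML cells, using the same slice expressions as the function above:

```python
rows, cols = 2, 3
cells = list(range(rows * cols))  # stand-ins for the rendered cells

# fill='rows': consecutive cells go into the same row
by_rows = [cells[i:i + cols] for i in range(0, rows * cols, cols)]

# fill='cols': consecutive cells go down the same column
by_cols = [cells[i:rows * cols:rows] for i in range(rows)]

print(by_rows)  # [[0, 1, 2], [3, 4, 5]]
print(by_cols)  # [[0, 2, 4], [1, 3, 5]]
```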


3

This is an extension of Antony's answer. If you want to limit the number of tables shown per row, use the maxTables variable.

from IPython.display import display_html

def mydisplay(dfs, names=[]):

    count = 0
    maxTables = 6  # maximum number of tables per row

    if not names:
        names = [x for x in range(len(dfs))]

    html_str = ''
    html_th = ''
    html_td = ''

    for df, name in zip(dfs, names):
        if count == maxTables:
            # the current row is full: write it out and start a new one
            html_str += f'<tr>{html_th}</tr><tr>{html_td}</tr>'
            html_th = ''
            html_td = ''
            count = 0
        html_th += f'<th style="text-align:center">{name}</th>'
        html_td += f'<td style="vertical-align:top"> {df.to_html(index=False)}</td>'
        count += 1

    if count != 0:
        # write out the last, possibly partial, row
        html_str += f'<tr>{html_th}</tr><tr>{html_td}</tr>'

    html_str = f'<table>{html_str}</table>'
    html_str = html_str.replace('<table','<table style="display:inline"')
    display_html(html_str, raw=True)
machnic
Arzanico