Pretty Printing a pandas dataframe

Question

How can I print a pandas dataframe as a nice text-based table, like the following?

+------------+---------+-------------+
| column_one | col_two |   column_3  |
+------------+---------+-------------+
|          0 |  0.0001 | ABCD        |
|          1 |  1e-005 | ABCD        |
|          2 |  1e-006 | long string |
|          3 |  1e-007 | ABCD        |
+------------+---------+-------------+

Romain · Answer 1 · 2022-01-04T21:05:56.140

339

I've just found a great tool for this need: it is called tabulate.

It prints tabular data and works with DataFrame.

from tabulate import tabulate
import pandas as pd

df = pd.DataFrame({'col_two' : [0.0001, 1e-005 , 1e-006, 1e-007],
                   'column_3' : ['ABCD', 'ABCD', 'long string', 'ABCD']})

print(tabulate(df, headers='keys', tablefmt='psql'))

+----+-----------+-------------+
|    |   col_two | column_3    |
|----+-----------+-------------|
|  0 |    0.0001 | ABCD        |
|  1 |    1e-05  | ABCD        |
|  2 |    1e-06  | long string |
|  3 |    1e-07  | ABCD        |
+----+-----------+-------------+

Note:

To suppress row indices for all types of data, pass showindex="never" or showindex=False.

edited Jan 04 '22 at 21:05

answered Aug 07 '15 at 19:30

Romain

6

If you do not have access to the bleeding edge, you can do `tabulate([list(row) for row in df.values], headers=list(df.columns))` to get rid of the index – Pedro M Duarte Sep 25 '15 at 21:39
2

Doesn't work very well when you have hierarchies in row index and columns. – Siddharth Jan 11 '17 at 06:20
Make sure you do `print(tabulate(df, **kwargs))` and not simply `tabulate(df, **kwargs)`; the latter will show all new lines `\n`.... – Dror Sep 13 '17 at 07:12
7

To suppress the left index column one may want to also add `showindex=False` – Arthur Nov 09 '17 at 15:56
I'd really love for `pandas` to bundle `tabulate` as an optional dependency and allow `df.to_tabular(*args, **kwargs)` – BallpointBen Feb 17 '21 at 03:38

cs95 · Answer 2 · 2023-05-11T18:56:01.687

117

pandas >= 1.0

If you want a built-in function to dump your data as GitHub-flavored Markdown, you now have one. Take a look at to_markdown:

import pandas as pd

df = pd.DataFrame({"A": [1, 2, 3], "B": [1, 2, 3]}, index=['a', 'a', 'b'])
print(df.to_markdown())
                                               
|    |   A |   B |
|:---|----:|----:|
| a  |   1 |   1 |
| a  |   2 |   2 |
| b  |   3 |   3 |

[Image: the table above as rendered on GitHub]

Note that to_markdown calls tabulate under the hood, so you will still need to have the tabulate package installed. But this means that to_markdown can support 20+ different table formats via keyword arguments that it passes through to tabulate. As an example, you can get the same output as Romain's answer using df.to_markdown(headers='keys', tablefmt='psql')

edited May 11 '23 at 18:56

answered Feb 13 '20 at 07:35

cs95

1

I used the `to_markdown` to emit markdown from my script, and piped that into `glow -` ([`github`](https://github.com/charmbracelet/glow)) to render the markdown in the terminal with nice results. ([Script here](https://github.com/seanbreckenridge/mint/tree/master/analyze)) – Sean Breckenridge Sep 16 '20 at 15:05
@SeanBreckenridge link is either broken or inaccessible from public. – cs95 Dec 13 '20 at 09:30
Ah, thanks for the ping; was moved to a different folder. Here's a [permalink](https://github.com/seanbreckenridge/mint/blob/576f80fb9e0e0368bf96514b76c1bdb400736fdb/budget/budget/analyze/summary.py#L31-L43) – Sean Breckenridge Dec 13 '20 at 09:51
1

With more arguments passed to `tabulate`, `to_markdown` actually supports 20+ table formats (https://github.com/astanin/python-tabulate#table-format) and many other keywords. – Edward Jan 28 '21 at 09:01

score 60 · Answer 3 · answered Apr 15 '19 at 16:32

If you are in Jupyter notebook, you could run the following code to interactively display the dataframe in a well formatted table.

This answer builds on the to_html('temp.html') answer in this thread, but instead of creating a file it displays the well-formatted table directly in the notebook:

from IPython.display import display, HTML

display(HTML(df.to_html()))

Credit for this code due to example at: Show DataFrame as table in iPython Notebook

answered Apr 15 '19 at 16:32

Mark Andersen

3

better than using tabulate. – hzitoun Jun 14 '22 at 17:42
7

For me even just `display(df)` looks good. – Gabriele Jul 13 '22 at 08:38
Marvellous, I really like it in this way. It does not only print nicely, but also shows all columns and rows... great... – Memin Aug 15 '22 at 00:48
Don't know why, but this gives me ``. – akki Jan 16 '23 at 12:13
from IPython.display import display; display(df) works well with huge columns as well. – Dhvani Shah Apr 25 '23 at 04:21
This solution doesn't work for me. But the link you shared for the credits it's what I was looking for. Thank you. – AndreP Jun 28 '23 at 12:56

score 50 · Answer 4 · edited Jan 14 '20 at 01:02

A simple approach is to output HTML, which pandas does out of the box:

df.to_html('temp.html')
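As a hedged sketch of the same idea: when no file name is given, to_html() returns the markup as a string, so several frames can be joined into one page. The frames and the html_page name below are made up for illustration.

```python
import pandas as pd

df1 = pd.DataFrame({"A": [1, 2], "B": ["x", "y"]})
df2 = pd.DataFrame({"C": [3.5, 4.5]})

# Without a path argument, to_html() returns the HTML as a string.
html_page = (
    "<html><body>"
    + df1.to_html()
    + "<br>"
    + df2.to_html(index=False)  # index=False omits the row labels
    + "</body></html>"
)
print(html_page[:80])
```

The resulting string can then be written to a file and opened in a browser.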

edited Jan 14 '20 at 01:02

Jon-Eric

answered Aug 11 '18 at 08:16

ErichBSchulz

1

This is such an underrated response. No additional packages required. And in my case, I wasn't able to get tabulate to print a pivot table with two indexes the way I needed. df.to_html - no problem at all. And if you need multiple dataframes to go into the same html, just do df1.to_html() + df2.to_html() + so on... – Dr Phil Jun 15 '23 at 02:53

score 15 · Answer 5 · answered Aug 30 '13 at 08:43

You can use prettytable to render the table as text. The trick is to convert the data_frame to an in-memory csv file and have prettytable read it. Here's the code:

from io import StringIO  # Python 3; on Python 2 this was "from StringIO import StringIO"
import prettytable

output = StringIO()
data_frame.to_csv(output)
output.seek(0)
pt = prettytable.from_csv(output)
print(pt)

answered Aug 30 '13 at 08:43

Ofer

What version of pandas was this? – WAF Jan 29 '15 at 19:15
7

AFAIK, `prettytable` is largely considered abandonware. Shame, too, as it was a nice package. :( – dmn Oct 04 '16 at 15:13
@dmn so it's not maintained anymore? – muon Aug 29 '17 at 01:38
1

`prettytable` has not had a release since Apr 6, 2013. `tabulate` is its spiritual successor and has regular releases, the most recent being on Jan 24, 2019. – noddy Feb 05 '19 at 15:11
4

`prettytable` has been resurrected under maintainership of jazzband! Hurray! https://github.com/jazzband/prettytable – Nick Crews Nov 20 '21 at 01:22

score 14 · Answer 6 · answered Jul 11 '19 at 15:15

Following up on Mark's answer: if you're not using Jupyter, e.g. you want to do some quick testing on the console, you can use the DataFrame.to_string method, which works from at least pandas 0.12 (2013) onwards.

import pandas as pd

matrix = [(1, 23, 45), (789, 1, 23), (45, 678, 90)]
df = pd.DataFrame(matrix, columns=list('abc'))
print(df.to_string())

#  outputs:
#       a    b   c
#  0    1   23  45
#  1  789    1  23
#  2   45  678  90
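A small sketch of why to_string can be preferable to a plain print for wide frames, plus two of its formatting parameters. The frames and variable names are invented for this example.

```python
import pandas as pd

# A frame wide enough that the default repr may elide the middle columns.
wide_df = pd.DataFrame([list(range(30))], columns=[f"c{i}" for i in range(30)])

# to_string() renders every row and column, regardless of display options.
full_text = wide_df.to_string()
print(full_text)

# index=False hides the row labels; float_format controls how floats are shown.
small_df = pd.DataFrame({"x": [0.0001, 1e-05], "label": ["ABCD", "long string"]})
compact_text = small_df.to_string(index=False, float_format="{:.1e}".format)
print(compact_text)
```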

answered Jul 11 '19 at 15:15

sigint

`DataFrame.to_string` official docs: https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.to_string.html#pandas.DataFrame.to_string – Parth Nov 05 '20 at 05:07
3

You don't need the `.to_string()` method if you don't process the data, `print(df)` does the same – Spartan Jun 24 '22 at 18:29
1

This should be the top-rated & accepted answer. Out of the box & easy to use (and most importantly, it works!). – akki Jan 16 '23 at 12:13
@Spartan Your comment only holds true if the column size is small. Otherwise "print(df)" replaces the middle columns by dots and prints the last columns again. – Jakob Jan 16 '23 at 16:28
3

@Jakob: True, after all the question was about pretty printing, but you can change how many columns are shown, either globally with `pd.set_option('display.max_columns', None)` or locally, e.g. `with pd.option_context('display.max_columns', None): print(df)` – Spartan Jan 17 '23 at 15:07

score 8 · Answer 7 · answered Jun 06 '14 at 10:36

I used Ofer's answer for a while and found it great in most cases. Unfortunately, due to inconsistencies between pandas's to_csv and prettytable's from_csv, I had to use prettytable in a different way.

One failure case is a dataframe containing commas:

pd.DataFrame({'A': [1, 2], 'B': ['a,', 'b']})

Prettytable raises an error of the form:

Error: Could not determine delimiter

The following function handles this case:

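from prettytable import PrettyTable

def format_for_print(df):
    # The empty first header is for the index, which itertuples() yields first.
    table = PrettyTable([''] + list(df.columns))
    for row in df.itertuples():
        table.add_row(row)
    return str(table)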

If you don't care about the index, use:

def format_for_print2(df):    
    table = PrettyTable(list(df.columns))
    for row in df.itertuples():
        table.add_row(row[1:])
    return str(table)

Hi, the `format_for_print()` function does not seem to be printing the index of the Pandas DataFrame. I set the index using `df.index.name = 'index'` but this does not print the index column with a name. — edesz, Mar 12 '15 at 21:21

score 8 · Answer 8 · answered May 27 '20 at 15:08

Maybe you're looking for something like this:

import pandas as pd

def tableize(df):
    if not isinstance(df, pd.DataFrame):
        return
    df_columns = df.columns.tolist() 
    max_len_in_lst = lambda lst: len(sorted(lst, reverse=True, key=len)[0])
    align_center = lambda st, sz: "{0}{1}{0}".format(" "*(1+(sz-len(st))//2), st)[:sz] if len(st) < sz else st
    align_right = lambda st, sz: "{0}{1} ".format(" "*(sz-len(st)-1), st) if len(st) < sz else st
    max_col_len = max_len_in_lst(df_columns)
    max_val_len_for_col = dict([(col, max_len_in_lst(df.iloc[:,idx].astype('str'))) for idx, col in enumerate(df_columns)])
    col_sizes = dict([(col, 2 + max(max_val_len_for_col.get(col, 0), max_col_len)) for col in df_columns])
    build_hline = lambda row: '+'.join(['-' * col_sizes[col] for col in row]).join(['+', '+'])
    build_data = lambda row, align: "|".join([align(str(val), col_sizes[df_columns[idx]]) for idx, val in enumerate(row)]).join(['|', '|'])
    hline = build_hline(df_columns)
    out = [hline, build_data(df_columns, align_center), hline]
    for _, row in df.iterrows():
        out.append(build_data(row.tolist(), align_right))
    out.append(hline)
    return "\n".join(out)


df = pd.DataFrame([[1, 2, 3], [11111, 22, 333]], columns=['a', 'b', 'c'])
print(tableize(df))

Output:
+-------+----+-----+
|    a  |  b |   c |
+-------+----+-----+
|     1 |  2 |   3 |
| 11111 | 22 | 333 |
+-------+----+-----+
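For readers asking why this works, here is a simplified sketch of the same idea and not the function above: measure the widest cell in each column, then pad every cell to that width. The name simple_table is hypothetical, and unlike tableize it sizes each column from its own header.

```python
import pandas as pd

def simple_table(df):
    """Dependency-free text table: one fixed width per column."""
    headers = [str(c) for c in df.columns]
    rows = [[str(v) for v in row] for row in df.itertuples(index=False)]
    # Column width = longest of the header and all cell values in that column.
    widths = [max([len(h)] + [len(r[i]) for r in rows])
              for i, h in enumerate(headers)]
    hline = "+" + "+".join("-" * (w + 2) for w in widths) + "+"

    def fmt(cells, align):
        # Pad each cell to its column width, with one space on either side.
        return "|" + "|".join(" " + align(c, w) + " "
                              for c, w in zip(cells, widths)) + "|"

    lines = [hline, fmt(headers, str.center), hline]
    lines += [fmt(r, str.rjust) for r in rows]
    lines.append(hline)
    return "\n".join(lines)

df = pd.DataFrame([[1, 2, 3], [11111, 22, 333]], columns=['a', 'b', 'c'])
table_text = simple_table(df)
print(table_text)
```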

The best answer, could you explain why it works so well tho? — Farhan Hai Khan, Aug 05 '22 at 06:37
This is so helpful when printing my job analytics summary tables for serverless Dataproc jobs. I didn't want to use custom containers just to add the `tabulate` library. You're a lifesaver! — L Co, Sep 27 '22 at 14:26
This is an amazing answer! No dependencies needed for something "simple" like this, but I also don't want to code it up myself. The tables are beautiful, too. — John Haberstroh, Dec 21 '22 at 21:51
I like this answer so much I added a max_col_width argument which should be helpful for tables with randomly long values. — John Haberstroh, Dec 21 '22 at 22:02

score 3 · Answer 9 · answered Jun 24 '22 at 18:18

I use the rich library for that; it has nicer-looking tables than the tabulate-based .to_markdown().

import pandas as pd
from rich.console import Console
from rich.table import Table
df = pd.DataFrame({'col_two' : [0.0001, 1e-005 , 1e-006, 1e-007],
                   'column_3' : ['ABCD', 'ABCD', 'long string', 'ABCD']})
console = Console()
table = Table('Title')
table.add_row(df.to_string(float_format=lambda _: '{:.4f}'.format(_)))
console.print(table)

[Image: the resulting rich table]

See the documentation for more customization options:

https://rich.readthedocs.io/en/stable/tables.html
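The snippet above puts the whole frame into a single cell. As an alternative sketch, assuming rich is installed, each DataFrame column can become its own rich column. The table title, the "index" header and the rendered variable are choices made for this example.

```python
import pandas as pd
from rich.console import Console
from rich.table import Table

df = pd.DataFrame({'col_two': [0.0001, 1e-005, 1e-006, 1e-007],
                   'column_3': ['ABCD', 'ABCD', 'long string', 'ABCD']})

table = Table(title="df")
table.add_column("index", justify="right")
for column in df.columns:
    table.add_column(str(column))

# rich expects renderables such as strings, so every value is converted.
for index, row in zip(df.index, df.itertuples(index=False)):
    table.add_row(str(index), *[str(value) for value in row])

# record=True keeps a copy of the output so it can be exported as plain text.
console = Console(record=True, width=80)
console.print(table)
rendered = console.export_text()
```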

answered Jun 24 '22 at 18:18

Spartan

1

Along the lines of this approach, there's [rich-dataframe](https://pypi.org/project/rich-dataframe/). – Wayne Sep 30 '22 at 19:40
I just made a fork of rich-dataframe code that can be easily placed within other code **AND** where I changed it so the caption only is shown if you go over the threshold for number of rows or columns and I removed the animated aspect so it doesn't cause weird spacings if you want to use it in Jupyter. Even if you don't want quite those two customizations, seeing how I did them may help you customizing as you want. See it in my fork [here](https://github.com/fomightez/rich-dataframe/blob/master/rich_dataframe/rich_dataframe.py). – Wayne Sep 30 '22 at 20:52

Brooks Christensen · Answer 10 · 2022-04-26T17:43:41.077

2

Update: an even better solution is to simply put the variable name of the dataframe on the last line of the cell. It will automatically print in a pretty format.

import pandas as pd
import numpy as np

df = pd.DataFrame({'Data1': np.linspace(0,10,11), 'Data2': np.linspace(10,0,11)})
df

edited Apr 26 '22 at 17:43

answered Mar 23 '22 at 15:19

Brooks Christensen

this might not be what the question is asking. leaving the variable at the end does not execute a 'print' function, i.e. the result does not persist and can easily be overwritten by another object call. – Bao Le Jun 22 '23 at 08:34
