I have a large pandas DataFrame, about 1000 rows, that I am converting to an HTML table with dataframe.to_html(). Is there an easy way to add pagination so that I don't have to scroll through all 1000 rows? Say, view the first 50 rows, then click "next" to see the following 50?

DougKruger
  • That's an interesting question indeed! If the "pagination" can be implemented using CSS classes, you can try to use [Style](http://pandas.pydata.org/pandas-docs/stable/style.html) conditionally (i.e. rows 0-49 - Style: page1, 50-99 - Style: page2, etc.). – MaxU - stand with Ukraine Aug 11 '16 at 19:25
  • Are you trying to view it within a Jupyter notebook, or as an independent HTML file? – Shovalt Mar 06 '18 at 14:39

2 Answers

Update 2022

It seems that there is now a simple and modern solution, using itables.

Installation:

pip install itables

Basic usage (from the GitHub readme):

from itables import show

show(df)

Result: [screenshot: itables result]

There is also a command that makes every DataFrame in the notebook display this way by default: init_notebook_mode(all_interactive=True), imported from itables.

Original answer (exporting table to HTML file)

The best solution I can think of involves a couple of external JS libraries: jQuery and its DataTables plugin. This allows for much more than pagination, with very little effort.

Let's set up some HTML, JS and Python:

from tempfile import NamedTemporaryFile
import webbrowser

base_html = """
<!doctype html>
<html><head>
<meta http-equiv="Content-type" content="text/html; charset=utf-8">
<script type="text/javascript" src="https://ajax.googleapis.com/ajax/libs/jquery/2.2.2/jquery.min.js"></script>
<link rel="stylesheet" type="text/css" href="https://cdn.datatables.net/1.10.16/css/jquery.dataTables.css">
<script type="text/javascript" src="https://cdn.datatables.net/1.10.16/js/jquery.dataTables.js"></script>
</head><body>%s<script type="text/javascript">$(document).ready(function(){$('table').DataTable({
    "pageLength": 50
});});</script>
</body></html>
"""

def df_html(df):
    """HTML table with pagination and other goodies"""
    table_html = df.to_html()  # local name no longer shadows the function
    return base_html % table_html

def df_window(df):
    """Open dataframe in browser window using a temporary file"""
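    # mode='w+' opens the file in text mode; the default binary mode raises TypeError on Python 3
    with NamedTemporaryFile(delete=False, suffix='.html', mode='w+', encoding='utf-8') as f:
        f.write(df_html(df))
    webbrowser.open(f.name)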

And now we can load a sample dataset to test it:

from sklearn.datasets import load_iris
import pandas as pd

iris = load_iris()
df = pd.DataFrame(iris.data, columns=iris.feature_names)

df_window(df)

The beautiful result: [screenshot of the paginated table]

A few notes:

  • Notice the pageLength parameter in the base_html string. This is where I defined the default number of rows per page. You can find other optional parameters on the DataTables options page.
  • The df_window function was tested in a Jupyter Notebook, but should work in plain Python as well.
  • You can skip df_window and simply write the returned value from df_html into an HTML file.
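
The first and last notes can be combined in a short sketch: a helper that takes the table markup (what df.to_html() returns), fills in the DataTables options from a Python dict, and writes the page to a file of your choice. The names PAGE_TEMPLATE, build_page and write_page are hypothetical, as is the output path; pageLength and lengthMenu are standard DataTables options. A hand-written table string stands in for df.to_html() so the snippet runs without pandas.

```python
import json
from pathlib import Path

# Same CDN assets as base_html above. Braces are doubled because str.format is used.
PAGE_TEMPLATE = """<!doctype html>
<html><head>
<meta http-equiv="Content-type" content="text/html; charset=utf-8">
<script type="text/javascript" src="https://ajax.googleapis.com/ajax/libs/jquery/2.2.2/jquery.min.js"></script>
<link rel="stylesheet" type="text/css" href="https://cdn.datatables.net/1.10.16/css/jquery.dataTables.css">
<script type="text/javascript" src="https://cdn.datatables.net/1.10.16/js/jquery.dataTables.js"></script>
</head><body>{table}<script type="text/javascript">$(document).ready(function(){{$('table').DataTable({options});}});</script>
</body></html>
"""


def build_page(table_html, page_length=50, length_menu=(10, 25, 50, 100)):
    """Return a full HTML page; the options dict is serialized to JSON for DataTables."""
    options = {"pageLength": page_length, "lengthMenu": list(length_menu)}
    return PAGE_TEMPLATE.format(table=table_html, options=json.dumps(options))


def write_page(table_html, path, **kwargs):
    """Write the page to `path` (hypothetical helper) and return the Path."""
    path = Path(path)
    path.write_text(build_page(table_html, **kwargs), encoding="utf-8")
    return path


# Stand-in for df.to_html(); DataTables expects a <thead>, which to_html() also emits.
sample_table = (
    "<table><thead><tr><th>a</th></tr></thead>"
    "<tbody><tr><td>1</td></tr></tbody></table>"
)
page = build_page(sample_table, page_length=25)
```

With a real DataFrame, the call would look like write_page(df.to_html(), "table.html", page_length=25), and the resulting file can be opened in any browser with internet access to the CDNs.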

Edit: how to make this work with a remote session (e.g. Colab)

When working in a remote notebook, such as Colab or Kaggle, the temporary file approach won't work, since the file is saved on the remote machine and is not accessible to your browser. A workaround is to download the constructed HTML and open it locally (adding to the previous code):

import base64
from IPython.display import display, HTML

my_html = df_html(df)
my_html_base64 = base64.b64encode(my_html.encode()).decode('utf-8')
display(HTML(f'<a download href="data:text/html;base64,{my_html_base64}" target="_blank">Download HTML</a>'))

This will result in a link containing the entire HTML encoded as a base64 string. Clicking it will download the HTML file and you can then open it directly and view the table.
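
To make the encoding step easier to check outside a notebook, here is a minimal sketch that wraps it in a function and decodes the link again to confirm nothing is lost. The name download_link and its filename and label parameters are hypothetical; giving the download attribute a value is standard HTML and only suggests a name for the saved file.

```python
import base64
import html


def download_link(page_html, filename="table.html", label="Download HTML"):
    """Return an <a> tag whose href is a base64 data URI holding page_html."""
    payload = base64.b64encode(page_html.encode("utf-8")).decode("ascii")
    return (
        f'<a download="{html.escape(filename, quote=True)}" '
        f'href="data:text/html;base64,{payload}" target="_blank">'
        f"{html.escape(label)}</a>"
    )


link = download_link("<p>héllo</p>")

# Round trip: pull the payload back out of the href and decode it.
encoded = link.split("base64,", 1)[1].split('"', 1)[0]
decoded = base64.b64decode(encoded).decode("utf-8")
```

In a notebook this would be used as display(HTML(download_link(df_html(df)))). Note that the whole page travels inside the link, so very large tables produce very large notebook outputs.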

Shovalt
  • I get error in my notebook while using it. `TypeError: a bytes-like object is required, not 'str'`. Do you have any idea? – Ronak Shah Jan 17 '19 at 10:11
  • @RonakShah, I assume you are using python3. Try adding `mode='w+'` to the `NamedTemporaryFile` parameters and let me know if it works. – Shovalt Jan 17 '19 at 10:28
  • There seems to be some issue from my end which I need to figure out. The code given by you works fine. Thank you for your help :) – Ronak Shah Jan 18 '19 at 08:17
  • I have a question. When I tried to run this in Google Colab, I couldn't get it to work. Is there any suggestion for getting it to work in Colab? Thanks in advance. – GSandro_Strongs Apr 10 '21 at 16:03
  • @GSandro_Strongs I've edited the answer with a solution for you. Let me know how it works. – Shovalt Apr 14 '21 at 06:16
  • @RonakShah thanks for your help. I added the line display(HTML(df_html(df))) and got the table in Google Colab without needing to download anything. – GSandro_Strongs Apr 15 '21 at 16:57
  • Not working for me. It's returning plain table/df when tried on Jupyter. I don't know much about JS and JQuery. Do I need to install anything on my venv to make this work? – Darshan Jun 25 '21 at 11:54
  • @Darshan - no need to install anything in normal environments. Are you trying the first (local) or second (colab) solution? Any way you can share your code? – Shovalt Jun 30 '21 at 05:42
  • @Shovalt very good answer, thank you. I was inspired by it and (*in my case*) I concatenated the `base_html` with the HTML of the dataframe, `df.to_html()`. I save this result as a file, but I also keep it in a local variable called `total_html` and call `display(HTML(total_html))` to display it. Frankly, I don't know if my approach is good practice, but it worked in Google Colab. – Marco Aurelio Fernandez Reyes Jul 15 '22 at 15:43
  • Also, I work in Google Colab with the dark theme enabled, so the DataTables plugin styles look a little *odd*. If anyone wants to try it, you can either customize the DataTables CSS styles or leave Google Colab on the *light* theme - [here is a visual example with Colab's light theme](https://i.stack.imgur.com/LQSSi.png) - and [Colab with dark mode](https://i.stack.imgur.com/sN1zn.png). – Marco Aurelio Fernandez Reyes Jul 15 '22 at 15:55

I developed a solution for this out of necessity: paginate_pandas, a much simpler package than itables, built on ipywidgets.

With some obvious bias, I feel itables might be a bit overkill. I can already filter and sort with pandas when I'm in Jupyter, so the only thing I need is pagination. paginate_pandas gives you that with a nice slider:

[screenshot: paginate_pandas example]
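
For readers who want only the pagination logic, the slicing that any pager relies on can be sketched in a few lines. This is not the paginate_pandas API, which is not shown here; page_bounds is a hypothetical helper, and a plain list stands in for the DataFrame so the snippet runs without pandas.

```python
import math


def page_bounds(n_rows, page, page_size=50):
    """Return (start, stop) row positions for a 1-based page number."""
    n_pages = max(1, math.ceil(n_rows / page_size))
    page = min(max(page, 1), n_pages)  # clamp out-of-range page numbers
    start = (page - 1) * page_size
    stop = min(start + page_size, n_rows)
    return start, stop


rows = list(range(1000))  # stand-in for a 1000-row DataFrame
start, stop = page_bounds(len(rows), page=3)
third_page = rows[start:stop]
```

With a DataFrame the same bounds would be used as df.iloc[start:stop], and a slider or "next" button only has to change the page number.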

jmberros