Pandas make a column clickable and sort

Question

I followed the post "How to create a table with clickable hyperlink in pandas & Jupyter Notebook" to create a clickable link in a DataFrame. However, whenever I sort the DataFrame, the hyperlinks go away.

import pandas as pd

df = pd.DataFrame(['http://google.com', 'http://duckduckgo.com'], columns=["a"])

def make_clickable(val):
    return '<a href="{}">{}</a>'.format(val,val)

df.style.format(make_clickable)

This will display unclickable links:

df.sort_values(by="a")

`df.style.format(make_clickable)` does output the correct (clickable) links on my notebook. Maybe you've missed capturing the output? `df = df.style.format(make_clickable)`? — cglacet, Jun 16 '20 at 18:32

score 0 · Answer 1 · answered Jun 16 '20 at 21:49

I've never used styling before, so I might be wrong. According to the Styling documentation, styles are not meant to define a default HTML representation of a DataFrame. df.style.format(...) returns a separate Styler object and leaves df itself unchanged, so df.sort_values(...) displays a plain DataFrame again. The expected way to work with styles is to call DataFrame.style.format() every time you need the formatting applied. In your case that would be:

df.sort_values(by='a').style.format({'a': make_clickable})

From your question, I guess you want a given column to always display in a given way whenever a DataFrame is the last line of a cell. Here are two solutions you could try.

Solution 1

We can define a shortcut for this, which is especially handy if we need a more complex formatting strategy:

def clickable_links(df):
    return df.style.format({'a': make_clickable})

Then simply:

clickable_links(df.sort_values(by='a'))

Solution 2

Another viable solution (just for fun) would be to automatically make all links (columns named 'a') clickable for all DataFrames:

def format_all_html_repr(format_options):
    # Styler.render() was removed in pandas 2.0; Styler.to_html() (pandas 1.3+) replaces it.
    pd.DataFrame._repr_html_ = lambda self: self.style.format(format_options).to_html()

format_all_html_repr({'a': make_clickable})

df = pd.DataFrame(['http://google.com', 'http://duckduckgo.com'], columns=['a'])

Then we don't need to be as explicit as we were in solution 1:

df.sort_values(by='a')

The drawback of this second solution is that every DataFrame will now format its 'a' column as links in its HTML output.

Details on solution 2

The idea of this solution is to modify how HTML is rendered by default. In your notebook, every time a cell ends with a DataFrame df, the notebook automatically calls df._repr_html_(). We can use that to modify the default behaviour. One way is to simply rebind pandas.DataFrame._repr_html_ to the function we like.

You can't really define this behaviour for a single DataFrame by binding df._repr_html_ directly on a given instance, because the modification wouldn't be passed on to DataFrames derived from it. In other words, df._repr_html_ would be different from df.sort_values(by='a')._repr_html_, because df and df.sort_values(by='a') are two distinct instances of DataFrame. An attribute set on one instance is not copied to the other.
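To make this concrete, here is a small sketch of what happens when the method is bound on one instance only. The placeholder string "<b>custom</b>" stands in for real HTML and is purely illustrative.

```python
import pandas as pd

df = pd.DataFrame(["http://google.com", "http://duckduckgo.com"], columns=["a"])

# Bind a replacement on this single instance (placeholder HTML for illustration).
df._repr_html_ = lambda: "<b>custom</b>"

# sort_values returns a new DataFrame instance.
sorted_df = df.sort_values(by="a")

# The replacement lives in df's own attribute dictionary...
has_override = "_repr_html_" in vars(df)
# ...but it was not copied to the sorted frame, which falls back to the class method.
sorted_has_override = "_repr_html_" in vars(sorted_df)
```

Here has_override is True and sorted_has_override is False, which is why the sorted frame is displayed with the default rendering.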

Maybe a middle ground could be found by copying the method when slicing, sorting, or applying any other transformation to df, but that would probably be a bit more complex to write.
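One way such a middle ground might look is a DataFrame subclass, using the _constructor property that pandas provides for subclassing so that derived frames keep the subclass type. I haven't tested this beyond simple cases, so take it as a sketch: LinkFrame is a hypothetical name, and the code assumes pandas 1.3 or later (for Styler.to_html) with jinja2 installed for rendering.

```python
import pandas as pd


def make_clickable(val):
    return '<a href="{0}">{0}</a>'.format(val)


class LinkFrame(pd.DataFrame):
    """Hypothetical DataFrame subclass that renders column 'a' as links."""

    @property
    def _constructor(self):
        # Tells pandas to build results of sort/slice/etc. as LinkFrame too.
        return LinkFrame

    def _repr_html_(self):
        # Only frames of this subclass get the link formatting.
        return self.style.format({"a": make_clickable}).to_html()


links = LinkFrame(["http://google.com", "http://duckduckgo.com"], columns=["a"])

# The sorted result is still a LinkFrame, so a notebook cell ending with it
# would go through LinkFrame._repr_html_.
sorted_links = links.sort_values(by="a")
```

Plain DataFrames are left untouched with this approach, which avoids the drawback of solution 2. Operations that return a Series (for example links['a']) are not covered by this sketch.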
