How to use Polars with Plotly without converting to Pandas?

Question

I would like to replace Pandas with Polars but I was not able to find out how to use Polars with Plotly without converting to Pandas. I wonder if there is a way to completely cut Pandas out of the process.

Consider the following test data:

import polars as pl
import numpy as np
import plotly.express as px

df = pl.DataFrame(
    {
        "nrs": [1, 2, 3, None, 5],
        "names": ["foo", "ham", "spam", "egg", None],
        "random": np.random.rand(5),
        "groups": ["A", "A", "B", "C", "B"],
    }
)

fig = px.bar(df, x='names', y='random')
fig.show()

I would like this code to show the bar chart in a Jupyter notebook but instead it returns an error:

/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/polars/internals/frame.py:1483: UserWarning: accessing series as Attribute of a DataFrame is deprecated
  warnings.warn("accessing series as Attribute of a DataFrame is deprecated")

It is possible to transform the Polars data frame to a Pandas data frame with df = df.to_pandas(). Then, it works. However, is there another, simpler and more elegant solution?

Wayne · Accepted Answer · 2023-05-29T04:05:18.197

Yes, no need for converting to a Pandas dataframe. Someone (sa-) has requested supporting a better option here and included a workaround for it.

"The workaround that I use right now is px.line(x=df["a"], y=df["b"]), but it gets unwieldy if the name of the data frame is too big"

For the OP's code example, the approach of specifying the dataframe columns explicitly works.
Besides passing the columns explicitly with px.bar(x=df["names"], y=df["random"]) or px.bar(df, x=df["names"], y=df["random"]), I find that casting them to lists also works:

import polars as pl
import numpy as np
import plotly.express as px

df = pl.DataFrame(
    {
        "nrs": [1, 2, 3, None, 5],
        "names": ["foo", "ham", "spam", "egg", None],
        "random": np.random.rand(5),
        "groups": ["A", "A", "B", "C", "B"],
    }
)

px.bar(df, x=list(df["names"]), y=list(df["random"]))

If you know Polars better than I do, you may spot other options once you see the idea behind the workaround.

The example posted there is simpler: instead of px.line(df, x="a", y="b"), which you could use for a Pandas dataframe, you use px.line(x=df["a"], y=df["b"]). With Polars, that is:

import polars as pl
import plotly.express as px

df = pl.DataFrame({"a":[1,2,3,4,5], "b":[1,4,9,16,25]})

px.line(x=df["a"], y=df["b"])

(Note that using plotly.express requires Pandas to be installed, see here and here. I used plotly.express in my answer because it was closer to the OP. The code could be adapted to using plotly.graph_objects if there was a desire to not have Pandas installed & involved at all.)

This is exactly the elegant solution I was hoping for. Thanks! — fabioklr, Apr 05 '22 at 09:02
ImportError: Plotly express requires pandas to be installed. — ScipioAfricanus, May 28 '23 at 15:54
True, that seems to be the case @ScipioAfricanus . I reworded my first line, and I'll add a reference to that to the end. The main point stands that you don't need to convert to a Pandas dataframe. — Wayne, May 29 '23 at 04:06

score 2 · Answer 2 · answered Jun 24 '23 at 16:39

I'm currently making the switch from pandas to Polars (pola.rs). From my research, indexing with [] will work but is considered an anti-pattern in Polars. The author of the article linked below suggests using the .to_series method instead.

px.pie(
    df,  # Polars DataFrame
    names=df.select('Model').to_series(),
    values=df.select('Sales').to_series(),
    hover_name=df.select('Model').to_series(),
    color_discrete_sequence=px.colors.sequential.Plasma_r,
)

https://towardsdatascience.com/visualizing-polars-dataframes-using-plotly-express-8da4357d2ee0

When it comes to visualizing Polars dataframes, it seems you can't totally get rid of the Pandas dataframe conversion.

Hope this helped
