
I want to change the hover text and hover data for a Python Plotly box plot. Instead of five separate hover boxes for max, q3, median, q1, and min, I want one condensed hover box showing the median, mean, IQR, and date. I have tried every "hover" argument I could find, with no luck. My sample code is below.

import numpy as np
import pandas as pd
import plotly.express as px

lst = [['2020'], ['2021']] 
numbers = [20 , 25]
r = [x for i, j in zip(lst, numbers) for x in i*j]

df = pd.DataFrame(r, columns=['year'])
df['obs'] = np.arange(1,len(df)+1) * np.random.random()

mean = df.groupby('year').mean()[['obs']]
median = df.groupby('year').median()[['obs']]
iqr = df.groupby('year').quantile(0.75)[['obs']] - df.groupby('year').quantile(0.25)[['obs']]

stats = pd.concat([mean,median,iqr], axis=1)
stats.columns = ['Mean','Median','IQR']
tot_df = pd.merge(df,stats, right_index=True, left_on='year', how = 'left')

fig = px.box(tot_df, x="year", y="obs", points=False, hover_data=['year','Mean','Median','IQR'])
fig.show()

In this case I tried to use "hover_data", which does not raise an error but also does not change the hover boxes on the plot. I have tried both express and graph_objects with no luck. My plotly version is 4.9.0. Thank you!

njalex22

1 Answer

  • I overlaid a bar trace on the box plot trace.
  • The bar trace can be configured to show the information you want in its hover box.
  • For demonstration I set the bar opacity to 0.05; it can be set to 0 to make the bars fully invisible.
  • I built this against plotly 5.2.1 and have not tested it against 4.9.0.
import numpy as np
import plotly.express as px
import pandas as pd

lst = [['2020'], ['2021']] 
numbers = [20 , 25]
r = [x for i, j in zip(lst, numbers) for x in i*j]

df = pd.DataFrame(r, columns=['year'])
df['obs'] = np.arange(1,len(df)+1) * np.random.random()

mean = df.groupby('year').mean()[['obs']]
median = df.groupby('year').median()[['obs']]
iqr = df.groupby('year').quantile(0.75)[['obs']] - df.groupby('year').quantile(0.25)[['obs']]

stats = pd.concat([mean,median,iqr], axis=1)
stats.columns = ['Mean','Median','IQR']
tot_df = pd.merge(df,stats, right_index=True, left_on='year', how = 'left')

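# Box plot of the raw observations; its default hover boxes are left unchanged.
fig = px.box(tot_df, x="year", y="obs", points=False)

# One bar per year spanning min..max of obs, carrying Mean, Median and IQR for hover.
fig2 = px.bar(
    tot_df.groupby("year", as_index=False)
    .agg(base=("obs", "min"), bar=("obs", lambda s: s.max() - s.min()))
    .merge(
        tot_df.groupby("year", as_index=False).agg(
            {c: "first" for c in tot_df.columns if c not in ["year", "obs"]}
        ),
        on="year",
    ),
    x="year",
    y="bar",
    base="base",
    hover_data={
        **{c: True for c in tot_df.columns if c not in ["year", "obs"]},
        **{"base": False, "bar": False},
    },
).update_traces(opacity=0.05)  # nearly invisible; set to 0 to hide the bars entirely

# Overlay the bar trace on the box plot.
fig.add_traces(fig2.data)
fig.show()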


fig2 without named aggregations

fig2 = px.bar(
    tot_df.groupby("year", as_index=False)["obs"]
    .apply(lambda s: pd.Series({"base": s.min(), "bar": s.max() - s.min()}))
    .merge(
        tot_df.groupby("year", as_index=False).agg(
            {c: "first" for c in tot_df.columns if c not in ["year", "obs"]}
        ),
        on="year",
    ),
    x="year",
    y="bar",
    base="base",
    hover_data={
        **{c: True for c in tot_df.columns if c not in ["year", "obs"]},
        **{"base": False, "bar": False},
    },
).update_traces(opacity=0.05)
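
As a further alternative, the data frame passed to px.bar can be assembled from plain groupby reductions, with no named aggregation and no `apply`. This is a minimal sketch that relies only on long-standing pandas calls (`groupby`, `min`, `max`, `first`, `join`, `reset_index`); it has not been run against pandas 0.24.2, so treat compatibility with that version as an assumption. The small `df` below is a made-up stand-in for the sample data, and `bar_df` is a hypothetical name.

```python
import pandas as pd

# Made-up stand-in for the sample data.
df = pd.DataFrame({
    "year": ["2020", "2020", "2020", "2021", "2021"],
    "obs": [1.0, 4.0, 2.0, 10.0, 7.0],
})

# Per-year statistics, merged back onto every row as in the question.
g0 = df.groupby("year")["obs"]
stats = pd.DataFrame({
    "Mean": g0.mean(),
    "Median": g0.median(),
    "IQR": g0.quantile(0.75) - g0.quantile(0.25),
})
tot_df = df.merge(stats, left_on="year", right_index=True, how="left")

# Bar geometry: start at the minimum, extend to the maximum.
g = tot_df.groupby("year")["obs"]
summary = pd.DataFrame({"base": g.min(), "bar": g.max() - g.min()})

# First value of every other column per year (they are constant within a year).
extra = tot_df.drop(columns=["obs"]).groupby("year").first()

# One row per year with the columns px.bar needs for x, y, base and hover_data.
bar_df = summary.join(extra).reset_index()
```

`bar_df` can then take the place of the first argument of `px.bar`, with the remaining arguments left as they are.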

Rob Raymond
  • My pandas version is 0.24.2, so this implementation of named aggregation for groupby does not work. Since I am not exactly sure what is being constructed for the first argument of px.bar, can you advise how to change this groupby statement to be compatible with pandas 0.24.2? Thank you. – njalex22 Aug 25 '21 at 13:25
  • That is a very old version of pandas; named aggregations (https://pandas-docs.github.io/pandas-docs-travis/whatsnew/v0.25.0.html) were introduced in 0.25. I can't downgrade to 0.24.2 because the wheels won't build in my environment. Are you able to upgrade pandas to a version that's less than 2 years old? – Rob Raymond Aug 25 '21 at 13:38
  • I have updated the answer with another way of building fig2 without named aggregations. I can't validate whether it works against a very old version of pandas. – Rob Raymond Aug 25 '21 at 13:51