
I would like to change the labels on bar plots produced from a groupby, ideally by mapping labels to the classes rather than changing values in the dataset.

In matplotlib it is possible to style the classes: https://www.pythoncharts.com/matplotlib/grouped-bar-charts-matplotlib/

Can I do something similar in pandas?

What I mean is this:

import numpy as np
import pandas as pd

dt = pd.DataFrame(np.random.randint(0, 3, size=(10, 3)), columns=list('ABC'))

# Build a MultiIndex Series of counts and plot it
dt.groupby('A')['B'].value_counts().unstack(0).plot(kind='bar')


See the legend of the classes under B?

Instead of 0, 1, 2, I would like to pass names, like 'good', 'bad', 'average'.

I tried to find a way to map a function or a dictionary to the labels, but could not get it to work.

e.g.

dt.groupby('A')['B'].value_counts().unstack().plot(kind='bar', label = {0:'a', 1:'2', 2:'d'})

does nothing.

Alternatively, I tried to change the values of the index in the MultiIndex, but it is cumbersome and I could not find a way to do it without hitting errors (I tried loc, apply and reset_index).

I could eventually fall back on matplotlib, but for my purpose a one-liner would be ideal.

I would just like to adjust the labels on the classes.

Can you show a concise way?

Is it good practice to alter values in a df? I would prefer a mapping function at the plotting level.

Trenton McKinney
user305883

1 Answer


You can just replace the values before plotting:

# crosstab is a bit slower than groupby().value_counts().unstack()
# but it's more concise!
pd.crosstab(dt['A'], dt['B'].replace({0:'good', 1:'OK', 2:'bad'})).plot.bar()

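If the goal is to relabel only at plotting time, one possible alternative (a sketch, not part of the answer above) is to rename the column labels of the unstacked counts table instead of replacing values in the column. The `labels` dictionary and the seeded random data are assumptions made for illustration; `unstack()` is used without an argument so that the classes of B end up in the columns and therefore in the legend.

```python
import numpy as np
import pandas as pd

# Same kind of toy data as in the question (seeded so the run is repeatable).
rng = np.random.default_rng(0)
dt = pd.DataFrame(rng.integers(0, 3, size=(10, 3)), columns=list("ABC"))

# Hypothetical display names for the classes in B.
labels = {0: "good", 1: "average", 2: "bad"}

# Count B within each A; unstack() moves B's classes into the columns,
# so they become the legend entries of the bar plot.
counts = dt.groupby("A")["B"].value_counts().unstack()

# Rename only the column labels of the counts table, then plot.
# dt itself is not modified, and classes missing from the data are simply skipped.
ax = counts.rename(columns=labels).plot(kind="bar")
```

Because `rename` works on the small counts table, the mapping only affects what is drawn; the values in `dt` stay numeric.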

Quang Hoang
  • But replace() will change values across the whole df, and if a value is missing it will put NaN. I only want to change the labels, or change the values of the MultiIndex level (B). – user305883 Sep 07 '22 at 20:03
  • replace changes the value on a *copy* of your data before rendering. It doesn't change your actual data. – Quang Hoang Sep 07 '22 at 20:54
  • I mean replace() will change values in all columns of the df, while I only want to change the ones for one column or MultiIndex level (B). – user305883 Sep 08 '22 at 06:17
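The points raised in the comments can be checked with a small sketch. It assumes the same toy frame as in the question (seeded here for repeatability); `before`, `b_named` and `partial` are names introduced only for this illustration. Calling `replace` on `dt['B']` operates on that one Series and returns a new object, and values that have no entry in the mapping are left as they are (it is `Series.map` with a dictionary that yields NaN for unmapped values).

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
dt = pd.DataFrame(rng.integers(0, 3, size=(10, 3)), columns=list("ABC"))
before = dt.copy()  # snapshot used to compare against later

# replace() is called on the Series dt['B'] only and returns a new Series.
b_named = dt["B"].replace({0: "good", 1: "OK", 2: "bad"})

# The original frame, including columns A and C, is unchanged.
unchanged = dt.equals(before)

# With a partial mapping, unmapped values are kept rather than turned into NaN.
partial = dt["B"].replace({0: "good"})
no_nans = partial.notna().all()

print(unchanged, no_nans)
```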