I have an issue with axis labels when using groupby and trying to plot with seaborn. Here is my problem:

import os
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt 
import seaborn as sns
%matplotlib inline

df = pd.DataFrame({'user': ['Bob', 'Jane','Alice','Bob','Jane','Alice'], 
                   'income': [40000, 50000, 42000,47000,53000,46000]})

df_group_user = df.groupby(['user']).sum().reset_index()

I then plot a horizontal bar plot using seaborn:

bar = sns.barplot( x="income", y="user", data=df_group_user, color="b" ) 
#Prettify the plot
bar.set_yticklabels( bar.get_yticks(), size = 10)
bar.set_xticklabels( bar.get_xticks(), size = 10)
bar.set_ylabel("User", fontsize = 20)
bar.set_xlabel("Income ($)", fontsize = 20)
bar.set_title("Total income per user", fontsize = 20)
sns.set_theme(style="whitegrid")
sns.set_color_codes("muted")

Unfortunately, when I run the code this way, the y-axis ticks are labelled 0, 1, 2 instead of Bob, Jane, Alice, as I'd like them to be.

I can get around the issue if I use matplotlib in the following manner:

df_group_user = df.groupby(['user']).sum()

df_group_user['income'].plot(kind="barh")

plt.title("Total income per user")
plt.ylabel("User")
plt.xlabel("Income ($)")

Ideally, I'd like to use seaborn for plotting, but if I don't use reset_index() as above, calling sns.barplot raises an error:

bar = sns.barplot( x="income", y="user", data=df_group_user, color="b" ) 

ValueError: Could not interpret input 'user'
Marc
  • Hi, many thanks for the reply. I can't share the input data, but I reproduced the issue with a simpler dataframe. I am now editing the question so that it's clearer what the problem is. – Marc Jan 15 '22 at 11:05
  • Hi @JohanC, I checked and it's '0.11.1' – Marc Jan 15 '22 at 11:23
  • @Mr.T: I have now updated the question with all the code I used and a dataframe that can easily be reproduced. Hope that it's okay. – Marc Jan 15 '22 at 11:26
  • You remove the generated tick labels with `bar.set_yticklabels( bar.get_yticks(), size = 10)` etc. Use instead `bar.tick_params(axis='both', labelsize=10)` to change the font size for the x- and y-axes simultaneously. Problem solved. – Mr. T Jan 15 '22 at 11:40
  • That did the trick, many thanks, @Mr.T! – Marc Jan 15 '22 at 12:36
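
The fix from Mr. T's comment can be sketched as follows. `get_yticks()` returns tick positions rather than category names, so passing its result to `set_yticklabels` replaces Bob, Jane, Alice with numbers; `tick_params` changes only the font size and leaves the label text alone. This is a minimal sketch using the question's dataframe. The `Agg` backend, the explicit `fig, ax` pair and `as_index=False` are my own choices so that the snippet runs as a plain script; they are not part of the original code.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend, so no display is needed
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

df = pd.DataFrame({
    "user": ["Bob", "Jane", "Alice", "Bob", "Jane", "Alice"],
    "income": [40000, 50000, 42000, 47000, 53000, 46000],
})

# Keep 'user' as a regular column so seaborn can find it by name.
df_group_user = df.groupby("user", as_index=False)["income"].sum()

# Set the theme before plotting so it applies to this figure.
sns.set_theme(style="whitegrid")
sns.set_color_codes("muted")

fig, ax = plt.subplots()
sns.barplot(x="income", y="user", data=df_group_user, color="b", ax=ax)

# Change the tick font size without touching the label text.
ax.tick_params(axis="both", labelsize=10)
ax.set_ylabel("User", fontsize=20)
ax.set_xlabel("Income ($)", fontsize=20)
ax.set_title("Total income per user", fontsize=20)

# Render once, then read back the y tick labels to check them.
fig.canvas.draw()
labels = [t.get_text() for t in ax.get_yticklabels()]
print(labels)
```

In a notebook with `%matplotlib inline`, the backend line and the final `draw`/`print` check can be dropped.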

1 Answer

Just try swapping the positions of the x and y axes.

I'm using a different dataframe to illustrate a similar situation.

gp = df.groupby("Gender")['Salary'].sum().reset_index()
gp

Output:

    Gender  Salary
0   Female  8870
1   Male    23667

Now, when plotting the bar chart, pass the x-axis first, then the y-axis, and check the result:

bar = sns.barplot(x = 'Salary', y = "Gender", data = gp);

  • This is exactly what the OP said they did. – Mr. T Jan 15 '22 at 06:33
  • Hi, many thanks for the reply. Unfortunately, shifting the axis or the order produces the same results. For the rest, it seems that the procedure you're following is exactly the same as mine, though for you it does seem to plot labels as expected instead of just numbers. – Marc Jan 15 '22 at 10:29
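
On the `ValueError: Could not interpret input 'user'` from the question: after `df.groupby(['user']).sum()` without `reset_index()`, `user` becomes the index of the result instead of a column, so a column lookup by name finds nothing. A small pandas-only sketch of the difference, using the question's data (the names `indexed` and `flat` are mine, for illustration only):

```python
import pandas as pd

df = pd.DataFrame({
    "user": ["Bob", "Jane", "Alice", "Bob", "Jane", "Alice"],
    "income": [40000, 50000, 42000, 47000, 53000, 46000],
})

# Default groupby: the grouping key moves into the index.
indexed = df.groupby("user").sum()
print(indexed.columns.tolist())  # 'user' is not listed here
print(indexed.index.name)        # the index is named 'user'

# as_index=False keeps 'user' as an ordinary column,
# equivalent to calling reset_index() afterwards.
flat = df.groupby("user", as_index=False).sum()
print(flat.columns.tolist())
```

Either `flat` or `indexed.reset_index()` can then be passed as `data=` with `y="user"`.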