
I have two queries:

  1. I want to remove the empty bar from the bar graph (present in the first column).
  2. I have to use this graph in a PowerPoint presentation. How can I increase the height of the bar graph so that it fits the height of the slide? I have tried to increase the height, but it does not increase any further. Is this possible? If not, what other options can I try?


import matplotlib.pyplot as plt
import seaborn as sns

# df is the DataFrame with 'Subject', 'EE Score', 'Session' and 'Grade' columns
plt.figure(figsize=(40, 20))
g = sns.catplot(x='Subject', y='EE Score', data=df, hue='Session', col='Grade',
                sharey=True, sharex=True,
                hue_order=["2017-18", "2018-19", "2019-20"], kind="bar")
# plt.legend(bbox_to_anchor=(1, 1), loc=2)
g.set(ylim=(0, 100))
g.set_axis_labels("Subject", "EE Score")

# Write the bar height above every bar in each of the four facets
for col in range(4):
    ax = g.facet_axis(0, col)
    for p in ax.patches:
        ax.text(p.get_x() + 0.015,
                p.get_height() * 1.02,
                '{0:.1f}'.format(p.get_height()),
                color='black', rotation='horizontal', size=12)

# g.set_ylabel('')
plt.savefig('2.png', bbox_inches='tight')
Trenton McKinney
Asra Khalid

1 Answer

  • Like @JohanC, I initially thought it was not possible to remove an empty category from a catplot(). However, Michael's comment provides the solution: sharex=False.
  • This solution will not work if the column used for x= has a category dtype, which can be checked with pandas.DataFrame.info().
  • Tested in Python 3.10, pandas 1.4.2, matplotlib 3.5.1, seaborn 0.11.2
    • seaborn is a high-level API for matplotlib
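
The dtype of the x column can also be checked directly instead of reading the full info() output. This is a minimal sketch on a small made-up frame; the column names, the values and the helper name is_category are illustrative assumptions, not part of the question's data.

```python
import pandas as pd

# Hypothetical data: 'subject' is stored as category, 'grade' as plain strings.
df = pd.DataFrame({
    "subject": pd.Categorical(["Math", "Art", "Math"]),
    "grade": ["A", "B", "A"],
})


def is_category(frame, column):
    """Return True if the column uses the pandas category dtype."""
    return isinstance(frame[column].dtype, pd.CategoricalDtype)


print(is_category(df, "subject"))  # True
print(is_category(df, "grade"))    # False
```

If the check returns True for the column passed to x=, the empty category is likely to stay even with sharex=False.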

object dtype x-axis

import seaborn as sns

titanic = sns.load_dataset('titanic')

# remove one category
titanic.drop(titanic.loc[(titanic['class']=='First')&(titanic['who']=='child')].index, inplace=True)

g = sns.catplot(x="who", y="survived", col="class", data=titanic, kind="bar", ci=None, sharex=False, hue='embarked', estimator=sum)


categorical dtype x-axis

  • tips.day is categorical, so sharex=False will not remove the empty categories.
  • The column can be converted to object dtype with tips.day = tips.day.astype('str'), in which case sharex=False will work, but the days of the week will no longer be ordered.
tips = sns.load_dataset('tips')

print(tips.info())

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 244 entries, 0 to 243
Data columns (total 7 columns):
 #   Column      Non-Null Count  Dtype   
---  ------      --------------  -----   
 0   total_bill  244 non-null    float64 
 1   tip         244 non-null    float64 
 2   sex         244 non-null    category
 3   smoker      244 non-null    category
 4   day         244 non-null    category
 5   time        244 non-null    category
 6   size        244 non-null    int64   
dtypes: category(4), float64(2), int64(1)
memory usage: 7.4 KB

g = sns.catplot(x="day", y="total_bill", col="time", kind="bar", data=tips, ci=None, sharex=False, hue='smoker')


  • After converting the column to an object dtype, sharex=False works.
  • Note that the days are no longer ordered.
tips.day = tips.day.astype('str')

print(tips.info())

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 244 entries, 0 to 243
Data columns (total 7 columns):
 #   Column      Non-Null Count  Dtype   
---  ------      --------------  -----   
 0   total_bill  244 non-null    float64 
 1   tip         244 non-null    float64 
 2   sex         244 non-null    category
 3   smoker      244 non-null    category
 4   day         244 non-null    object  
 5   time        244 non-null    category
 6   size        244 non-null    int64   
dtypes: category(3), float64(2), int64(1), object(1)
memory usage: 8.8+ KB

g = sns.catplot(x="day", y="total_bill", col="time", kind="bar", data=tips, ci=None, sharex=False, hue='smoker')
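
One way to keep the weekday order after the conversion may be to sort the rows by the categorical column first and convert afterwards. As far as I can tell, the seaborn version tested above draws string categories in order of first appearance within each facet, so sorted rows should give ordered bars. The sketch below only shows the pandas part on a made-up frame; tips_like and its values are assumptions standing in for the tips dataset.

```python
import pandas as pd

# Hypothetical stand-in for tips: 'day' is categorical with weekday order.
days = ["Thur", "Fri", "Sat", "Sun"]
tips_like = pd.DataFrame({
    "day": pd.Categorical(["Sun", "Thur", "Sat", "Fri", "Sun"], categories=days),
    "total_bill": [10.0, 12.5, 20.0, 8.0, 15.0],
})

# Sorting a categorical column follows the category order, not the alphabet.
tips_like = tips_like.sort_values("day")

# Convert to strings only after sorting, so the row order carries the weekday order.
tips_like["day"] = tips_like["day"].astype(str)

ordered_days = list(tips_like["day"].unique())
print(ordered_days)  # ['Thur', 'Fri', 'Sat', 'Sun']
```

The sorted frame can then be passed as data= to catplot() together with sharex=False.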

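
For the second query (figure height), catplot() creates its own figure, so a separate plt.figure(figsize=...) call does not resize it. The facet size is set with the height and aspect parameters, and the finished figure can be resized afterwards. This is a rough sketch, assuming a 16:9 slide of 13.333 x 7.5 inches and a made-up frame with the question's column names; the data values and the output file name are hypothetical.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so this also runs without a display

import pandas as pd
import seaborn as sns

# Hypothetical stand-in for the question's df (same column names, made-up values).
df = pd.DataFrame({
    "Subject": ["Math", "Science", "Math", "Science"] * 2,
    "EE Score": [70.0, 80.0, 65.0, 90.0, 75.0, 85.0, 60.0, 95.0],
    "Session": ["2017-18", "2018-19"] * 4,
    "Grade": ["Grade 1", "Grade 1", "Grade 2", "Grade 2"] * 2,
})

SLIDE_W, SLIDE_H = 13.333, 7.5  # assumed slide size in inches (16:9 layout)
n_cols = df["Grade"].nunique()

# height is the height of one facet in inches; aspect * height is its width.
g = sns.catplot(x="Subject", y="EE Score", hue="Session", col="Grade",
                data=df, kind="bar", sharex=False,
                height=SLIDE_H, aspect=SLIDE_W / (n_cols * SLIDE_H))
g.set(ylim=(0, 100))

# The outside legend widens the figure, so set the final size explicitly.
# g.figure exists from seaborn 0.11.2; older versions expose g.fig instead.
g.figure.set_size_inches(SLIDE_W, SLIDE_H)

# Saving without bbox_inches='tight' keeps the image at the size set above.
g.figure.savefig("slide_figure.png", dpi=150)
```

Note that bbox_inches='tight' crops the saved image to its contents, which changes its dimensions, so it is left out here.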

Trenton McKinney
Diziet Asahi
  • Perhaps worth doing `data=titanic.sort_values("who")` so that the bars appear in the same order within each facet. You could also use a dictionary argument for `palette` to have multiple colors with a consistent level -> color mapping. – mwaskom Jan 16 '21 at 18:11
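
The suggestion in this comment could look roughly like the sketch below. It uses a small made-up frame with titanic-like column names instead of the real dataset, and the choice of the "deep" palette is an assumption for illustration only.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so this also runs without a display

import pandas as pd
import seaborn as sns

# Hypothetical stand-in for titanic; the 'First' class has no 'child' rows.
data = pd.DataFrame({
    "who": ["woman", "man", "child", "man", "woman", "man", "woman", "child"],
    "class": ["First", "First", "Second", "Second", "Second", "First", "First", "Second"],
    "embarked": ["S", "C", "S", "S", "C", "S", "C", "C"],
    "survived": [1, 0, 1, 0, 1, 1, 1, 0],
})

# Fixed level -> color mapping, so each hue level has the same color in every facet.
hue_levels = sorted(data["embarked"].unique())
palette = dict(zip(hue_levels, sns.color_palette("deep", n_colors=len(hue_levels))))

# Sorting by the x column gives every facet the same left-to-right bar order.
sorted_data = data.sort_values("who")

g = sns.catplot(x="who", y="survived", col="class", hue="embarked",
                data=sorted_data, kind="bar", sharex=False,
                palette=palette, hue_order=hue_levels)
```

With a dictionary palette, every level of the hue column needs an entry, which is why the keys are built from the column's unique values.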