18

I'm trying to create a bar chart using seaborn.factorplot. My code looks like this:

import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

df = pd.read_csv('data.csv')

fg = sns.factorplot(x='vesselID', y='dur_min', hue='route', size=6, aspect=2, kind='bar', data=df)

My data.csv looks like this:

,route,vesselID,dur_min
0,ANA-SJ,13,39.357894736842105
1,ANA-SJ,20,24.747663551401867
2,ANA-SJ,38,33.72142857142857
3,ANA-SJ,69,37.064516129032256
4,ED-KING,30,22.10062893081761
5,ED-KING,36,21.821428571428573
6,ED-KING,68,23.396551724137932
7,F-V-S,1,13.623239436619718
8,F-V-S,28,14.31294964028777
9,F-V-S,33,16.161616161616163
10,MUK-CL,18,13.953191489361702
11,MUK-CL,19,14.306513409961687
12,PD-TAL,65,12.477272727272727
13,PT-COU,52,27.48148148148148
14,PT-COU,66,28.24778761061947
15,SEA-BI,25,30.94267515923567
16,SEA-BI,32,31.0
17,SEA-BI,37,31.513513513513512
18,SEA-BR,2,55.8
19,SEA-BR,13,57.0
20,SEA-BR,15,54.05434782608695
21,SEA-BR,17,50.43859649122807

The output is shown in the linked image.

My question is how to change the width of the bars. I have not been able to do this by changing size and aspect.

Trenton McKinney
kaleshanagineni

8 Answers

47

In my case, I didn't have to define a custom function to change the width as suggested above (which, by the way, didn't work for me because all the bars ended up misaligned). I simply passed dodge=False to the seaborn plotting function, and this did the trick, e.g.

sns.countplot(x='x', hue='y', data=data, dodge=False);

See additional reference here: https://github.com/mwaskom/seaborn/issues/871

My bar plot now looks like this: [image]

tsando
  • While this has been upvoted, it does not actually answer the question in the OP, because `'vesselID' == 13` has two routes, and using `dodge=False` results in `'SEA-BR'` being plotted on top of `'ANA-SJ'`. [code and plot](https://i.stack.imgur.com/8y6IS.png). This means `dodge=False` is not a good option if an x-axis category has more than a single `hue` group. – Trenton McKinney Jun 02 '23 at 03:11
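
One way to check whether dodge=False is safe for a given dataset is to count how many hue groups each x-axis category has. A minimal sketch with pandas, using only a few of the rows from the question's data.csv (the variable names are made up for illustration):

```python
import pandas as pd

# A subset of the rows from the question's data.csv
df = pd.DataFrame({
    "route": ["ANA-SJ", "ANA-SJ", "ED-KING", "SEA-BR", "SEA-BR", "SEA-BR"],
    "vesselID": [13, 20, 30, 2, 13, 15],
})

# Number of distinct hue groups (routes) per x-axis category (vesselID)
routes_per_vessel = df.groupby("vesselID")["route"].nunique()

# Categories with more than one hue group would have their bars drawn
# on top of each other when dodge=False
overlapping = routes_per_vessel[routes_per_vessel > 1].index.tolist()
print(overlapping)
```

If the list is empty, no bars share an x position and dodge=False should not hide anything.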
38

In fact, you can do it directly on the patches with set_width. However, if you only do that, you change the width of each patch but not its position on the axes, so you have to shift the x coordinates too.

import matplotlib.pyplot as plt
import seaborn as sns

tips = sns.load_dataset("tips")
fig, ax = plt.subplots()

sns.barplot(data=tips, ax=ax, x="time", y="tip", hue="sex")

def change_width(ax, new_value):
    for patch in ax.patches:
        current_width = patch.get_width()
        diff = current_width - new_value

        # we change the bar width
        patch.set_width(new_value)

        # we recenter the bar
        patch.set_x(patch.get_x() + diff * .5)

change_width(ax, .35)
plt.show()

And here is the result: [image: barplot result]
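
The recentering step works because moving the left edge by half the width difference leaves each bar's center where it was. A small sketch that checks this on plain matplotlib bars (the data and the helper name bar_centers are made up for illustration):

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no window is needed
import matplotlib.pyplot as plt


def change_width(ax, new_value):
    """Set every bar in ax to new_value wide while keeping its center fixed."""
    for patch in ax.patches:
        diff = patch.get_width() - new_value
        patch.set_width(new_value)
        # moving the left edge by half the difference keeps the center in place
        patch.set_x(patch.get_x() + diff * 0.5)


def bar_centers(ax):
    """Hypothetical helper: x coordinate of the center of each bar."""
    return [p.get_x() + p.get_width() / 2 for p in ax.patches]


fig, ax = plt.subplots()
ax.bar([0, 1, 2], [3.0, 5.0, 2.0], width=0.8)  # made-up data

before = bar_centers(ax)
change_width(ax, 0.35)
after = bar_centers(ax)
widths = [p.get_width() for p in ax.patches]
```

After the call, every bar is 0.35 wide and the values in before and after agree up to floating-point rounding.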

jsgounot
1

I don't think seaborn will do this, but it's possible mwaskom will come verify.

First, the general way to tweak matplotlib calls in seaborn is to pass through more kwargs (or in some cases a dict thereof), which would change your code like this:

fg = seaborn.factorplot(x='vesselID', y='dur_min', hue='route',
                        size=6,  aspect=2,
                        kind='bar', 
                        width=10, # Factorplot passes arguments through
                        data=df)

but when I run that the error is:

TypeError: bar() got multiple values for keyword argument 'width'

and, yes, it turns out all the seaborn categorical comparisons define width and build a lot of the aesthetics around it. You can check the draw_bars function in categorical.py directly, and of course you could edit your own copy of categorical.py, but that part of seaborn's style is currently baked in.

cphlewis
1
  • sns.factorplot has been renamed to sns.catplot
  • The accepted answer doesn't work because it produces this plot, with many overlapping bars.
  • This highly upvoted answer, with dodge=False, does not work because 'vesselID' == 13 has two routes, which results in 'SEA-BR' being plotted on top of 'ANA-SJ', and results in this plot.
    • If dodge=False is used when a category on the x-axis has more than a single hue category, bars will be plotted on top of each other.
    • hue should not be used to doubly encode the same categories that appear on the x-axis (e.g. x='route' and hue='route'):
      1. seaborn already colors the bars, as seen in this plot.
      2. The color encoding has no meaning, as discussed in this answer. If the color encoding doesn't have meaning, the bars should be a single color, as shown in this plot.
  • The best option for this case is to plot the bars against unique values, such as df.index if it's a RangeIndex (e.g. range(0, 22)), use dodge=False, and then set the correct tick labels with set_xticks.
import pandas as pd
import seaborn as sns

# DataFrame with data from the OP
data = {'route': ['ANA-SJ', 'ANA-SJ', 'ANA-SJ', 'ANA-SJ', 'ED-KING', 'ED-KING', 'ED-KING', 'F-V-S', 'F-V-S', 'F-V-S', 'MUK-CL', 'MUK-CL', 'PD-TAL', 'PT-COU', 'PT-COU', 'SEA-BI', 'SEA-BI', 'SEA-BI', 'SEA-BR', 'SEA-BR', 'SEA-BR', 'SEA-BR'],
        'vesselID': [13, 20, 38, 69, 30, 36, 68, 1, 28, 33, 18, 19, 65, 52, 66, 25, 32, 37, 2, 13, 15, 17],
        'dur_min': [39.357894736842105, 24.747663551401867, 33.72142857142857, 37.064516129032256, 22.10062893081761, 21.821428571428573, 23.39655172413793, 13.623239436619718, 14.31294964028777, 16.161616161616163, 13.953191489361702, 14.306513409961688, 12.477272727272728, 27.48148148148148, 28.24778761061947, 30.94267515923567, 31.0, 31.513513513513512, 55.8, 57.0, 54.05434782608695, 50.43859649122807]}

df = pd.DataFrame(data)

# sort df by the vesselID column, and ignore the index, which resets the index 
df = df.sort_values('vesselID', ignore_index=True)

# set the hue order to alphabetical
hue_order = sorted(df.route.unique())

# plot the data
g = sns.catplot(data=df, kind='bar', x=df.index.tolist(), y='dur_min', hue='route', aspect=1.5, dodge=False, hue_order=hue_order)

# set some nicer labels
g.set(xlabel='Vessel ID', ylabel='Duration (Min)')

# extract the matplotlib.axes.Axes from the FacetGrid
ax = g.axes.flat[0]

# set the xticklabels
_ = ax.set_xticks(ticks=df.index, labels=df.vesselID)


  • df after being sorted
      route  vesselID    dur_min
0     F-V-S         1  13.623239
1    SEA-BR         2  55.800000
2    ANA-SJ        13  39.357895
3    SEA-BR        13  57.000000
4    SEA-BR        15  54.054348
5    SEA-BR        17  50.438596
6    MUK-CL        18  13.953191
7    MUK-CL        19  14.306513
8    ANA-SJ        20  24.747664
9    SEA-BI        25  30.942675
10    F-V-S        28  14.312950
11  ED-KING        30  22.100629
12   SEA-BI        32  31.000000
13    F-V-S        33  16.161616
14  ED-KING        36  21.821429
15   SEA-BI        37  31.513514
16   ANA-SJ        38  33.721429
17   PT-COU        52  27.481481
18   PD-TAL        65  12.477273
19   PT-COU        66  28.247788
20  ED-KING        68  23.396552
21   ANA-SJ        69  37.064516
Trenton McKinney
0

This is a slight modification of @jsgounot's answer, which I found very instructive. The modification helps to center the bars on the appropriate xtick.

def change_width(ax, new_value):
    locs = ax.get_xticks()
    for i, patch in enumerate(ax.patches):
        # we change the bar width
        patch.set_width(new_value)

        # we center the bar on its xtick; i // 4 maps the patch index to a
        # tick index, and the 4 looks specific to the plot this was written for
        patch.set_x(locs[i // 4] - (new_value * .5))
Jed
0

Like the answer given by jsgounot, but for changing the bar width in horizontal bar plots:

def change_width_horizontal(ax, new_value):
    for patch in ax.patches:
        current_height = patch.get_height()
        diff = current_height - new_value

        # we change the bar thickness (its height, for a horizontal bar)
        patch.set_height(new_value)

        # we recenter the bar vertically
        patch.set_y(patch.get_y() + diff * .5)
-1

Seaborn is a higher-level library built on top of matplotlib. While seaborn doesn't have the flexibility to control bar width, matplotlib can do it with one line of code, where the third argument is the bar width:

plt.bar(data.xcol, data.ycol, width=4)
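
With plain matplotlib, grouped (hue-style) bars need their offsets computed by hand. A sketch of one way to do that with a chosen bar width, using pandas to pivot a few rows rounded from the question's data (the names wide, bar_width and positions are made up for illustration):

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "vesselID": [13, 13, 20, 2],
    "route": ["ANA-SJ", "SEA-BR", "ANA-SJ", "SEA-BR"],
    "dur_min": [39.4, 57.0, 24.7, 55.8],
})

# rows: x categories, columns: hue groups, NaN where a combination is missing
wide = df.pivot(index="vesselID", columns="route", values="dur_min")

bar_width = 0.3                    # the width being controlled
positions = np.arange(len(wide))   # one slot per vesselID
n_groups = len(wide.columns)

fig, ax = plt.subplots()
for k, route in enumerate(wide.columns):
    # spread the groups symmetrically around each slot
    offset = (k - (n_groups - 1) / 2) * bar_width
    # missing combinations become zero-height (invisible) bars
    ax.bar(positions + offset, wide[route].fillna(0),
           width=bar_width, label=route)

ax.set_xticks(positions)
ax.set_xticklabels(wide.index)
ax.legend(title="route")
```

The trade-off is that every slot reserves room for every route, so sparse data like the question's leaves gaps between bars.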

Yuchao Jiang
-2

Another solution is to modify the box_aspect:

import matplotlib.pyplot as plt
import seaborn as sns

tips = sns.load_dataset("tips")
fig, ax = plt.subplots()

ax = sns.barplot(data=tips, ax=ax, x="time", y="tip", hue="sex")

# change 10 to modify the y/x axis ratio
ax.set_box_aspect(10 / len(ax.patches))
plt.show()
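
Note that set_box_aspect reshapes the axes box on the figure. The bar widths in data coordinates stay the same, so the bars only look narrower or wider relative to their height. A small check with plain matplotlib and made-up values:

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend
import matplotlib.pyplot as plt

fig, ax = plt.subplots()
ax.bar(["Lunch", "Dinner"], [2.7, 3.1])  # made-up values

widths_before = [p.get_width() for p in ax.patches]
ax.set_box_aspect(2)  # axes box twice as tall as it is wide
widths_after = [p.get_width() for p in ax.patches]
```

Both lists hold the same data-space widths, which is why this approach changes the look of the plot rather than the bars themselves.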
