
I'm trying to combine a seaborn barplot with a seaborn lineplot. I can draw each one separately, but when I combine the two, the x-axis is all over the place.

Figure 1 shows the bar plot, Figure 2 shows the line plot (both working fine) and Figure 3 is my attempt at combining both. I've read somewhere that seaborn uses categorical x-axis values, so my feeling is that this is part of the answer. Nevertheless, I can't seem to get it right.

Worth mentioning: the goal of this whole exercise is to get a moving-average line that follows the barplot, so any insights or workarounds to achieve that are also welcome. This is my code:

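import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

# Sample data standing in for the real (non-public) dataset
dfGroup = pd.DataFrame({
    'Year': [1910, 1911, 1912, 1913, 1914, 1915, 1916, 1917, 1918, 1919, 1920],
    'Total Deaths': [0, 0, 2, 3, 2, 3, 4, 5, 6, 7, 8],
    'Total Affected': [0, 1, 0, 2, 3, 6, 9, 8, 12, 13, 15]
    })
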
# Add 3-year rolling average
dfGroup['rolling_3years'] = dfGroup['Total Deaths'].rolling(3).mean().shift(0)
dfGroup = dfGroup.fillna(0)

# Make a smooth line from the 3-year rolling average
from scipy.interpolate import make_interp_spline
X_Y_Spline = make_interp_spline(dfGroup['Year'], dfGroup['rolling_3years'])
 
# Returns evenly spaced numbers over a specified interval.
X_ = np.linspace(dfGroup['Year'].min(), dfGroup['Year'].max(), 500)
Y_ = X_Y_Spline(X_)

# Plot the data
a4_dims = (15, 10)
fig, ax1 = plt.subplots(figsize=a4_dims)

ax1 = sns.barplot(x="Year", y="Total Deaths",
                  data=dfGroup, color='#42b7bd', ax=ax1)
ax2 = ax1.twinx()
# x and y are passed by keyword; recent seaborn releases reject them positionally
ax2 = sns.lineplot(x=X_, y=Y_, marker='o', ax=ax2)

[Figure 1: the bar plot]

[Figure 2: the line plot]

[Figure 3: the combined bar and line plot]

This is what my dfGroup dataframe looks like:

[Image: the dfGroup dataframe]

CrossLord
  • Can you share your `csv_file`, please? – Corralien Sep 28 '21 at 09:10
  • I cannot, it's not completely open access unfortunately – CrossLord Sep 28 '21 at 09:16
  • So, generate a sample for your code to be reproducible. – Corralien Sep 28 '21 at 09:17
  • I think the reason is that a barplot uses the x-values as categorical values, so your list of years is effectively turned into [0, 1, 2, 3, 4, ...], just with matching (=year) labels. That would mean the distance between the two graphs in the final figure is about 1800 years (1910 - 120), or 18 times the size of the individual range of the two graphs. That seems to roughly match. – 9769953 Sep 28 '21 at 10:00
  • @9769953 that seems to be it! Thanks, I can easily correct for that – CrossLord Sep 28 '21 at 10:19
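
Following the diagnosis in the comments, here is a minimal sketch of one possible correction, not a confirmed solution. It assumes the years are consecutive and sorted, so that bar i sits at x = i and a year maps to its bar position by subtracting the first year. The names `spline`, `years_fine`, `trend` and `positions_fine` are made up for this illustration, and the line is drawn on the same axes as the bars (rather than on a `twinx` axis) because the rolling average shares the units of the bars.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
from scipy.interpolate import make_interp_spline

dfGroup = pd.DataFrame({
    "Year": list(range(1910, 1921)),
    "Total Deaths": [0, 0, 2, 3, 2, 3, 4, 5, 6, 7, 8],
})

# 3-year rolling average; the first two rows have no full window, so fill them with 0
dfGroup["rolling_3years"] = dfGroup["Total Deaths"].rolling(3).mean().fillna(0)

# Smooth the rolling average in "year space", as in the question
spline = make_interp_spline(dfGroup["Year"].to_numpy(),
                            dfGroup["rolling_3years"].to_numpy())
years_fine = np.linspace(dfGroup["Year"].min(), dfGroup["Year"].max(), 500)
trend = spline(years_fine)

# Assumption: consecutive, sorted years, so year -> bar position is a simple shift
positions_fine = years_fine - dfGroup["Year"].min()

fig, ax = plt.subplots(figsize=(15, 10))
sns.barplot(x="Year", y="Total Deaths", data=dfGroup, color="#42b7bd", ax=ax)

# Plot against bar positions (0 .. n-1), not against the raw years
ax.plot(positions_fine, trend, color="black")
```

Call `plt.show()` (or `fig.savefig(...)`) to view the result. If the years had gaps or were unsorted, the simple subtraction would no longer line up with the bars, and the positions would have to be looked up per category instead.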

0 Answers