I want to create 12 density plots, one per month, each overlaying several years so that I can compare the distribution of a variable across years. For example, plot #1 would be for January and would contain density curves for 2017 and 2018.
This is my code snippet, but I cannot get it to work. In particular, I don't know how to iterate over months so that each density plot contains density curves for both 2017 and 2018:
import matplotlib.pyplot as plt
import seaborn as sns
years = [2017,2018]
months = [1,2,3,4,5,6,7,8,9,10,11,12]
axs = plt.figure(figsize=(10, 8), constrained_layout=True).subplots(4, 3)
for ax, month in zip(axs, months):
    ax.set_title("Month={}".format(month))
    subset = df[(df["Month"] == month)]
    sns.distplot(subset["Variable"], hist = False, kde = True, kde_kws = {"linewidth": 3}, label = year)
This is a sample of df (for simplicity I show only 2 months with 5 rows per year):
Year Month Variable
2017 1 30
2017 1 28
2017 1 28
2017 1 28
2017 1 29
2018 1 15
2018 1 16
2018 1 14
2018 1 16
2018 1 14
2017 2 32
2017 2 29
2017 2 29
2017 2 30
2017 2 29
2018 2 10
2018 2 11
2018 2 11
2018 2 11
2018 2 12
UPDATE:
I tried to use seaborn's kdeplot:
fig, axes = plt.subplots(figsize=(20,10), ncols=3, nrows=4)
sns.kdeplot(data=df, x="Year", y="Variable", col="Month", ax=axes[3,2])
However, the result looks wrong:
I also tried using displot with the hist and kde kinds, but each time I get circles instead of density functions: