I am trying to fetch multiple years of data (e.g., 2005 to 2007) from a database that stores the values as floats.
2005 values (sorted): [0.512, 0.768, 1, 1.5..., 100]
2006 values (sorted): [0.288, 0.512..., 300, 350]
and so on.
I want to generate a CDF plot (using the ax.hist() function) that shows every year on a single graph. My current code looks like this:
import matplotlib.pyplot as plt
import pandas as pd

num_bins = 100
fig, ax = plt.subplots(figsize=(8, 4))
years = ['2005', '2006', '2007']
for year in years:
    # query and conn are defined elsewhere; the result comes back sorted
    df = pd.read_sql_query(query, conn)
    # density=True replaces the older normed=1 argument
    n, bins, patches = ax.hist(df.values, num_bins, density=True, histtype='step', cumulative=True, label=str(year))
ax.grid(True)
ax.legend(loc='right')
ax.set_xlabel('Values')
ax.set_ylabel('CDF plot')
plt.show()
However, this gives me a single plot with multiple CDF histograms but an unsorted x-axis. My x-axis values are: 0.512, 0.768, 1, 1.5..., 100, 0.288, ..., 300, 350. The values first seen in the second year are appended after the first year's x-axis values instead of being plotted on the same scale.
How can I ensure that all CDF plots are generated on a common, dynamically varying scale, so that the end result looks like: 0.288, 0.512, 0.768, 1, 1.5..., 100..., 300, 350?
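To make the desired result concrete, here is a minimal sketch of one way to get a common scale. It rests on two assumptions that are not confirmed above: that the appended x-axis is a sign of the values reaching ax.hist() as strings (so they get treated as categories rather than numbers), and that a single set of bin edges spanning all years is acceptable. The function name plot_yearly_cdfs and the sample dictionary are hypothetical stand-ins for the per-year query results.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs anywhere; remove for an interactive window
import matplotlib.pyplot as plt
import numpy as np


def plot_yearly_cdfs(values_by_year, num_bins=100):
    """Plot one cumulative step histogram per year on shared bin edges."""
    # Assumption: values may arrive as strings, so force a float dtype first.
    numeric = {year: np.asarray(vals, dtype=float).ravel()
               for year, vals in values_by_year.items()}

    # One set of bin edges spanning the global min and max of all years.
    lo = min(v.min() for v in numeric.values())
    hi = max(v.max() for v in numeric.values())
    bins = np.linspace(lo, hi, num_bins + 1)

    fig, ax = plt.subplots(figsize=(8, 4))
    for year, vals in numeric.items():
        ax.hist(vals, bins=bins, density=True, histtype='step',
                cumulative=True, label=str(year))
    ax.grid(True)
    ax.legend(loc='right')
    ax.set_xlabel('Values')
    ax.set_ylabel('CDF')
    return fig, ax, bins


# Hypothetical sample data standing in for the per-year query results.
sample = {
    '2005': ['0.512', '0.768', '1', '1.5', '100'],
    '2006': ['0.288', '0.512', '300', '350'],
}
fig, ax, bins = plot_yearly_cdfs(sample)
```

With real data, each dictionary entry would hold something like df.values for that year, and the figure can be displayed with plt.show() or written out with fig.savefig(). Because every year uses the same bin edges, the x-axis runs from the smallest to the largest value across all years.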