I am working in a Jupyter notebook to present some plots, and I have not been able to find an answer anywhere. I have the following dataset (this is a sample; the original dataset is larger, with 64 columns and 32 rows):
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from scipy import stats

labels = ['hc', 'svppa', 'nfvppa', 'lvppa']
test_df = {"id": list(range(1, 21, 1)), "label": list(np.repeat(labels, 5)), "col1": list(np.random.normal(100, 10, size=20)), "col2": list(np.random.normal(100, 10, size=20)), "col3": list(np.random.normal(100, 10, size=20)),
           "col4": list(np.random.normal(100, 10, size=20)), "col5": list(np.random.normal(100, 10, size=20)), "col6": list(np.random.normal(100, 10, size=20)), "col7": list(np.random.normal(100, 10, size=20))}
df = pd.DataFrame(test_df)
So it looks like this:
What I want to do is draw probability plots to test each column for normality, one plot per label, using:
columns = list(df.columns[2:])
for col in columns:
    for label in labels:
        stats.probplot(df[df['label'] == label][col], dist='norm', plot=plt)
        plt.title("Probability plot " + col + " - " + label)
        plt.show()
This creates the plots that I want, but each one is drawn as a separate figure, so they are not "pretty for presentation". I wanted to use matplotlib's subplots function, but it does not produce the desired result. Because I am plotting through stats.probplot, I can't figure out how to direct each plot to its own subplot.
I have tried the following (and different iterations) with no luck:
fig, axes = plt.subplots(nrows=len(columns), ncols=4, figsize=(15, 15), sharex=True, sharey=True)
plt.subplots_adjust(hspace=0.5)
axes = axes.ravel()
for n, label in enumerate(labels):
    for col in columns:
        b = stats.probplot(df[df['label'] == label][col], dist='norm', plot=plt(axes[n]))
Any ideas will be much appreciated!
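One approach that may work, sketched here under a few assumptions: the `plot` argument of `stats.probplot` accepts any object that has `plot` and `text` methods, so a matplotlib `Axes` can be passed directly instead of `plt`. The grid layout below (one row per column, one subplot column per label), the seeded random generator, and the `figsize` scaling are choices made for illustration and are not part of the original attempt.

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
from scipy import stats

# Sample data mirroring the question (seeded here only so runs are repeatable).
rng = np.random.default_rng(0)
labels = ["hc", "svppa", "nfvppa", "lvppa"]
data = {"id": range(1, 21), "label": np.repeat(labels, 5)}
for i in range(1, 8):
    data[f"col{i}"] = rng.normal(100, 10, size=20)
df = pd.DataFrame(data)

columns = list(df.columns[2:])

# One row per data column, one subplot column per label.
# squeeze=False keeps `axes` two-dimensional even for a single row or column.
fig, axes = plt.subplots(
    nrows=len(columns),
    ncols=len(labels),
    figsize=(4 * len(labels), 3 * len(columns)),
    squeeze=False,
)

for i, col in enumerate(columns):
    for j, label in enumerate(labels):
        ax = axes[i, j]
        values = df.loc[df["label"] == label, col]
        # Pass the Axes object itself rather than the pyplot module.
        stats.probplot(values, dist="norm", plot=ax)
        # Replace the default title with one naming the column and label.
        ax.set_title(f"{col} - {label}")

fig.tight_layout()
```

In a notebook the figure should render at the end of the cell; in a script, a call to `plt.show()` after `fig.tight_layout()` would display it. With 64 columns the figure becomes very tall, so it may be more practical to plot the columns in batches of a few rows per figure.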