I have gone through several posts on this forum, but I cannot find an answer to the behaviour I am seeing.
I have a CSV file whose header has many entries, with 300 points each. For each field (column of the CSV file) I would like to plot a histogram. The x-axis contains the values in that column, and the y-axis should show the number of samples that fall inside each bin. As I have 300 points, the number of samples in all bins added together should be 300, so the y-axis should go from 0 to, let's say, 50 (just an example). However, the values are gigantic (400e8), which makes no sense.
Sample of the table:
point | mydata
1 | 250.23e-9
2 | 250.123e-9
... | ...
300 | 251.34e-9
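To make the expectation above concrete, here is a minimal sketch using only NumPy. The data are a hypothetical stand-in for one column (300 random values near 250e-9, not the real CSV), and `np.histogram` is used instead of `ax.hist` so that nothing needs to be drawn. It compares plain counts with a density-normalised histogram of the same samples; with bins this narrow, density heights are expected to be very large, because they scale like 1 / bin width.

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical stand-in for one CSV column: 300 samples near 250e-9
data = rng.normal(loc=250e-9, scale=0.5e-9, size=300)

num_bins = 50

# Plain counts: each bin holds the number of samples that fall inside it
counts, edges = np.histogram(data, bins=num_bins)

# Density: heights are rescaled so that the total area (height * width) is 1
density, _ = np.histogram(data, bins=num_bins, density=True)
widths = np.diff(edges)

print(counts.sum())              # total number of samples
print((density * widths).sum())  # area under the density histogram
print(density.max())             # height of the tallest density bar
```

In this sketch the two results are related by `density = counts / (counts.sum() * widths)`, so a bin width of the order of 1e-11 turns modest counts into heights of the order of 1e8 or more.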
Please check my code, below. I am using pandas to open the csv and Matplotlib for the rest.
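import matplotlib.mlab as mlab
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

df=pd.read_csv("/home/pcardoso/raw_data/myData.csv")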
# Figure parameters
figPath='/home/pcardoso/scripts/python/matplotlib/figures/'
figPrefix='hist_' # Prefix to the name of the file.
figSuffix='_something' # Suffix to the name of the file.
figString='' # Full string passed as the figure name to be saved
precision=3
num_bins = 50
columns=list(df)
for fieldName in columns:
    vectorData=df[fieldName]
    # statistical data
    mu = np.mean(vectorData) # mean of distribution
    sigma = np.std(vectorData) # standard deviation of distribution
    # Create plot instance
    fig, ax = plt.subplots()
    # Histogram
    n, bins, patches = ax.hist(vectorData, num_bins, density='True',alpha=0.75,rwidth=0.9, label=fieldName)
    ax.legend()
    # Best-fit curve
    y=mlab.normpdf(bins, mu, sigma)
    ax.plot(bins, y, '--')
    # Setting axis names, grid and title
    ax.set_xlabel(fieldName)
    ax.set_ylabel('Number of points')
    ax.set_title(fieldName + r': $\mu=$' + eng_notation(mu,precision) + r', $\sigma=$' + eng_notation(sigma,precision))
    ax.grid(True, alpha=0.2)
    fig.tight_layout() # Tweak spacing to prevent clipping of ylabel
    # Saving figure
    figString=figPrefix + fieldName +figSuffix
    fig.savefig(figPath + figString)
    plt.show()
    plt.close(fig)
In summary, I would like to know how to get the y-axis values right.
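One possible direction, sketched under assumptions: if the histogram shows raw counts, a normal curve can be brought to the same scale by multiplying the probability density by the number of samples and by the bin width. The helper names `normal_pdf` and `count_scaled_normal` are hypothetical, the data are a random stand-in for one column, and the bins are assumed to have equal width. The density is written out with NumPy rather than taken from `mlab.normpdf`, which was removed in Matplotlib 3.1.

```python
import numpy as np


def normal_pdf(x, mu, sigma):
    """Normal probability density, written out with NumPy only."""
    return np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * np.sqrt(2.0 * np.pi))


def count_scaled_normal(x, data, bin_edges):
    """Normal curve fitted to `data`, scaled to histogram counts.

    Assumes all bins have the same width (hypothetical helper).
    """
    mu, sigma = np.mean(data), np.std(data)
    bin_width = bin_edges[1] - bin_edges[0]
    # density * N * bin width = expected number of samples per bin
    return normal_pdf(x, mu, sigma) * len(data) * bin_width


rng = np.random.default_rng(1)
# Hypothetical stand-in for one CSV column: 300 samples near 250e-9
data = rng.normal(250e-9, 0.5e-9, size=300)

counts, edges = np.histogram(data, bins=50)
y = count_scaled_normal(edges, data, edges)  # curve evaluated at the bin edges
```

In the plotting loop above, the equivalent would presumably be to draw the histogram without density normalisation and pass a curve scaled this way to `ax.plot(bins, y, '--')`, so that both share the "Number of points" axis.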
Edit: 6 July 2020
Edit 08 June 2020: I would like the density estimate to follow the histogram, like this: