I have a list of integers and want to get the frequency of each integer. This was discussed here
The problem is that the approach I'm using gives me frequencies for floating-point numbers, even though my data set consists of integers only. Why does that happen, and how can I get the frequency of each integer in my data?
I'm using pyplot.hist to plot a histogram of the frequency of occurrences
import numpy as np
import matplotlib.pyplot as plt
data = np.loadtxt('data.txt', dtype=int, usecols=(4,))  # load the 5th column of the csv file into an array named data
plt.hist(data)  # plot the column as a histogram
I get the histogram, but I've noticed that if I print the result of np.histogram(data)
hist = np.histogram(data)
print(hist)
I get this:
(array([ 2323, 16338, 1587, 212, 26, 14, 3, 2, 2, 2]),
array([ 1. , 2.8, 4.6, 6.4, 8.2, 10. , 11.8, 13.6, 15.4,
17.2, 19. ]))
Here the second array represents the values and the first array represents the number of occurrences.
All values in my data set are integers, so how does the second array end up containing floating-point numbers, and how should I get the frequency of each integer?
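For what it's worth, the floats in the second array look like bin edges rather than data values: when no bins argument is given, np.histogram splits the range from the minimum to the maximum into 10 equal-width bins, and (19 - 1) / 10 = 1.8. Below is a minimal sketch of that, using a made-up sample (the `data` array is hypothetical and only stands in for the real column), together with np.unique as one way to count each integer exactly:

```python
import numpy as np

# Hypothetical sample standing in for the real column; min is 1, max is 19.
data = np.array([1, 2, 2, 3, 3, 3, 5, 19])

# Default call: 10 equal-width bins spanning [data.min(), data.max()].
counts, edges = np.histogram(data)

# The edges are 11 evenly spaced points between the minimum and the maximum.
expected_edges = np.linspace(data.min(), data.max(), 11)

# Exact per-integer frequencies, with no binning involved.
values, freqs = np.unique(data, return_counts=True)
freq_by_value = dict(zip(values.tolist(), freqs.tolist()))

print(edges)
print(freq_by_value)
```

With this sample the printed edges have the same 1.8 spacing as in the output above, while freq_by_value maps each integer that occurs to its count.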
UPDATE:
This solves the problem, thank you Lev for the reply.
plt.hist(data, bins=np.arange(data.min(), data.max()+1))
To avoid creating a new question: how can I plot the columns "in the middle" for each integer? Say, I want the column for integer 3 to take the space between 2.5 and 3.5, not between 3 and 4.
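One possible approach to the centering, sketched under the assumption that the data are integers: put the bin edges on half-integers, so each unit-wide bin is centered on one integer. The `data` array below is again a hypothetical sample, not the real file contents.

```python
import numpy as np

# Hypothetical integer sample standing in for the real column.
data = np.array([1, 2, 2, 3, 3, 3, 5])

# Half-integer edges: 0.5, 1.5, ..., 5.5, giving one bin per integer from 1 to 5.
edges = np.arange(data.min() - 0.5, data.max() + 1.5)

# Count how many values fall into each unit-wide bin.
counts, _ = np.histogram(data, bins=edges)

# The midpoint of each bin is the integer it represents.
centers = (edges[:-1] + edges[1:]) / 2

print(centers)
print(counts)
```

The same `edges` array can be passed as plt.hist(data, bins=edges), so that the column for 3 spans 2.5 to 3.5. A side effect worth noting: with integer edges from np.arange(data.min(), data.max()+1), the last bin is closed on both sides, so the two largest integers share one bin; half-integer edges avoid that.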