I am trying to make an array of N+1 bin edges for a distribution of N discrete scores. I assumed numpy's arange could be used for that. However, the function gives me odd values, which have a significant effect on the resulting numpy histograms. Here's a minimal example:
import numpy as np

n = 10
a = np.arange(0, 1.01 + 1/n, 1/n)
print(a)
for i in a:
    print(i)
[0. 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1. 1.1]
0.0
0.1
0.2
0.30000000000000004
0.4
0.5
0.6000000000000001
0.7000000000000001
0.8
0.9
1.0
1.1
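As far as I can tell, the array print looks clean only because NumPy's default print precision is 8 digits, while printing a single element shows the full shortest round-trip representation. A minimal sketch of how to see the stored values directly (the comparison against the literal 0.7 is my own check, not something from the documentation):

```python
import numpy as np

n = 10
a = np.arange(0, 1.01 + 1/n, 1/n)

# Default array printing rounds each element to 8 digits of precision,
# which hides the floating-point error.
print(a)

# Converting to a list of Python floats shows the values actually stored.
print(a.tolist())

# An exact comparison confirms that the edge at index 7 is not the float 0.7.
print(a[7] == 0.7)   # False
print(a[7] - 0.7)    # a tiny positive difference, on the order of 1e-16
```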
The fact that simply printing the array outputs seemingly normal values is extra misleading. This is a big issue if I want to use this array as the bins argument to numpy's histogram() function, because my values are decimals of the form k/10. In particular, all data points with the value 0.7 will be placed in the [0.6000000000000001, 0.7000000000000001] bin, whereas I'd expect them to be inside [0.7, 0.8], as per the np.histogram() documentation.
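To make the effect on the histogram concrete, here is a small sketch. The `scores` array is hypothetical data, and the second set of edges (integer `arange` divided by `n`) is only one possible alternative I tried, on the assumption that the scores themselves are produced as k/n; it is not necessarily the recommended fix.

```python
import numpy as np

n = 10
scores = np.array([0.7, 0.7, 0.7])  # hypothetical data: three scores of 0.7

# Edges built with a float step: the edge at index 7 is slightly above 0.7.
edges_arange = np.arange(0, 1.01 + 1/n, 1/n)
counts_arange, _ = np.histogram(scores, bins=edges_arange)

# Alternative (assumption): build integer edges first, then divide once,
# so each edge is the float closest to k/n.
edges_div = np.arange(n + 2) / n
counts_div, _ = np.histogram(scores, bins=edges_div)

print(counts_arange)  # the three scores land in the bin at index 6
print(counts_div)     # the three scores land in the bin at index 7
```

With the float-step edges, the scores fall one bin lower than intended; with the divided integer edges, `edges_div[7] == 0.7` holds and the scores land in the bin that starts at 0.7.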
The question is whether this is a bug or a feature.