I am trying to use pd.cut to create specific buckets. This works for most of the data, but a subset of rows is assigned nan even though the value clearly falls inside one of the bins. Here is an example df:
        numbers  difference_interval
0  0.000000e+00                  nan
1  3.263739e-03                  nan
2  3.637279e-02                  nan
3  5.308298e-03                  nan
4 -1.139971e-01                  nan
5           nan                  nan
Here is the code I used to create the intervals:
import pandas as pd

bins = pd.IntervalIndex.from_tuples([(-1, -0.2), (-0.2, -0.1), (-0.1, -0.05), (-0.05, 0),
                                     (0, 0.05), (0.05, 0.1), (0.1, 0.2), (0.2, 1)])
col = 'numbers'
df = (df.dropna(subset=[col])
        .assign(difference_interval=lambda d: pd.cut(d[col].values, bins).sort_values().astype(str)))
# Show the rows whose interval label came out as the string "nan"
df.query('difference_interval == "nan"')
Why would this be happening?
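A minimal sketch that might help narrow this down. It reuses the sample values above and adds one hypothetical out-of-range value (1.5), which is my own assumption and not from the real data. The names sample, labels, and check are also made up for illustration. It compares the labels pd.cut returns in row order with the same labels after .sort_values(), which by default puts missing values last:

```python
import numpy as np
import pandas as pd

bins = pd.IntervalIndex.from_tuples(
    [(-1, -0.2), (-0.2, -0.1), (-0.1, -0.05), (-0.05, 0),
     (0, 0.05), (0.05, 0.1), (0.1, 0.2), (0.2, 1)]
)

# Sample values from the example, plus a hypothetical out-of-range value (1.5)
sample = pd.DataFrame({
    "numbers": [1.5, 0.0, 3.263739e-03, 3.637279e-02,
                5.308298e-03, -1.139971e-01, np.nan]
})
sample = sample.dropna(subset=["numbers"])

# Labels in the same order as the rows of `sample`
labels = pd.cut(sample["numbers"].values, bins)

check = sample.assign(
    aligned=labels,                   # label i belongs to row i
    after_sort=labels.sort_values(),  # labels reordered, missing values last
)
print(check)
print(check["aligned"].isna().tolist())
print(check["after_sort"].isna().tolist())
```

If the two columns disagree, the missing label has ended up on a different row than the value that actually falls outside the bins, which would match rows with a clear value showing nan.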