Use cut. Two extra groups are added, one for values below 0.47 and one for values above 0.56, because the sample data contains the value 0.566318.
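import numpy as np
import pandas as pd

# df is the DataFrame from the question, with a float column 'value'
bins = [-np.inf, 0.47, 0.5, 0.53, 0.56, np.inf]
labels = [0, 1, 2, 3, 4]
df['label'] = pd.cut(df['value'], bins=bins, labels=labels)
print(df)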
       value  label
0   0.486903      1
1   0.520908      2
2   0.530904      3
3   0.483284      1
4   0.475935      1
5   0.502831      2
6   0.541743      3
7   0.566318      4
8   0.500073      2
9   0.510959      2
10  0.546008      3
11  0.551682      3
12  0.534396      3
13  0.501554      2
14  0.541277      3
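One detail that may matter for values sitting exactly on a bin edge: by default pd.cut uses right-closed intervals, so 0.5 lands in group 1 rather than group 2. The sketch below illustrates this with a small hypothetical frame (the `edge` DataFrame and its values are made up for illustration and are not part of the sample data), and shows how `right=False` flips the behaviour.

```python
import numpy as np
import pandas as pd

bins = [-np.inf, 0.47, 0.5, 0.53, 0.56, np.inf]
labels = [0, 1, 2, 3, 4]

# Hypothetical values chosen to sit exactly on bin edges.
edge = pd.DataFrame({'value': [0.47, 0.5, 0.56]})

# Default right=True: intervals are (left, right], so an edge value
# falls into the lower group.
edge['right_closed'] = pd.cut(edge['value'], bins=bins, labels=labels)

# right=False: intervals are [left, right), so an edge value
# falls into the upper group.
edge['left_closed'] = pd.cut(edge['value'], bins=bins, labels=labels, right=False)

print(edge)
```

If the groups are meant to start at 0.47, 0.5 and so on (inclusive), `right=False` is probably the variant to use.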
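NumPy solution, using searchsorted:
bins = [-np.inf, 0.47, 0.5, 0.53, 0.56, np.inf]
# searchsorted returns the insertion position; subtract 1 to get the group number
df['label'] = np.array(bins).searchsorted(df['value']) - 1
print(df)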
       value  label
0   0.486903      1
1   0.520908      2
2   0.530904      3
3   0.483284      1
4   0.475935      1
5   0.502831      2
6   0.541743      3
7   0.566318      4
8   0.500073      2
9   0.510959      2
10  0.546008      3
11  0.551682      3
12  0.534396      3
13  0.501554      2
14  0.541277      3
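As a quick sanity check, the two approaches can be compared directly. The sketch below rebuilds the first eight rows of the sample data (the variable names `cut_labels` and `ss_labels` are only illustrative) and confirms that both give the same groups. One difference to keep in mind: pd.cut returns a categorical column, while searchsorted returns plain integers.

```python
import numpy as np
import pandas as pd

# First eight values from the sample data.
df = pd.DataFrame({'value': [0.486903, 0.520908, 0.530904, 0.483284,
                             0.475935, 0.502831, 0.541743, 0.566318]})
bins = [-np.inf, 0.47, 0.5, 0.53, 0.56, np.inf]

# pandas: categorical result, converted to int for the comparison.
cut_labels = pd.cut(df['value'], bins=bins, labels=[0, 1, 2, 3, 4]).astype(int)

# NumPy: the default side='left' matches the right-closed intervals of pd.cut.
ss_labels = np.searchsorted(bins, df['value'].to_numpy()) - 1

same = bool((cut_labels.to_numpy() == ss_labels).all())
print(same)
```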
Finally, write the result to a CSV file with to_csv:
df.to_csv('myfile', index=False)
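To see what ends up in the file without touching the disk, the same call can be pointed at an in-memory buffer. This is only a sketch: the two-row frame is a cut-down stand-in for the real data, and `buf` and `restored` are illustrative names.

```python
import io
import pandas as pd

# Two rows taken from the sample output above.
df = pd.DataFrame({'value': [0.486903, 0.566318], 'label': [1, 4]})

# index=False drops the row index, so only the two columns are written.
buf = io.StringIO()
df.to_csv(buf, index=False)
print(buf.getvalue())

# Reading the text back gives the labels as ordinary integers.
restored = pd.read_csv(io.StringIO(buf.getvalue()))
print(restored.dtypes)
```

Note that a categorical label column (as produced by pd.cut) is written as plain numbers, so it comes back as an integer column when the file is read again.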