Discretization of continuous attributes using np.histogram - how to apply on a new data point?

Question

This is a follow-up to "How to do discretization of continuous attributes in sklearn?"

After I have "learned" my bins from the training data using np.histogram(A['my_var']), how do I apply them to my test set? That is, how do I find which bin the my_var attribute of each test data point falls into? Both my train and test data are in pandas DataFrames, if that matters.

Thanks

score 0 · Answer 1 · answered Sep 15 '15 at 13:18


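Oops, it turns out to be easy: keep the bin edges that np.histogram returns (the second element of its result) and pass them to np.digitize.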

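import numpy as np
counts, bin_edges = np.histogram(A['my_var'])  # learn the bin edges from the training frame A
# bin index of every row; pass the test frame's column instead to bin new data with the same edges
A.loc[:, 'my_bin'] = np.digitize(A['my_var'], bin_edges)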
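To make the train/test split explicit, here is a minimal sketch of the same idea applied to a separate test frame. The frames `train` and `test`, their values, and the columns `my_bin_raw` and `my_bin` are hypothetical; only the column name `my_var` comes from the question. Note that np.digitize returns 0 for values below the first edge and len(bin_edges) for values at or above the last edge, so the training maximum itself lands one past the last histogram bin. Clipping the result is one possible way to handle that, not the only one.

```python
import numpy as np
import pandas as pd

# Hypothetical train/test frames; only the column name 'my_var' is from the question.
train = pd.DataFrame({'my_var': [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0]})
test = pd.DataFrame({'my_var': [-1.0, 0.5, 5.5, 10.0, 12.0]})

# Learn the edges on the training data only (default: 10 equal-width bins, 11 edges).
_, bin_edges = np.histogram(train['my_var'])

# Raw result: 0 means below the first edge, len(bin_edges) means at or above the last edge.
test['my_bin_raw'] = np.digitize(test['my_var'].to_numpy(), bin_edges)

# Assumed policy: clip so out-of-range values fall into the first or last training bin.
n_bins = len(bin_edges) - 1
test['my_bin'] = np.clip(test['my_bin_raw'], 1, n_bins)

print(test)
```

Whether out-of-range test values should be clipped, flagged, or given their own bin depends on the downstream model, so treat the clipping step as a choice rather than a requirement.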


ihadanny
