
Continuing from "How to do discretization of continuous attributes in sklearn?"

After I "learned" my bins from the train data using np.histogram(A['my_var']), how do I apply them to my test set? That is, which bin does the my_var attribute of each data point fall into? Both my train and test data are in pandas DataFrames, if it matters.

Thanks

ihadanny

1 Answer


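Oops, it's easy. np.histogram returns the bin edges as its second element, and np.digitize maps values onto those edges, so the same edges can be reused for any data set: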

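import numpy as np
hist = np.histogram(A['my_var'])  # hist[0] holds the counts, hist[1] the bin edges learned from the train data
A.loc[:, 'my_bin'] = np.digitize(A['my_var'], hist[1])  # the same call, with the same hist[1], works on the test frame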
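To make the train/test step explicit, here is a minimal sketch of reusing the learned edges on a separate test frame. The frames `train` and `test`, their values, and `bins=5` are made up for illustration; only the column name `my_var` comes from the question. The clipping step is an optional assumption about how you might want to handle values outside the train range, not something the answer above requires.

```python
import numpy as np
import pandas as pd

# Hypothetical train and test frames; only the column name 'my_var' is from the question.
train = pd.DataFrame({"my_var": [0.0, 1.0, 2.5, 4.0, 10.0]})
test = pd.DataFrame({"my_var": [-1.0, 0.5, 3.0, 10.0, 12.0]})

# Learn the bin edges on the train data only (5 equal-width bins in this sketch).
_, bin_edges = np.histogram(train["my_var"], bins=5)

# Reuse the same edges on both frames.
train["my_bin"] = np.digitize(train["my_var"], bin_edges)
test["my_bin"] = np.digitize(test["my_var"], bin_edges)

# Optional: clip so that out-of-range values and the train maximum
# land in the first or last bin (labels 1..n_bins).
n_bins = len(bin_edges) - 1
test["my_bin_clipped"] = test["my_bin"].clip(1, n_bins)
```

One thing to watch for: np.histogram treats the last bin as closed on the right, while np.digitize (with the default right=False) does not. As a result, the train maximum gets index n_bins + 1, test values below the first edge get 0, and test values above the last edge also get n_bins + 1. Clipping is one way to fold these back into the learned bins.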
ihadanny