I am new to binning in Python and I am trying to bin property prices. I would like my last bin to be 4000000+ (everything above 4,000,000) so that I don't end up with empty bins.

Here is my code:

import numpy as np
import pandas as pd

# Bin edges every 1,000,000 from 0 up to 12,000,000
bins = np.arange(0, 13000000, 1000000)
print(bins)
labels = pd.cut(data['PRICE'], bins, right=True)
labels = labels.value_counts().sort_index()
labels

The output is

(0, 1000000]            869
(1000000, 2000000]       88
(2000000, 3000000]       20
(3000000, 4000000]        4
(4000000, 5000000]        1
(5000000, 6000000]        1
(6000000, 7000000]        0
(7000000, 8000000]        0
(8000000, 9000000]        0
(9000000, 10000000]       0
(10000000, 11000000]      0
(11000000, 12000000]      1

How can I merge everything above 4000000 into a single last bin, which would then have a frequency of 3?

FHTMitchell
d789w
  • Possible duplicate of [Binning and then combining bins with minimum number of observations?](https://stackoverflow.com/questions/38591000/binning-and-then-combining-bins-with-minimum-number-of-observations) – Josh Friedlander Aug 28 '18 at 10:53

1 Answer

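This should work here; you have to set the bin boundaries manually. Start the list at 0, otherwise prices up to 1000000 fall outside every bin and come back as NaN:

# The last edge (12000000) must be at least the maximum price
bins = [0, 1000000, 2000000, 3000000, 4000000, 12000000]
print(bins)
labels = pd.cut(data['PRICE'], bins, right=True)
labels = labels.value_counts().sort_index()
labels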
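
If you would rather not hard-code the maximum price as the last edge, one option is to use `np.inf` as the final boundary, which makes the last bin open-ended. The following is only a sketch: the `prices` series is made-up sample data standing in for `data['PRICE']`, and the `bin_labels` names are my own choice, not something from the question.

```python
import numpy as np
import pandas as pd

# Hypothetical sample prices standing in for data['PRICE']
prices = pd.Series(
    [250_000, 900_000, 1_500_000, 2_200_000, 3_900_000,
     4_500_000, 5_200_000, 11_500_000],
    name="PRICE",
)

# Edges every 1,000,000 up to 4,000,000, then one open-ended bin
edges = list(range(0, 5_000_000, 1_000_000)) + [np.inf]

# Illustrative labels; the last one marks the catch-all bin
bin_labels = ["0-1M", "1M-2M", "2M-3M", "3M-4M", "4M+"]

binned = pd.cut(prices, bins=edges, labels=bin_labels, right=True)
counts = binned.value_counts().sort_index()
print(counts)
```

With this sample data, the three prices above 4,000,000 all land in the `4M+` bin, and the bin keeps working if a more expensive property shows up later.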

Also, have a look here for different answers on that topic:

Binning column with python pandas

Piotrek