I bin a column using pandas' pd.qcut(). I would then like to smooth the values by replacing each one with the mean of its bin.
I generate my bins with something like:
pd.qcut(col, 3)
For example, given the column values [4, 8, 15, 21, 21, 24, 25, 28, 34]
and the generated bins:
Bin1 [4, 15]: 4, 8, 15
Bin2 [21, 24]: 21, 21, 24
Bin3 [25, 34]: 25, 28, 34
I would like to replace the values with the following means
Mean of Bin1 (4, 8, 15) = 9
Mean of Bin2 (21, 21, 24) = 22
Mean of Bin3 (25, 28, 34) = 29
Therefore:
Bin1: 9, 9, 9
Bin2: 22, 22, 22
Bin3: 29, 29, 29
making the final dataset: [9, 9, 9, 22, 22, 22, 29, 29, 29]
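One possible approach to the mean smoothing described above, sketched under a few assumptions: the data lives in a DataFrame with a column named col, and the helper column names bin and smoothed are made up for illustration. It labels each row with its qcut bin and then broadcasts the per-bin mean back onto the rows with groupby(...).transform("mean").

```python
import pandas as pd

# Example data from the question; the column name "col" is illustrative.
df = pd.DataFrame({"col": [4, 8, 15, 21, 21, 24, 25, 28, 34]})

# labels=False makes qcut return integer bin codes (0, 1, 2)
# instead of Interval objects, which is convenient for grouping.
df["bin"] = pd.qcut(df["col"], 3, labels=False)

# transform("mean") returns one value per original row:
# the mean of the bin that row belongs to.
df["smoothed"] = df.groupby("bin")["col"].transform("mean")

print(df["smoothed"].tolist())
# [9.0, 9.0, 9.0, 22.0, 22.0, 22.0, 29.0, 29.0, 29.0]
```

The result is a float column; if integer output is needed, it would have to be rounded or cast explicitly, which only makes sense when the bin means happen to be whole numbers as they are here.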
How can I also add a column in which each value is replaced by the closest boundary of its bin?
Bin1: 4, 4, 15
Bin2: 21, 21, 24
Bin3: 25, 25, 34
making the final dataset: [4, 4, 15, 21, 21, 24, 25, 25, 34]
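A minimal sketch of the boundary smoothing, again with assumed names (col, bin, boundary). It takes the boundaries to be the smallest and largest observed value in each bin, as in the example above, not the interval edges that qcut computes. It also assumes that a value exactly halfway between the two boundaries goes to the lower one; the question does not specify how ties should be handled.

```python
import pandas as pd

# Example data from the question; the column name "col" is illustrative.
df = pd.DataFrame({"col": [4, 8, 15, 21, 21, 24, 25, 28, 34]})
df["bin"] = pd.qcut(df["col"], 3, labels=False)

# Per-row lower and upper boundary: min and max of the row's bin.
grouped = df.groupby("bin")["col"]
lo = grouped.transform("min")
hi = grouped.transform("max")

# Keep the lower boundary where it is at least as close as the upper one
# (ties go to the lower boundary: an assumption), otherwise use the upper.
closer_to_lo = (df["col"] - lo) <= (hi - df["col"])
df["boundary"] = lo.where(closer_to_lo, hi)

print(df["boundary"].tolist())
# [4, 4, 15, 21, 21, 24, 25, 25, 34]
```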
This is very similar to this question, which is for R.