
I want to create a summary of my data, which looks like this:

A    B
------
10   2
20   4
5    6
5    8
9    14

I want to group these rows by the values of column B, quantized into ranges of width 5. If a value of column B falls in the range 1-5, the row belongs to that group. In the example here, rows 1 and 2 fall in group 1-5, rows 3 and 4 belong to group 6-10, and row 5 belongs to group 11-15. Each group then collapses into a single row containing the mean of its column A values. So we would end up with:

A    B
------
15  1-5
5   6-10
9   11-15

How can this be done with pandas, without iterating over the rows?

ealiaj
    Possible duplicate of [Binning column with python pandas](https://stackoverflow.com/questions/45273731/binning-column-with-python-pandas) – Rahul Agarwal Oct 18 '18 at 11:22

1 Answer

Use pd.cut with bin edges built by np.arange and labels built by a list comprehension, then group by the binned column and aggregate with mean:

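import numpy as np
import pandas as pd

df = pd.DataFrame({'A': [10, 20, 5, 5, 9], 'B': [2, 4, 6, 8, 14]})

# Edges 0, 5, 10, 15 give right-closed bins (0, 5], (5, 10], (10, 15]
bins = np.arange(0, 16, 5)
labels = [f'{i+1}-{j}' for i, j in zip(bins[:-1], bins[1:])]

binned = pd.cut(df['B'], bins=bins, labels=labels)
df1 = df.groupby(binned)['A'].mean().reset_index()[['A','B']]
print(df1)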

      A      B
0  15.0    1-5
1   5.0   6-10
2   9.0  11-15
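
The upper bin edge above is hard-coded for the sample data. As a minimal sketch, assuming the same two columns and a bin width of 5, the edges can instead be derived from the largest value in B. The names width, upper and summary are introduced here for illustration only, and observed=True is passed so that only bins that actually contain rows appear in the result:

```python
import numpy as np
import pandas as pd

# Sample data from the question
df = pd.DataFrame({"A": [10, 20, 5, 5, 9], "B": [2, 4, 6, 8, 14]})

width = 5  # bin width from the question (illustrative variable name)

# Round the largest B value up to the next multiple of the width,
# so the last bin always covers it.
upper = int(np.ceil(df["B"].max() / width)) * width
bins = np.arange(0, upper + 1, width)
labels = [f"{lo + 1}-{hi}" for lo, hi in zip(bins[:-1], bins[1:])]

# pd.cut uses right-closed intervals by default, so the edge 5 belongs
# to the "1-5" bin and the edge 10 to the "6-10" bin.
binned = pd.cut(df["B"], bins=bins, labels=labels)

summary = (
    df.groupby(binned, observed=True)["A"]
      .mean()
      .reset_index()[["A", "B"]]
)
print(summary)
```

This assumes the values in B are positive; a value of 0 or below would fall outside the first bin and be dropped from the grouping.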
jezrael