Return the max value in a new column for each repeating value in the other column

Question

Background: I want to generate a new column named datasample based on another column named end_bin in a table.

Question: Is there a way to return the max value in each row of the new column if the value is repeated in the other column?

Expected result:

end_bin	datasample
6	1
8	1
10	1
2	3
3	1
2	3
2	3

I couldn't find a method to do this in pandas; any help is appreciated :)

So the value is simply the number of occurrences of that value in end_bin, i.e. 6 occurred just once in end_bin, and so did the others, except 2, which occurred 3 times in total across end_bin. Hence 3 is displayed in every row whose end_bin value is 2. – Pranav Arora, Mar 29 '22 at 09:48
I hope this is what you are looking for: _df = pd.DataFrame(data={"end_bin": [6, 8, 10, 2, 3, 2, 2]}); count_ser = _df["end_bin"].value_counts(); _df["datasample"] = _df["end_bin"].map(count_ser); _df – JAbr, Mar 29 '22 at 10:04

score 1 · Accepted Answer · answered Mar 29 '22 at 09:49

Your question is unclear, but it looks like you want the size per group:

df['datasample'] = df.groupby('end_bin')['end_bin'].transform('size')

Output:

   end_bin  datasample
0        6           1
1        8           1
2       10           1
3        2           3
4        3           1
5        2           3
6        2           3
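
For reference, here is a self-contained sketch of the same idea. It assumes the data is the seven-row end_bin column from the question, and the `alt` variable is only an illustrative name for an equivalent approach using `value_counts` with `map`:

```python
import pandas as pd

# Sample data taken from the question's expected result
df = pd.DataFrame({"end_bin": [6, 8, 10, 2, 3, 2, 2]})

# Size of each end_bin group, broadcast back to every row of that group
df["datasample"] = df.groupby("end_bin")["end_bin"].transform("size")

# Equivalent alternative: count each value once, then look the count up per row
alt = df["end_bin"].map(df["end_bin"].value_counts())

print(df)
```

Both approaches give the occurrence count of each row's end_bin value; `transform` keeps the original row order and index, which is why the result can be assigned directly as a new column.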

answered Mar 29 '22 at 09:49

mozway


And don't forget to convert it to a wiki. – jezrael Mar 29 '22 at 09:51
What happened? Why was it answered? I am surprised. – jezrael Mar 29 '22 at 09:56
