I have a pandas dataframe with some columns in it. The column I am interested in is something like this,
df['col'] = ['A', 'A', 'B', 'C', 'B', 'A']
I want to make another column say, col_count
such that it shows count value in col
from that index to the end of the column.
The first A
in the column should have a value 3 because there is 3 occurrence of A
in the column from that index. The second A
will have value 2
and so on.
Finally, I want to get the following result,
col col_count
0 A 3
1 A 2
2 B 2
3 C 1
4 B 1
5 A 1
How can I do this effectively in pandas.? I was able to do this by looping through the dataframe and taking a unique count of that value for a sliced dataframe.
Is there an efficient method to do this.? Something without loops preferable.
Another part of the question is, I have another column like this along with col
,
df['X'] = [10, 40, 10, 50, 30, 20]
I want to sum up this column in the same fashion I wanted to count the column col
.
For instance, At index 0, I will have 10 + 40 + 20 as the sum. At index 1, the sum will be 40 + 20. In short, instead of counting, I want to sum up another column.
The result will be like this,
col col_count X X_sum
0 A 3 10 70
1 A 2 40 60
2 B 2 10 40
3 C 1 50 50
4 B 1 30 30
5 A 1 20 20