I have a pandas dataframe as below
Dominant_Topic word appearance
Topic 0 aaaawww 50
Topic 0 aacn 100
Topic 0 aaren 20
Topic 0 aarongoodwin 200
Topic 1 aaronjfentress 10
Topic 1 aaronrodger 20
Topic 1 aasmiitkap 30
Topic 2 aavqbketmh 10
Topic 2 ab 10
Topic 2 abandon 1
I want to get a dense rank for each partition, the partition column being the column named Dominant_Topic
. The ranking should be based in descending manner on how many times the words in each partition appear. So the output would look like -
Dominant_Topic word appearance dense_rank
Topic 0 aaaawww 50 3
Topic 0 aacn 100 2
Topic 0 aaren 20 4
Topic 0 aarongoodwin 200 1
Topic 1 aaronjfentress 10 3
Topic 1 aaronrodger 20 2
Topic 1 aasmiitkap 30 1
Topic 2 aavqbketmh 10 1
Topic 2 ab 10 1
Topic 2 abandon 1 2
How do I achieve this in Pandas ?
The SQL equivalent looks something like this -
select *, dense_rank() over( partition by dominant_topic order by appearance desc)
from table