0

I am working on a csv file and I want to add a ranking column that will reset when the value of the name column is different. This is my original code:

import pandas as pd

l = [{'sku': 'WD-0215', 'name': 'Sofa', 'price': '$1,299.00'}, 
      {'sku': 'WD-1345', 'name': 'Sofa', 'price': '$1,399.00'},
      {'sku': 'WD-0416', 'name': 'Sofa', 'price': '$1,199.00'},
      {'sku': 'sfr20', 'name': 'TV', 'price': '$1,861.00'},
      {'sku': 'sfr40', 'name': 'TV', 'price': '$1,561.00'}, 
      {'sku': 'sfr30', 'name': 'TV', 'price': '$1,961.00'}
    ]

df = pd.DataFrame(l)
df["rank"]=""
for i in range(len(df.values)):    
    df.iloc[i,3]=i+1
    i+=1
df

This will create a rank column with values from 1 to 6, but my expected output should be like this:

   sku      name      price      rank
WD-0215     Sofa      $1299.00    1
WD-1345     Sofa      $1399.00    2
WD-0416     Sofa      $1199.00    3
sfr20       TV        $1861.00    1
sfr40       TV        $1561.00    2
sfr30       TV        $1961.00    3
Shaido
  • 27,497
  • 23
  • 70
  • 73
kali
  • 117
  • 7

1 Answers1

3

You can use groupby together with cumcount to get a cumulative count for each value of name:

df['rank'] = df.groupby('name').cumcount() + 1

Result:

        sku  name      price  rank
0   WD-0215  Sofa  $1,299.00     1
1   WD-1345  Sofa  $1,399.00     2
2   WD-0416  Sofa  $1,199.00     3
3     sfr20    TV  $1,861.00     1
4     sfr40    TV  $1,561.00     2
5     sfr30    TV  $1,961.00     3
Shaido
  • 27,497
  • 23
  • 70
  • 73