Suppose I have a dataframe which currently has data like this:

   T week
0  T-1
1  T-1
2  T-1
3  T-1
4  T-2
5  T-2
6  T-2
7  T-3
8  T-3
9  T-3
10 T-3

I want the index to restart for each T- group I am dealing with. For example, this is the dataframe I want:

   T week
1  T-1
2  T-1
3  T-1
4  T-1
1  T-2
2  T-2
3  T-2
1  T-3
2  T-3
3  T-3
4  T-3

Note how the index starts from 1 again (instead of 0) when there is a new T-group.

I tried to code this but it didn't really work. Could use some help!

import os,xlrd,pandas as pd

df = pd.read_excel(r'dir\file.xlsx')
book = xlrd.open_workbook(r'dir\file.xlsx')
sheet = book.sheet_by_name('Sheet1')

t_value = None
next_t = None
tabcount = 0
idx = 1
i = 1

while i!=sheet.nrows:
    t_value = df['T Week'][i]
    next_t = df['T Week'][i+1]
    if t_value == next_t:
        tabcount+=1
        df.at[i,'Num'] = idx
        idx+=1
    else:
        idx = 0
        df.at[i, 'Num'] = idx
    i+=1
dexter27

1 Answer

Use groupby and cumcount. We'll also use add to shift the cumcount by 1, since cumcount starts at 0:

df.index = df.groupby('T week').cumcount().add(1)

[out]

  T week
1    T-1
2    T-1
3    T-1
4    T-1
1    T-2
2    T-2
3    T-2
1    T-3
2    T-3
3    T-3
4    T-3
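
For anyone who wants to try this without the Excel file, here is a minimal self-contained sketch. The sample frame is made up to mirror the data in the question, and the `counter` name is just for illustration:

```python
import pandas as pd

# Hypothetical sample data mirroring the "T week" column from the question.
df = pd.DataFrame({"T week": ["T-1"] * 4 + ["T-2"] * 3 + ["T-3"] * 4})

# cumcount() numbers the rows inside each group starting at 0;
# add(1) shifts the numbering so each group starts at 1.
counter = df.groupby("T week").cumcount().add(1)

# Use the per-group counter as the new index.
df.index = counter
print(df)
```

If you would rather keep the default index, the same Series can be stored as a regular column instead of being assigned to `df.index`.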
Chris Adams
  • Thank you, it works! I have two questions though: how did you come across those functions? I tried to look around a lot before I coded it. And could you help me figure out what went wrong in the logic of my original code? – dexter27 Mar 06 '20 at 16:14
  • Hey @dexter27, glad it helped. A great resource for learning pandas is [Python for Data Analysis](https://www.oreilly.com/library/view/python-for-data/9781491957653/) by Wes McKinney (the creator of pandas). That book helped me a lot when I started using pandas. As for your own logic, I'm useless with `while` loops, I'm afraid. In general, you should try to avoid looping over dataframes; there is almost always a better (and vectorized) solution. Check out [this answer](https://stackoverflow.com/a/55557758/10201580) for more detail on the subject. – Chris Adams Mar 06 '20 at 16:40
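
On the loop logic, a few things in the question's code look likely to cause trouble:

  • It compares each row with the next one, so the last row of a group falls into the reset branch.
  • It resets `idx` to 0 rather than 1.
  • It starts at `i = 1`, so row 0 is skipped.
  • `df['T Week'][i+1]` eventually reads past the last row.

The following is only a rough sketch of a loop that compares with the previous row instead. The sample data and the names `nums` and `prev` are hypothetical, not taken from the original file:

```python
import pandas as pd

# Hypothetical sample data mirroring the "T week" column from the question.
df = pd.DataFrame({"T week": ["T-1"] * 4 + ["T-2"] * 3 + ["T-3"] * 4})

nums = []    # running position of each row inside its block
prev = None  # label seen on the previous row
idx = 0

for t_value in df["T week"]:
    # Continue counting while the label repeats; restart at 1 when it changes.
    idx = idx + 1 if t_value == prev else 1
    nums.append(idx)
    prev = t_value

df["Num"] = nums
```

Note that this counts consecutive runs, so it matches the `groupby`/`cumcount` result only when each T- group sits in one contiguous block, as it does in the question. The vectorized version remains the simpler choice.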