-1

I'm quite new to python and have the following question

I have a large dataframe with machine data and serial numbers. I need to create a new column with the order of Events for each 'SerialNumber'.

import pandas as pd

data = {'SerialNumber': ['1111', '2222', '1111', '3333', '1111'],
        'FirmwareVersion': ['03.01.01', '00.00.00', '03.01.01', '02.01.01', '03.01.01'],
        'InstrInstalldate': ['23.11.2011', '23.11.2011', '23.11.2011', '23.11.2011', '23.11.2011'],
        'EventDate': ['06.01.2014 15:36', '05.01.2014 04:26', '05.01.2014 04:26', '05.01.2014 04:25', '31.12.2015 23:37'],
        'Event': ['BS:Cold Boot Start occured', 'Instrument hardware error 10700.', 'CBS:Cold Boot Start occured', 'Instrument hardware error 10700.', 'Instrument battery low, charging required.']
        }

df = pd.DataFrame(data)
print(df)

Desired output should look like


 SerialNumber FirmwareVersion         EventDate InstrInstalldate  Count
0         1111        03.01.01  06.01.2014 15:36       23.11.2011  1
1         2222        00.00.00  05.01.2014 04:26       23.11.2011  1
2         1111        03.01.01  05.01.2014 04:26       23.11.2011  2
3         3333        02.01.01  05.01.2014 04:25       23.11.2011  1
4         1111        03.01.01  31.12.2015 23:37       23.11.2011  3
Tom
  • 59
  • 5

1 Answers1

1

Use groupby + cumcount:

df['Count'] = df.groupby('SerialNumber').cumcount() + 1

df
  SerialNumber  ... Count
0         1111  ...     1
1         2222  ...     1
2         1111  ...     2
3         3333  ...     1
4         1111  ...     3

[5 rows x 6 columns]
Psidom
  • 209,562
  • 33
  • 339
  • 356