Suppose I have the following data frame, indexed by date:

             Sym    P    P_2  R
01.01.2020   AAPL   100  115  0.2
01.01.2020   AA     200  205  0.4
01.01.2020   MM     300  290  0.5
02.01.2020   AAPL   101  116  0.3
02.01.2020   AA     201  206  0.2
02.01.2020   MM     298  300  0.5
03.01.2020   AA     200  205  0.3
03.01.2020   MM     300  305  0.2

How can I add a new column ID so that each Sym gets its own unique id, the same on every date?

I would expect

             Sym   ID  P    P_2  R
01.01.2020   AAPL  1   100  115  0.2
01.01.2020   AA    2   200  205  0.4
01.01.2020   MM    3   300  290  0.5
02.01.2020   AAPL  1   101  116  0.3
02.01.2020   AA    2   201  206  0.2
02.01.2020   MM    3   298  300  0.5
03.01.2020   AA    2   200  205  0.3
03.01.2020   MM    3   300  305  0.2

The answer provided in "assign unique ID to each unique value in group after pandas groupby" does not give what I expect. It raises an error: Cannot set a DataFrame with multiple columns to the single column ID

  • Did you try `df['ID'] = df.groupby(level=0)['Sym'].transform(lambda x: pd.factorize(x)[0]) + 1`? – jezrael Jan 02 '23 at 13:16
  • Yes, I tried it, but unfortunately the results are not what I want. I would like to assign an ID to each Sym, and this ID should not change across dates (data.index). So, for every date, I should get 1 for AAPL, 2 for AA, and so on. Note that some symbols have no data on some dates, in case that affects the solution. – Beginner_01 Jan 02 '23 at 13:27
  • So you need `df['ID'] = pd.factorize(df['Sym'])[0] + 1`? – jezrael Jan 02 '23 at 13:28
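
For reference, here is a minimal sketch of the `pd.factorize` suggestion from the last comment, run on the sample data from the question. How the frame is built (a string date index, `df.insert` to place ID right after Sym) is an assumption made for illustration, not something stated in the question. The `groupby(level=0)` variant numbers symbols separately within each date, so on 03.01.2020, where AAPL is missing, AA would get 1 instead of 2. Factorizing the whole column at once avoids that.

```python
import pandas as pd

# Sample data from the question; the string date index is an assumption.
df = pd.DataFrame(
    {
        "Sym": ["AAPL", "AA", "MM", "AAPL", "AA", "MM", "AA", "MM"],
        "P": [100, 200, 300, 101, 201, 298, 200, 300],
        "P_2": [115, 205, 290, 116, 206, 300, 205, 305],
        "R": [0.2, 0.4, 0.5, 0.3, 0.2, 0.5, 0.3, 0.2],
    },
    index=["01.01.2020"] * 3 + ["02.01.2020"] * 3 + ["03.01.2020"] * 2,
)

# factorize returns integer codes in order of first appearance (starting at 0)
# together with the array of unique values; the date index is ignored.
codes, uniques = pd.factorize(df["Sym"])

# Shift the codes to start at 1 and place the column right after "Sym".
df.insert(1, "ID", codes + 1)

print(df)
```

Because the codes follow order of first appearance, AAPL, AA and MM get 1, 2 and 3 here, and a symbol that is missing on some dates keeps its ID on the dates where it does appear.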
