I have a data frame df with columns ID and N1, and I would like to calculate a column N2. Within each ID, the first value of N2 should equal N1, and every later value should be the current N1 divided by the previous N1 (for example 0.888/0.999 for the second row), and so on. The same logic restarts for the next ID. Can this be done in pandas WITHOUT using a for loop?

ID  N1  N2
1111    0.999   0.999
1111    0.888   0.888888889
1111    0.777   0.875
1111    0.666   0.857142857
1111    0.555   0.833333333
1111    0.444   0.8
1111    0.333   0.75
2222    0.998   0.998
2222    0.887   0.888777555
2222    0.776   0.874859076
2222    0.665   0.856958763
2222    0.554   0.833082707
2222    0.443   0.799638989
2222    0.332   0.749435666
2222    0.221   0.665662651
BigDataScientist

2 Answers

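This is 1 plus the percentage change of N1 within each ID. The first row of each group has no previous value, so its NaN is filled with N1 itself.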

df.assign(N2=df.groupby('ID').N1.pct_change().add(1).fillna(df.N1))

      ID     N1        N2
0   1111  0.999  0.999000
1   1111  0.888  0.888889
2   1111  0.777  0.875000
3   1111  0.666  0.857143
4   1111  0.555  0.833333
5   1111  0.444  0.800000
6   1111  0.333  0.750000
7   2222  0.998  0.998000
8   2222  0.887  0.888778
9   2222  0.776  0.874859
10  2222  0.665  0.856959
11  2222  0.554  0.833083
12  2222  0.443  0.799639
13  2222  0.332  0.749436
14  2222  0.221  0.665663
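
For anyone who wants to run this end to end, here is a minimal self-contained sketch of the same idea. The frame is rebuilt from the question's data, and the names `pct` and `out` are only illustrative. It relies on the identity current/previous = 1 + (current - previous)/previous, which is why adding 1 to `pct_change()` gives the ratio.

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "ID": [1111] * 7 + [2222] * 8,
    "N1": [0.999, 0.888, 0.777, 0.666, 0.555, 0.444, 0.333,
           0.998, 0.887, 0.776, 0.665, 0.554, 0.443, 0.332, 0.221],
})

# Percentage change within each ID; the first row of every group is NaN
pct = df.groupby("ID")["N1"].pct_change()

# 1 + pct_change equals current / previous; fill each group's first row with N1
out = df.assign(N2=pct.add(1).fillna(df["N1"]))
print(out)
```

Note that `fillna(df["N1"])` would also overwrite any NaN that comes from a missing N1 in the middle of a group, so this assumes N1 has no gaps.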
piRSquared

Yes, you can use groupby(), transform() and shift(). Filling the shifted values with fillna(1) handles the first row of each group: it is divided by 1 and so stays equal to N1.

df['N2'] = df.groupby("ID")['N1'].transform(lambda x: x/x.shift(1).fillna(1))
df

      ID     N1        N2
0   1111  0.999  0.999000
1   1111  0.888  0.888889
2   1111  0.777  0.875000
3   1111  0.666  0.857143
4   1111  0.555  0.833333
5   1111  0.444  0.800000
6   1111  0.333  0.750000
7   2222  0.998  0.998000
8   2222  0.887  0.888778
9   2222  0.776  0.874859
10  2222  0.665  0.856959
11  2222  0.554  0.833083
12  2222  0.443  0.799639
13  2222  0.332  0.749436
14  2222  0.221  0.665663
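
As a possible variation (a sketch, not part of the answer as posted), the lambda can be dropped by calling `shift()` directly on the grouped column, which keeps the whole computation vectorized. The shortened sample frame and the names `prev` and `via_transform` are assumptions for illustration.

```python
import pandas as pd

# Shortened sample of the question's data
df = pd.DataFrame({
    "ID": [1111, 1111, 1111, 2222, 2222],
    "N1": [0.999, 0.888, 0.777, 0.998, 0.887],
})

# Previous N1 within each ID; NaN on the first row of every group
prev = df.groupby("ID")["N1"].shift()

# Divide by the previous value, or by 1 on the first row of each group
df["N2"] = df["N1"] / prev.fillna(1)

# Same result as the transform/lambda version above
via_transform = df.groupby("ID")["N1"].transform(lambda x: x / x.shift(1).fillna(1))
print(df)
```

Both forms should give the same N2. The direct `shift()` version avoids calling a Python function once per group, which may matter when there are many IDs.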
Scott Boston