1

I'm trying to figure out how to use the value from last column of Y with multiple conditions to derive Z and P.

  • when Y is between 2 to 6 = last column Z + 5,
  • when Y is 1, next column Z = 0 + 5,
  • when Y is between 2 to 6, column Q = 0,
  • when Y is 1, column Q = X multiple by Z

Original DF:

╔════╦═══╗
║ X  ║ Y ║
╠════╬═══╣
║ 29 ║ 5 ║
║ 28 ║ 4 ║
║ 32 ║ 3 ║
║ 29 ║ 3 ║
║ 26 ║ 1 ║
║ 38 ║ 5 ║
║ 25 ║ 2 ║
║ 33 ║ 3 ║
║ 25 ║ 3 ║
║ 25 ║ 5 ║
║ 40 ║ 1 ║
║ 30 ║ 6 ║
║ 31 ║ 3 ║
║ 38 ║ 5 ║
╚════╩═══╝

Output needed:

╔════╦═══╦═════╦══════╗
║ X  ║ Y ║  Z  ║  P   ║
╠════╬═══╬═════╬══════╣
║ 29 ║ 5 ║  5  ║    0 ║
║ 28 ║ 4 ║ 10  ║    0 ║
║ 32 ║ 3 ║ 15  ║    0 ║
║ 29 ║ 3 ║ 20  ║    0 ║
║ 26 ║ 1 ║ 25  ║  650 ║
║ 38 ║ 5 ║  5  ║    0 ║
║ 25 ║ 2 ║ 10  ║    0 ║
║ 33 ║ 3 ║ 15  ║    0 ║
║ 25 ║ 3 ║ 20  ║    0 ║
║ 25 ║ 5 ║ 25  ║    0 ║
║ 40 ║ 1 ║ 30  ║ 1200 ║
║ 30 ║ 6 ║  5  ║    0 ║
║ 31 ║ 3 ║ 10  ║    0 ║
║ 38 ║ 5 ║ 15  ║    0 ║
╚════╩═══╩═════╩══════╝

i've did some research and found that shifted is used, however, i can't figure out how to add the other conditions

data = {'X':[29,28,32,29,26,38,25,33,25,25,40,30,31,38], 'Y':[5,4,3,3,1,5,2,3,3,5,1,6,3,5]}

Many thanks

ManOnTheMoon
  • 587
  • 2
  • 11
  • It depends on The condition you need. But this answer may help you: [link](https://stackoverflow.com/a/59790721/11261546) – Ivan Jan 20 '20 at 08:37
  • Switch "when" with "if". – Guy Jan 20 '20 at 08:37
  • 1
    Is possible `1` consecutive value in `Y` ? If yes, whats happen? – jezrael Jan 20 '20 at 09:00
  • Yes,it is possible for 1 consecutive value in Y. When 1 is consecutive value in Y, it is still be next column Z = 0 + 5. Therefore, every time Y is 1, the next column Z will reset and start adding 5 till another 1 appears in Y – ManOnTheMoon Jan 20 '20 at 13:55

2 Answers2

2

Use GroupBy.cumcount by helper Series with shifted mask created by Series.cumsum then add Series.add with 1 and multiple by 5 with Series.mul and for next column use numpy.where:

m = df['Y'].eq(1)
df['Z'] = df.groupby(m.shift().bfill().cumsum()).cumcount().add(1).mul(5)
df['P'] = np.where(m, df.X.mul(df.Z), 0)
print (df)
     X  Y   Z     P
0   29  5   5     0
1   28  4  10     0
2   32  3  15     0
3   29  3  20     0
4   26  1  25   650
5   38  5   5     0
6   25  2  10     0
7   33  3  15     0
8   25  3  20     0
9   25  5  25     0
10  40  1  30  1200
11  30  6   5     0
12  31  3  10     0
13  38  5  15     0
jezrael
  • 822,522
  • 95
  • 1,334
  • 1,252
2

Try this

s = df.Y.eq(1).shift(fill_value=True).cumsum()
df['Z'] = df.groupby(s).Y.cumcount().add(1) * 5
df['P'] = df.X.where(df.Y.eq(1), 0) * df.Z

In [756]: df
Out[756]:
     X  Y   Z     P
0   29  5   5     0
1   28  4  10     0
2   32  3  15     0
3   29  3  20     0
4   26  1  25   650
5   38  5   5     0
6   25  2  10     0
7   33  3  15     0
8   25  3  20     0
9   25  5  25     0
10  40  1  30  1200
11  30  6   5     0
12  31  3  10     0
13  38  5  15     0
Andy L.
  • 24,909
  • 4
  • 17
  • 29