4

would like some help with the following problem. I currently have a panda dataframe with 3 columns - test1, test2, test3

What I hope to achieve is result in the result_column, where the logic will be:

1) If value in test1 AND test2 > 0, then return value of test3

2) Else If value test1 AND test2 < 0, then return NEGATIVE value of test3

3) Otherwise return 0

  test1  test2  test3  result_column
0    0.5    0.1   1.25    1.25
1    0.2   -0.2   0.22       0
2   -0.3   -0.2   1.12   -1.12
3    0.4   -0.3   0.34       0
4    0.5      0   0.45       0

This is my first time posting a question on python and pandas. Apologies in advance if the formatting here is not optimum. Appreciate any help I can get!

2 Answers2

4

I think need numpy.select with conditions chained by & (AND) or select all tested columns by subset [[]], compare ant test by DataFrame.all:

m1 = (df.test1 > 0) & (df.test2 > 0)
#alternative
#m1 = (df[['test1', 'test2']] > 0).all(axis=1)

m2 = (df.test1 < 0) & (df.test2 < 0)
#alternative
#m2 = (df[['test1', 'test2']] < 0).all(axis=1)

df['result_column'] = np.select([m1,m2], [df.test3, -df.test3], default=0)
print (df)
   test1  test2  test3  result_column
0    0.5    0.1   1.25           1.25
1    0.2   -0.2   0.22           0.00
2   -0.3   -0.2   1.12          -1.12
3    0.4   -0.3   0.34           0.00
4    0.5    0.0   0.45           0.00
jezrael
  • 822,522
  • 95
  • 1,334
  • 1,252
  • Thanks, this actually works well! it, however, gave a "SettingwithCopyWarning" error though. Although I am still able to run this block. Do I need to worry about this error? C:\blp\BQuant\environments\bqnt-0.42.2\lib\site-packages\bqnb\kernelapp.py:4: SettingWithCopyWarning: A value is trying to be set on a copy of a slice from a DataFrame. Try using .loc[row_indexer,col_indexer] = value instead – Singapore 123 Mar 15 '18 at 07:26
  • I think problem should be another - I guess [need copy](https://stackoverflow.com/a/46728170/2901002) if filter. Or maybe problem with `set values` - check [here](https://stackoverflow.com/questions/20625582/how-to-deal-with-settingwithcopywarning-in-pandas) – jezrael Mar 15 '18 at 07:30
  • @Singapore123 - Can you show me code above `df['result_column'] = np.select([m1,m2], [df.test3, -df.test3], default=0)` ? – jezrael Mar 15 '18 at 07:54
  • I used the one you suggested - m1 = (df.test1 > 0) & (df.test2 > 0) m2 = (df.test1 < 0) & (df.test2 < 0) and it works well – Singapore 123 Mar 15 '18 at 08:54
3

Clever use of np.sign and logic
If both > 0 or both < 0 then product is 1, otherwise product is -1 or 0. If both > 0 then sign of sum is 1. If both < 0 then sign of sum is -1, else 0. The product of these things is exactly what we want.

v = np.sign(df[['test1', 'test2']].values)
df.assign(result_column=v.prod(1) * np.sign(v.sum(1)) * df.test3 + 0)

   test1  test2  test3  result_column
0    0.5    0.1   1.25           1.25
1    0.2   -0.2   0.22           0.00
2   -0.3   -0.2   1.12          -1.12
3    0.4   -0.3   0.34           0.00
4    0.5    0.0   0.45           0.00
piRSquared
  • 285,575
  • 57
  • 475
  • 624