The use case is the following: if several columns of a pandas DataFrame are all greater than zero, I want to create a new column with value 1; if the same columns are all negative, I want to set -1; otherwise, I want to set 0.
Now I want to extend this. Say I check the conditions on 4 columns, but I still want to assign the corresponding value if at least three of them hold. An example is below.
import pandas as pd
import numpy as np

df = pd.DataFrame(
    [
        [1, 2, 3, 4, 5],
        [-1, -2, -3, -4, -5],
        [1, 2, -1, -2, -3],
        [1, 2, 3, -1, -2]
    ],
    columns=list('ABCDE'))

def f(df):
    dst = pd.Series(np.zeros(df.shape[0], dtype=int))
    dst[(df < 0).all(1)] = -1
    dst[(df > 0).all(1)] = 1
    return dst

columns = ['A', 'B', 'C', 'D']
df['dst'] = f(df[columns])
The code above would return the following DataFrame:
   A  B  C  D  E  dst
0  1  2  3  4  5    1
1 -1 -2 -3 -4 -5   -1
2  1  2 -1 -2 -3    0
3  1  2  3 -1 -2    0
The expected behavior would be:
- For row 0, dst should be 1, as A to D hold the positive condition.
- For row 1, dst should be -1, as A to D hold the negative condition.
- For row 2, dst should be 0, as A to D do not meet any of the conditions.
- For row 3, dst should be 1, as A to C hold the positive condition, and only D does not hold.
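
One way to read the expected behavior above is as a count per row rather than an all-or-nothing check: count how many of the selected columns are positive (or negative) and compare that count with a threshold. The sketch below is only an illustration under that reading. The helper name sign_by_count and its min_count parameter are hypothetical, not part of pandas, and "three of them hold" is assumed to mean "at least three".

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    [
        [1, 2, 3, 4, 5],
        [-1, -2, -3, -4, -5],
        [1, 2, -1, -2, -3],
        [1, 2, 3, -1, -2],
    ],
    columns=list('ABCDE'),
)


def sign_by_count(frame, min_count):
    """Hypothetical helper: 1 / -1 if at least `min_count` columns in a row
    are positive / negative, otherwise 0."""
    n_pos = (frame > 0).sum(axis=1)  # positive entries per row
    n_neg = (frame < 0).sum(axis=1)  # negative entries per row
    # np.select takes the first matching condition, so the positive
    # check wins if both conditions were ever true for the same row.
    values = np.select([n_pos >= min_count, n_neg >= min_count], [1, -1], default=0)
    return pd.Series(values, index=frame.index)


columns = ['A', 'B', 'C', 'D']
df['dst'] = sign_by_count(df[columns], min_count=3)
print(df)
```

With min_count=3 on four columns, both conditions cannot be true for the same row, so the order of the conditions does not matter here. It would matter if the threshold were at most half the number of columns. For the sample data, dst comes out as 1, -1, 0, 1 for rows 0 to 3.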