I have a pandas DataFrame like the one below:

x y z
1 2 3
na 1 4
na na 5

Now I want to add another column, a, whose value depends on x, y and z. If x is available, a should be "yes". If x is NA, check y: if y is available, a should be "no". Otherwise, a should be the same as z if z is available, and 0 if it is not. I have the following function in R:

cur_sta <- function(data){

    sta <- ifelse(!is.na(data$x),"yes",    
        ifelse(!is.na(data$y),"no",    
        ifelse(!is.na(data$z),data$z,0)))

}

How can I achieve the same in python?

EDIT:

I tried the following:

conditions = [
    (not pd.isnull(data["x"].item())),
    (not pd.isnull(data["y"].item())),
    (not pd.isnull(data["z"].item()))]
choices = ['yes', 'no', data["z"]]
data['col_sta'] = np.select(conditions, choices, default='0')

but I am getting the following error:

ValueError: can only convert an array of size 1 to a Python scalar

How can I fix this?

M--
Ank
  • @jezrael The question you linked did not help me much. So please reopen this question. Thanks! – Ank Aug 19 '19 at 08:18

1 Answer

Use Series.notna to test for non-missing values:

conditions = [data["x"].notna(),
              data["y"].notna(),
              data["z"].notna()]
choices = ['yes', 'no', data["z"]]
data['col_sta'] = np.select(conditions, choices, default='0')
print(data)

     x    y  z col_sta
0  1.0  2.0  3     yes
1  NaN  1.0  4      no
2  NaN  NaN  5       5
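
For context on the error in the question: Series.item() only works on a Series holding exactly one element, so calling it on a three-row column raises the ValueError shown. Series.notna() instead returns one boolean per row, which is the shape np.select expects for its conditions. Below is a minimal self-contained sketch of the same approach. The construction of data is an assumption reconstructed from the table in the question, and the cast of z to str is an optional precaution that is not part of the answer above: some newer NumPy releases may refuse to combine string and integer choices in np.select, and with the cast the resulting column holds the string '5' rather than a number.

```python
import numpy as np
import pandas as pd

# Assumed reconstruction of the frame from the question (NaN marks missing values).
data = pd.DataFrame({
    "x": [1, np.nan, np.nan],
    "y": [2, 1, np.nan],
    "z": [3, 4, 5],
})

# Each condition is a boolean Series with one entry per row.
conditions = [data["x"].notna(), data["y"].notna(), data["z"].notna()]

# Cast z to str so that every choice (and the default) is a string and
# NumPy does not have to promote integers and strings to a common dtype.
choices = ["yes", "no", data["z"].astype(str)]

# The first matching condition wins; rows matching none get the default.
data["col_sta"] = np.select(conditions, choices, default="0")
print(data)
```

The conditions are checked in order, which mirrors the nested ifelse calls in the R function: x first, then y, then z, with '0' as the fallback.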
jezrael