-4

I have a table looking kind of like this - the last one I have added manually to explain wanted outcome

Image of tablesetup

Create one column for breed combining the columns p1-p3 where if true in p1_dog it add p1 value, if false in p1_dog but true in p2_dog add p2 value, if false look in p3_dog and if true add p3 value, else print - 'not a dog'.

I have tried using a for loop and also played around with np.select. Perhaps a np.where should work but dont know how

breeds = []
for row in df_copy:
    if ['p1_dog']:
        breeds.append(df_copy['p1'])
    elif ['p2_dog']:
        breeds.append(df_copy['p2'])
    elif ['p3_dog']:
        breeds.append(df_copy['p3'])
    else:
        breeds.append('not a dog')

This only give me the full list of values in each stor

Using np.select which I found here pandas if else conditions on multiple columns

This gives me a list of booleans

df_copy['breed'] = np.select([df_copy.p1_dog == True , df_copy.p2_dog == True], [df_copy.p1_dog, df_copy.p2_dog], default=df_copy.p3_dog)
  • So your output should be of the form `(p1_dog if p1) else (p2_dog if p2) else (p3_dog if p3) else "not a dog"`? Basically, the first column where the matching `p` column is true? – B Remmelzwaal Feb 17 '23 at 12:13

1 Answers1

0

Definitely not the prettiest and not nicely maintainable, but in this case it would work:

import pandas as pd
import numpy as np

df = pd.DataFrame({"p1": ["Chihuahua", "paper_towel", "basset", "Irish_terrier", "Pembroke"],
 "p1_dog": [True, False, True, True, False],
 "p2": ["malamute", "Labrador_retriever", "English_springer", "Irish_setter", "Cardigan"],
 "p2_dog": [True, True, True, True, False],
 "p3": ["kelpie", "spatula", "German_short-hared_pointer", "Chesapeake_Bay_Retriever", "Chihuahua"],
 "p3_dog": [True, False, True, True, True]
})

df['Wanted result'] = np.where(df["p1_dog"], df["p1"], np.where(df["p2_dog"], df["p2"], np.where(df["p3_dog"], df["p3"], "not a dog")))

print(df.to_string())

Basically chaining np.where() where if the condition for column 1 is True, it outputs the p1 name, but if False, checks the next boolean column, and so on.

Output:

[5 rows x 6 columns]
p1  p1_dog                  p2  p2_dog                          p3  p3_dog       Wanted result
0      Chihuahua    True            malamute    True                      kelpie    True           Chihuahua
1    paper_towel   False  Labrador_retriever    True                     spatula   False  Labrador_retriever
2         basset    True    English_springer    True  German_short-hared_pointer    True              basset
3  Irish_terrier    True        Irish_setter    True    Chesapeake_Bay_Retriever    True       Irish_terrier
4       Pembroke   False            Cardigan   False                   Chihuahua    True           Chihuahua

Also, please post your dataframe as raw text and not as an image. I had to copy the dataframe word for word, and some names were even cut off.

B Remmelzwaal
  • 1,581
  • 2
  • 4
  • 11