1

Aim: check whether each value in the pandas DataFrame column df['Outcome'] is in my set bad_outcomes. If a value is in the set, I want to assign a new variable landing_outcome the value 0. If not, I assign landing_outcome the value 1.

I am able to search a column df['Outcome'] and check if the values are in my set called 'bad_outcomes' using isin.

df[df['Outcome'].isin (bad_outcomes)]

This works. Then I try to put this in an if statement:

if df[df['Outcome'].isin (bad_outcomes)]:
    landing_outcome = 0

This gives me a ValueError:

ValueError: The truth value of a DataFrame is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().

Where am I going wrong? Is using isin the best way to do this?

I checked the Python manual for if statements and couldn't find an obvious syntax issue. I also searched this forum for the error message (there are many posts, but I couldn't see one for my use case). I'm new, so I hope this is OK to ask.

James
  • Your statement returns a filtered DataFrame, which in itself doesn't have a single true/false value. Apply what the message suggests to the Boolean mask instead: `if df['Outcome'].isin(bad_outcomes).any():`. – Tim Roberts Feb 08 '22 at 22:24
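
To make the error concrete, here is a minimal sketch of why the `if` fails and how reducing the mask to a single Boolean avoids it. The sample rows and the contents of `bad_outcomes` are made up for illustration; the real values are not shown in the question.

```python
import pandas as pd

# Hypothetical sample data; the real df and bad_outcomes come from the question.
df = pd.DataFrame({"Outcome": ["True ASDS", "False ASDS", "None None", "True RTLS"]})
bad_outcomes = {"False ASDS", "None None"}

mask = df["Outcome"].isin(bad_outcomes)  # Boolean Series, one value per row
bad_rows = df[mask]                      # DataFrame holding only the matching rows

# `if bad_rows:` raises ValueError because a DataFrame has no single truth value.
try:
    if bad_rows:
        pass
    raised = False
except ValueError:
    raised = True

# Reducing the mask to one bool makes the condition well defined.
any_bad = bool(mask.any())     # True if at least one row is a bad outcome
all_bad = bool(mask.all())     # True only if every row is a bad outcome
has_rows = not bad_rows.empty  # same answer as any_bad for this filter
```

Note that these checks answer a question about the whole column. They do not give each row its own 0 or 1, which is what the aim above asks for.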

2 Answers

3

Try using .loc

df.loc[df['Outcome'].isin(bad_outcomes), "landing_outcome"] = 0
df.loc[~df['Outcome'].isin(bad_outcomes), "landing_outcome"] = 1
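
As a rough end-to-end sketch of the two lines above, assuming made-up sample data (the rows and the contents of `bad_outcomes` are hypothetical), the mask can be computed once and reused. The `np.where` line is an alternative one-step form, not part of the original answer, and `landing_outcome_np` is a name chosen only for this illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; replace with the real df and bad_outcomes.
df = pd.DataFrame({"Outcome": ["True ASDS", "False ASDS", "None None", "True RTLS"]})
bad_outcomes = {"False ASDS", "None None"}

# Compute the Boolean mask once and reuse it for both assignments.
mask = df["Outcome"].isin(bad_outcomes)

df.loc[mask, "landing_outcome"] = 0   # rows whose outcome is in the set
df.loc[~mask, "landing_outcome"] = 1  # all remaining rows

# The first assignment leaves NaN in unmatched rows, so the column starts
# out as float; convert once every row has been filled.
df["landing_outcome"] = df["landing_outcome"].astype(int)

# Alternative (an assumption, not from the answer): build the column in one step.
df["landing_outcome_np"] = np.where(mask, 0, 1)
```

Both columns end up holding 0 for rows whose outcome is in the set and 1 otherwise.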

If this helps, do approve the solution and upvote it.

Raymond Toh
  • Hey Raymond, thanks for taking the time. The df.loc solution gives the set the value of 0, but I'm trying to give each item its own value of 0. If it's not in the set, I assign 1. Then I can add it as a new column in the df. I think my error was trying to use isin with an if statement. Posting my update below. – James Feb 09 '22 at 19:31
  • @James i have updated the answer. Do check it out! – Raymond Toh Feb 10 '22 at 04:03
  • Thanks @Raymond, that's great. I can't upvote you because my reputation is 2 points short of the 15 I need to vote, but it works great and thanks for the update. – James Feb 17 '22 at 21:48
0

I found an answer on conditional statements on codegrepper, which referenced a resource that linked back to Stack Overflow here: Pandas conditional creation of a series/dataframe column

Using this approach my solution was:

landing_class = [0 if outcome in bad_outcomes else 1 for outcome in df['Outcome']]
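
For comparison, a small sketch showing that the list comprehension and a vectorized isin expression give the same labels. The sample data is hypothetical, and the vectorized form is an alternative rather than part of the solution above.

```python
import pandas as pd

# Hypothetical sample data; replace with the real df and bad_outcomes.
df = pd.DataFrame({"Outcome": ["True ASDS", "False ASDS", "None None", "True RTLS"]})
bad_outcomes = {"False ASDS", "None None"}

# The list comprehension: one 0/1 label per row, in row order.
landing_class = [0 if outcome in bad_outcomes else 1 for outcome in df["Outcome"]]

# Vectorized alternative: invert the isin mask and cast the Booleans to integers.
vectorized = (~df["Outcome"].isin(bad_outcomes)).astype(int)
```

Either result can be assigned to a new column of df; the vectorized form avoids the Python-level loop.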
James