Trying to create a predetermined answer from conditions relating to 2 other columns within a dataframe

Question

I am trying to create a dataframe.

df = pd.DataFrame(columns=["Year", "Fuel", "Status", "Sex", "Service", "Expected"])

The other columns contain data created using np.random.

Within the "Expected" column I would like to input Pass or Fail depending on a few conditions. If the mileage is less than 100000 and the service is Yes, then it passes; otherwise it's a fail.

This is what I have so far

df["Expected"]  = df.loc[(df['Mileage']< 100000) | (df['Service'] == 'Yes', "Pass", "Fail")]

It is bringing up the error message

ValueError: operands could not be broadcast together with shapes (500,) (3,)

I have filled the other columns with 500 lines of data. But I am not sure what the 3 relates to. Possibly the Yes, Pass, Fail values.

I also tried df['Expected'] = np.where(df ["Mileage"] < 132352, ['Service'] == "Yes",'Pass','Fail') which kind of worked.

Am I on the wrong track?

Any help or pointers would be appreciated.

user1558604 · Answer 1 · 2019-12-10T15:33:55.410

1

I'd create a function that takes a pd.Series object as its only argument and returns the value for that cell. Then use df.apply(lambda row: your_function(row), axis=1). So:

def your_function(row):
    if row["Mileage"] < 132352 and row["Service"] == "Yes":  # fill in your other conditions here
        return "Pass"
    else:
        return "Fail"

df["Expected"] = df.apply(lambda row: your_function(row), axis=1)
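For reference, here is a minimal, self-contained sketch of the row-wise approach above. The four sample rows are made up for illustration (the real frame has 500 rows filled with np.random), and the threshold of 100000 is taken from the question rather than the 132352 used in the snippet above.

```python
import pandas as pd

# Hypothetical sample data, only for illustration.
df = pd.DataFrame({
    "Mileage": [43000, 150000, 99999, 20000],
    "Service": ["Yes", "Yes", "No", "Yes"],
})

def your_function(row):
    # row is a pd.Series holding a single row of df, so index it by
    # column name; referring to df["Service"] here would give the whole column.
    if row["Mileage"] < 100000 and row["Service"] == "Yes":
        return "Pass"
    return "Fail"

# axis=1 passes each row (not each column) to the function.
df["Expected"] = df.apply(your_function, axis=1)
print(df)
```

Passing the function directly to df.apply behaves the same as wrapping it in a lambda.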

edited Dec 10 '19 at 15:33

answered Dec 10 '19 at 14:14

On the last line, pd.apply raises "AttributeError: module 'pandas' has no attribute 'apply'", so I changed it to df.apply, which then raises "ValueError: ('The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().', 'occurred at index 1')". I then used `def fun(row): if row["Mileage"] <100000 and df['Service'].any(): return "Pass" else: return "Fail" df["Expected"] = df.apply(lambda row: fun(row), axis=1)` Thank you. – bexi Dec 10 '19 at 14:57
Yes, sorry, it should be `df.apply`. I'm confused about what you are doing with the `.any()`. Could you explain that a bit more? – user1558604 Dec 10 '19 at 15:00
Would it make sense to do `df["service"] = "Yes"`? – user1558604 Dec 10 '19 at 15:02
If I use just `def fun(row): if row["Mileage"] <100000 and df['Service']:` I get the error message "ValueError: ('The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().', 'occurred at index 2')". So I added `.any()` after I looked up https://stackoverflow.com/questions/53830081/python-pandas-the-truth-value-of-a-series-is-ambiguous and came across this: "You are comparing two pd.Series, or a pd.Series with a value, so you might have multiple True and multiple False values, you have to do instead: `if (data == ask_minute['lastUpdated']).any()`" – bexi Dec 10 '19 at 15:03
Oh, sorry. Use `row["Service"]`. `row` in your function is a pd.Series holding one row of your df. – user1558604 Dec 10 '19 at 15:29
Works perfectly. Thank you for all the time you put into this. – bexi Dec 10 '19 at 17:11

Engels Leonhardt · Answer 2 · 2019-12-10T15:53:58.723

1

You could simply fill the Expected column with 'Fail':

df['Expected'] = 'Fail'

And then:

df.at[df[(df['Mileage']<100000) & (df['Service'] == 'Yes')].index,'Expected'] = 'Pass'
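As a hedged alternative, the same fill-then-overwrite idea can be written with a boolean mask and `.loc`. This is a sketch rather than the answer's original code: `.at` is documented for single-value access, so passing it an index of several labels may not be accepted by every pandas version. The sample rows and the names `passed` and `Expected_where` are made up for illustration. The last line shows that `np.where`, which the question also tried, yields the same labels when given one combined condition.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data, only for illustration.
df = pd.DataFrame({
    "Mileage": [43000, 150000, 99999, 20000],
    "Service": ["Yes", "Yes", "No", "Yes"],
})

# Both conditions must hold, so combine them with & (not |).
# The parentheses are required because & binds tighter than < and ==.
passed = (df["Mileage"] < 100000) & (df["Service"] == "Yes")

# Default every row to 'Fail', then overwrite the rows where the mask is True.
df["Expected"] = "Fail"
df.loc[passed, "Expected"] = "Pass"

# Equivalent one-step version: np.where(condition, value_if_true, value_if_false).
df["Expected_where"] = np.where(passed, "Pass", "Fail")
print(df)
```

Both columns end up identical; the mask-based versions avoid calling a Python function once per row.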

edited Dec 10 '19 at 15:53

answered Dec 10 '19 at 14:18

Returns Fail for every cell. – bexi Dec 10 '19 at 15:29
Are you sure there are rows in your dataframe that meet the requirements? This should work just fine. – Engels Leonhardt Dec 10 '19 at 15:36
Yes. I'm looking at one that has 43000 and Yes for a service, but it's still a fail. – bexi Dec 10 '19 at 15:39
I have managed to find the error in my code. It should work now. – Engels Leonhardt Dec 10 '19 at 15:54
Works perfectly. Thank you for taking the time to go through this. – bexi Dec 10 '19 at 17:11
