Encountering ValueError: The truth value of a DataFrame is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all()

Question

I have a function

def cruise_fun(speed, accl, acmx, dcmx):
    count = 0
    index = []
    for i in range(len(speed.dropna())):
        if ((speed[i]>40) & (accl[i]<acmx*0.2) & (accl[i]>dcmx*0.2)):
            count +=1
            index.append(i)
                    
    return count, index

This function is called in the following statement:

cruise_t_all, index_all = cruise_fun(all_data_speed[0], acc_val_all[0], acc_max_all, decc_max_all)

`all_data_speed` and `acc_val_all` are two DataFrames with 1 column and 38287 rows each. `acc_max_all` and `decc_max_all` are two float64 values. I have tried the solutions suggested on Stack Overflow as best I could, using both `and` and `&`, but I cannot get around the problem.

Does this answer your question? [Truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all()](https://stackoverflow.com/questions/36921951/truth-value-of-a-series-is-ambiguous-use-a-empty-a-bool-a-item-a-any-o) – qmeeus, May 10 '21 at 13:14
@qmeeus The part where it is advised to use `&` instead of `and` did not work for me; I have tried it. – Tamoghna Bhattacharya, May 10 '21 at 13:48

qmeeus · Accepted Answer · 2021-05-10T15:11:14.310


You are not using pandas the way it is meant to be used: you should not loop over the rows one by one. Instead, concatenate the two columns and check the conditions on the whole columns at once:

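import pandas as pd

def cruise_fun(speed, accl, acmx, dcmx):
    # Put both columns in one DataFrame; rows are matched on their index labels
    df = pd.concat([speed.dropna(), accl], axis=1)
    df.columns = ["speed", "accl"]
    # inclusive="neither" (pandas >= 1.3) keeps both bounds strict; older versions used inclusive=False
    mask = (df["speed"] > 40) & df["accl"].between(dcmx * .2, acmx * .2, inclusive="neither")
    return mask.sum(), df[mask].index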

NB: I am making a few assumptions:

- Your column names do not conflict; otherwise the concat will not work and you will need to rename your columns first.
- The indices of speed.dropna() and accl match, although I would not be surprised if they do not. Make sure this holds or, better, store everything in the same DataFrame.
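
The index assumption above can be checked directly. The following is a minimal sketch with made-up data; the values and the names `same_index`, `same_after_dropna`, and `df` are hypothetical and not taken from the question. It compares the index labels and shows the alternative of keeping both columns in one DataFrame before dropping rows.

```python
import pandas as pd

# Hypothetical stand-ins for the two columns in the question
speed = pd.Series([35.0, 45.0, None, 50.0], name="speed")
accl = pd.Series([0.1, 0.2, 0.3, 0.9], name="accl")

# True only when both objects carry the same index labels in the same order
same_index = speed.index.equals(accl.index)

# After dropna() the speed column has lost a label, so the indices differ
same_after_dropna = speed.dropna().index.equals(accl.index)

# Building one DataFrame first, then dropping rows with a missing speed,
# keeps each speed value next to its own acceleration value
df = pd.DataFrame({"speed": speed, "accl": accl}).dropna(subset=["speed"])
```

Keeping everything in one DataFrame means a single `dropna` removes the same rows from both columns, so no later alignment step is needed.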

edited May 10 '21 at 15:11

answered May 10 '21 at 14:14

qmeeus


Tried it, the same problem is happening. Using logical and throws `Can only compare identically-labeled Series objects` error. – Tamoghna Bhattacharya May 10 '21 at 14:23
what did you try exactly? have you read the remarks at the end of my post? can you also post the exact error message, including the line of code that fails? I assumed it's the if statement but I start to doubt it... – qmeeus May 10 '21 at 14:27
There is no conflict in the column names as such; `concat` works just fine. The indices do match: both have 38287 rows. I tried the function the way you suggested and encountered the same error as before. The error message is `ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().` It occurs on this line: `mask = ((df["speed"] > 40.0) and (decc_max_all * .2 < df["accl"] < acc_max_all * .2))`. Edit: I am seeing the edited code just now. It did not work either; same error. – Tamoghna Bhattacharya May 10 '21 at 14:41
Now I am getting `Can only compare identically-labeled Series objects`. I am positive that labels are identical. I had already tried logical `and`. – Tamoghna Bhattacharya May 10 '21 at 15:33
The curious thing is, the function `cruise_fun` works just fine with other data of similar kind. Both the previous one and the alternative suggested. Only this specific data set is troubling me. But, then again, the data is fine, it works perfectly with other kinds of calculations. – Tamoghna Bhattacharya May 10 '21 at 15:42
can you share a small dataset for which it does not work? – qmeeus May 10 '21 at 18:15
I have solved the issue. It turns out that the way I calculated `acc_max_all` and `decc_max_all` produced Series objects rather than floats, so the comparison raised the error. – Tamoghna Bhattacharya May 11 '21 at 11:35
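
The resolution in the last comment can be illustrated with a small sketch. The thread does not show how `acc_max_all` was computed, so calling `.max()` on a one-column DataFrame is only an assumed way of ending up with a Series; the data and the helper `raises_value_error` are hypothetical as well.

```python
import pandas as pd

accl = pd.Series([0.1, 0.5, -0.3])

# Assumed reproduction: .max() on a DataFrame returns a Series (one value per column)
acc_frame = pd.DataFrame({0: [1.0, 2.0, 1.5]})
acmx_series = acc_frame.max()             # Series of length 1, not a float
acmx_scalar = float(acc_frame[0].max())   # plain Python float

def raises_value_error(threshold):
    """Return True if using the threshold inside an `if` raises ValueError."""
    value = accl.iloc[0]
    try:
        # With a Series threshold the comparison yields a Series,
        # and `if` cannot reduce a Series to a single True/False
        if threshold * 0.2 > value:
            pass
    except ValueError:
        return True
    return False

series_fails = raises_value_error(acmx_series)
scalar_fails = raises_value_error(acmx_scalar)

# A one-element Series can be turned into a scalar with .item()
acmx_fixed = acmx_series.item()
```

Checking `isinstance(acc_max_all, pd.Series)` before calling `cruise_fun` is a quick way to catch this kind of input problem.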
