Why can I use 'if pd.isnull():' within a function inside df.apply but not otherwise?

Asked Oct 20 '22 at 15:12

I have a dataframe:

import pandas as pd

d = {"a": 1, "b": None}
df = pd.DataFrame([d])

I also have a function that I want to apply to the DataFrame:

def check(df):
    a_val = df["a"]
    b_val = df["b"]
    if pd.isnull(a_val) or pd.isnull(b_val):
        print(123)

Now if I run

check(df)

I get: ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().

But if I run

df.apply(lambda x: check(x), axis = 1)

it works fine and prints 123. Why?

Steve Ahlswede

The difference is that, when you pass the whole dataframe, `df["a"]` is every row of that column, which is a Series. With `df.apply(..., axis=1)` the function is called once per row, so `df["a"]` is a scalar (the value in that column for that particular row). Your example is extra confusing because the dataframe has a single row, but a single-row Series is still a Series. – JNevill Oct 20 '22 at 15:16
When you call `check(df)`, `a_val` in `pd.isnull(a_val)` is a Series. When you call `df.apply(lambda x: check(x), axis = 1)`, `x` is a row and `a_val` is a value. – Ynjxsjmh Oct 20 '22 at 15:17
Check *The alternatives mentioned in the Exception are more suited if you encountered it when doing if or while.* part. – Ynjxsjmh Oct 20 '22 at 15:22
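
To make the comments above concrete, here is a minimal sketch of the two cases, using the dataframe from the question. The helper names `check_frame` and `check_row` are hypothetical and not part of the original post; the reduction with `.any()` is one of the alternatives the error message itself suggests, and whether `.any()` or `.all()` is right depends on what the check is meant to do.

```python
import pandas as pd

df = pd.DataFrame([{"a": 1, "b": None}])

# Whole-frame call: df["a"] is a Series, so pd.isnull returns a boolean Series.
mask = pd.isnull(df["a"])
print(type(mask).__name__)  # Series

# `if` / `or` need a single True/False, and a Series refuses to guess,
# even when it holds only one element.
try:
    bool(mask)
    raised = False
except ValueError:
    raised = True

# Row-wise call: with axis=1 each row arrives as a Series, so row["a"]
# is a scalar and pd.isnull returns a single boolean.
def check_row(row):
    # Hypothetical helper: True if either value in this row is null.
    return bool(pd.isnull(row["a"]) or pd.isnull(row["b"]))

row_flags = df.apply(check_row, axis=1)

# Whole-frame version (assumption: "any null in either column" is the goal).
def check_frame(frame):
    # Hypothetical helper: reduce each boolean Series to one value first.
    return bool(frame["a"].isnull().any() or frame["b"].isnull().any())

frame_flag = check_frame(df)
```

Here `row_flags` holds one boolean per row and `frame_flag` is a single boolean for the whole dataframe. If a per-row result is wanted without `apply`, `df[["a", "b"]].isnull().any(axis=1)` should give the same flags in vectorized form.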
