ValueError when checking using pd.notnull() on row[column]. Alternative null check?

Question

In one of my pandas columns, I have stored lists.

For example, [url.com, url2.com, url3.com]

Here is a sample printout of the column:

                                       associated_Urls
322  [http://www.hotfrog.ie/business/golf-gifts-ire...
466                    [http://en.netlog.com/A_ni_nha]
433  [https://www.moog.com.cn/literature/ICD/Moog_G...
13   [http://www.schooldays.ie/thread/Childminder-w...
438  [http://tracking.instantcheckmate.com/?a=60&c=...
308  [http://www.wayn.com/profiles/abc123, https://...
361  [https://whoswholegal.com/profiles/abcdef........

In an apply function I check whether the cell in each row is null, using:


import pandas as pd

def myfunc(row):
    if pd.notnull(row['associated_Urls']):
        pass  # do something

df.apply(myfunc, axis=1)

However I get the following error:

if pd.notnull(row['associated_Urls']):

ValueError: ('The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()', 'occurred at index 322')

I checked the row at index 322, and it is not null. The cell contains a list of URLs.
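The error can be reproduced outside of `apply`. The sketch below uses a made-up two-URL list (the `example` URLs are placeholders, not values from my data) to show that `pd.notnull` returns one boolean per list element, which an `if` statement cannot reduce to a single truth value, while a scalar `NaN` gives a plain `bool`:

```python
import numpy as np
import pandas as pd

# Hypothetical cell value: a list with more than one URL.
urls = ["http://a.example", "http://b.example"]

# For a list, pd.notnull works element-wise and returns a boolean array.
result = pd.notnull(urls)
print(type(result).__name__, result.tolist())

try:
    if result:  # a 2-element array has no single truth value
        pass
except ValueError as exc:
    print("ValueError:", exc)

# For a scalar, pd.notnull returns a plain bool instead of an array.
scalar_result = pd.notnull(np.nan)
print(type(scalar_result).__name__, scalar_result)
```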

What is the best way to check if this particular cell is null?

According to this question it was fixed: Weird null checking behaviour by pd.notnull

Yet I still get the error. Any advice is appreciated.

I don't think the problem is with `pd.notnull` but more with the `if` statement with an array, try to add `all` at the end: `if pd.notnull(row['associated_Urls']).all():` — Ben.T, Feb 07 '20 at 17:31
I got the error: `if pd.notnull(row['associated_Urls']).all(): AttributeError: ("'bool' object has no attribute 'all'", 'occurred at index 49')`. I guess this is because this cell actually was completely `null`, there is not even an empty list there. And `.all()` doesn't work on `True` ? — SCool, Feb 07 '20 at 18:01
Yes you are right. To be sure, you don't look for list containing null, just rows with null as value instead of a list? — Ben.T, Feb 07 '20 at 18:28
Yes that's correct. I am just checking if the particular cell is null or not. It doesn't matter if there's a list or dictionary inside or anything. I just want to know if this particular `row[column] location == np.nan.` — SCool, Feb 07 '20 at 18:30
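One way to express the check described in these comments is a small helper that treats any list-like cell as "not null" and only tests scalars. This is a minimal sketch, not something proposed in the thread: `cell_is_null` is a hypothetical name, the sample frame is invented, and it assumes that an empty list or a list containing `NaN` should still count as a non-null cell.

```python
import numpy as np
import pandas as pd

def cell_is_null(value):
    """Return True only when the cell itself is a missing scalar (None/NaN)."""
    # Lists, dicts, arrays and other containers are never treated as null,
    # even when they are empty or hold NaN entries.
    if pd.api.types.is_list_like(value):
        return False
    return bool(pd.isnull(value))

# Invented sample data mirroring the shape of the column in the question.
df = pd.DataFrame({
    "associated_Urls": [
        ["http://a.example", "http://b.example"],
        np.nan,
        ["http://c.example"],
    ]
})

flags = df.apply(lambda row: cell_is_null(row["associated_Urls"]), axis=1)
print(flags.tolist())  # [False, True, False]
```

Because the helper always returns a plain `bool`, it can be used directly in an `if` statement inside a row-wise `apply`.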

theletz · Answer 1 · 2020-02-07T19:02:51.783

0

Why don’t you use:

notnull_mask = pd.notnull(df['associated_Urls'])
df.loc[notnull_mask, :] = df.loc[notnull_mask, :].apply(some_func, axis=1)
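As a rough illustration of the mask idea, the sketch below builds the mask on the whole column, where each list counts as a single non-null object, and then works only on the selected rows. The sample data, the index labels, and the use of `len` as the per-cell function are assumptions made for the example:

```python
import numpy as np
import pandas as pd

# Invented sample data; the index labels are arbitrary.
df = pd.DataFrame(
    {
        "associated_Urls": [
            ["http://a.example", "http://b.example"],
            np.nan,
            ["http://c.example"],
        ]
    },
    index=[322, 49, 466],
)

# Applied to the whole column, the check yields one boolean per cell,
# so a list of any length is simply "not null".
notnull_mask = pd.notnull(df["associated_Urls"])
print(notnull_mask.tolist())  # [True, False, True]

# Hypothetical per-cell work: count the URLs in each non-null cell.
url_counts = df.loc[notnull_mask, "associated_Urls"].apply(len)
print(url_counts.to_dict())  # {322: 2, 466: 1}
```

Filtering first means the function never sees a null cell, so no null check is needed inside it.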

edited Feb 07 '20 at 19:02

answered Feb 07 '20 at 18:57
