One of my pandas columns stores lists.

For example, [url.com, url2.com, url3.com]

Here is a sample printout of the column:

                                       associated_Urls
322  [http://www.hotfrog.ie/business/golf-gifts-ire...
466                    [http://en.netlog.com/A_ni_nha]
433  [https://www.moog.com.cn/literature/ICD/Moog_G...
13   [http://www.schooldays.ie/thread/Childminder-w...
438  [http://tracking.instantcheckmate.com/?a=60&c=...
308  [http://www.wayn.com/profiles/abc123, https://...
361  [https://whoswholegal.com/profiles/abcdef........

In an apply function I check whether the value in each row is null, using:


import pandas as pd

def myfunc(row):
    if pd.notnull(row['associated_Urls']):
        pass  # do something

df.apply(myfunc, axis=1)

However I get the following error:

if pd.notnull(row['associated_Urls']):

ValueError: ('The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()', 'occurred at index 322')

I checked the row at index 322, and it is not null. The cell contains a list of URLs.
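The error seems to come from how `pd.notnull` treats list-like input rather than from the data itself. A minimal sketch, assuming made-up placeholder URLs in place of the real column values: on a list the call returns one boolean per element, and putting that array in an `if` raises the `ValueError`.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for one cell of the column
urls = ["http://a.example", "http://b.example"]

# On a list, pd.notnull checks every element and returns a boolean array
elementwise = pd.notnull(urls)
print(type(elementwise).__name__, elementwise)

# On a missing scalar it returns a plain Python bool
scalar = pd.notnull(np.nan)
print(type(scalar).__name__, scalar)

# A multi-element array has no single truth value, so `if` raises
try:
    if elementwise:
        pass
except ValueError as exc:
    error_message = str(exc)
    print(error_message)
```

A cell holding a single-element list would not raise, which may explain why only some rows trigger the error.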

What is the best way to check if this particular cell is null?

According to this question it was fixed: Weird null checking behaviour by pd.notnull

Yet I still get the error. Any advice is appreciated.

SCool
  • I don't think the problem is with `pd.notnull` but more with the `if` statement with an array, try to add `all` at the end: `if pd.notnull(row['associated_Urls']).all():` – Ben.T Feb 07 '20 at 17:31
  • I got the error: `if pd.notnull(row['associated_Urls']).all(): AttributeError: ("'bool' object has no attribute 'all'", 'occurred at index 49')`. I guess this is because this cell actually was completely `null`, there is not even an empty list there. And `.all()` doesn't work on `True` ? – SCool Feb 07 '20 at 18:01
  • Yes, you are right. Just to be sure: you are not looking for lists that contain nulls, only for rows where the value is null instead of a list? – Ben.T Feb 07 '20 at 18:28
  • Yes, that's correct. I am just checking whether the particular cell is null or not. It doesn't matter if there's a list or dictionary inside or anything. I just want to know if this particular `row[column]` location is `np.nan`. – SCool Feb 07 '20 at 18:30
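One way to express the check described in these comments is to test only scalar cells for missingness and treat any container as present. This is a sketch under assumptions: `cell_is_missing` and the sample frame are hypothetical names and data, not part of the original thread.

```python
import numpy as np
import pandas as pd

def cell_is_missing(value):
    """Return True only when the cell itself is a missing scalar (NaN/None)."""
    # Lists, dicts and arrays are not scalars, so they count as present
    return pd.api.types.is_scalar(value) and bool(pd.isna(value))

# Hypothetical sample data: two rows with lists, one row with NaN
df = pd.DataFrame({
    "associated_Urls": [
        ["http://a.example"],
        np.nan,
        ["http://b.example", "http://c.example"],
    ]
})

def myfunc(row):
    urls = row["associated_Urls"]
    if not cell_is_missing(urls):
        return len(urls)  # placeholder for "do something"
    return 0

counts = df.apply(myfunc, axis=1)
print(counts.tolist())
```

Because the helper never calls `pd.isna` on a list, the result is always a single bool and is safe to use in an `if`.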

1 Answer

Why don’t you use:

notnull_mask = pd.notnull(df['associated_Urls'])
df.loc[notnull_mask, :] = df.loc[notnull_mask, :].apply(some_func)
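A runnable sketch of the mask idea, assuming a small made-up frame and a hypothetical `count_urls` function standing in for `some_func`. Calling `notna` on the whole column tests each cell as a single object, so a list counts as not null and no ambiguous array is produced. Note that `DataFrame.apply` works column-wise unless `axis=1` is passed, so this sketch applies the function to the one column instead.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: two rows with lists, one row with NaN
df = pd.DataFrame({
    "associated_Urls": [
        ["http://a.example"],
        np.nan,
        ["http://b.example", "http://c.example"],
    ]
})

# One boolean per row: True where the cell holds anything other than NaN/None
notnull_mask = df["associated_Urls"].notna()

def count_urls(urls):
    # Hypothetical stand-in for some_func
    return len(urls)

# Apply the function only to the non-null rows; the others keep the default
df["url_count"] = 0
df.loc[notnull_mask, "url_count"] = (
    df.loc[notnull_mask, "associated_Urls"].apply(count_urls)
)
print(df["url_count"].tolist())
```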
theletz