I am looping through some data, and if the column doesn't contain any NaNs I want to merge this with my master df. But for some reason, .isna().any() doesn't work in the loop, only when I look at the column separately.

My code currently is:

if df[Stock].isna().any() is False:
    total_df = total_df.merge(df[['Date', Stock]], on='Date', how='left')
else:
    pass

As far as I can tell, this should filter out any columns containing NaNs. However, it doesn't. It appears to do nothing, because once the loop has finished my df contains columns of NaNs as well as the columns I actually want. I have also tried ==True, but to no avail. When I check a column that I know for a fact has NaNs in it using print(df[Stock].isna().any()), the program quite rightly returns True. So my question is: why won't this work in a loop?

I've been staring at this for hours now, and it's possible I'm doing something incredibly stupid, so a fresh pair of eyes might be what's needed, but I'm really stumped. Cheers

EDIT:

So for what it's worth, when I take the sum of each column with NaNs in it, it returns an actual number. So it appears pandas isn't recognising what are quite clearly NaNs when I check the df manually. However, I have to run pd.to_numeric on my df data before performing the loop, otherwise I can't do any maths on the price data at a later stage. Is it possible this is affecting things?
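
On the sum returning a number: that on its own probably doesn't show that pandas is missing the NaNs, since `Series.sum()` skips NaN by default (`skipna=True`). A minimal sketch, assuming a made-up price column (`prices`, `raw` and their values are hypothetical, not taken from the real data):

```python
import numpy as np
import pandas as pd

# Hypothetical price column with one missing value
prices = pd.Series([10.0, np.nan, 12.5], name="AAA")

total = prices.sum()                     # NaN is skipped by default (skipna=True)
strict_total = prices.sum(skipna=False)  # NaN propagates into the result
has_nan = prices.isna().any()            # still reports the missing value

# pd.to_numeric with errors="coerce" turns unparseable strings into NaN
raw = pd.Series(["10.0", "n/a", "12.5"])
converted = pd.to_numeric(raw, errors="coerce")

print(total, strict_total, has_nan, converted.isna().sum())
```

So a numeric sum and a True from `.isna().any()` can both hold for the same column.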

top bantz
  • What is `total_df` before the loop? – Valentino Jun 16 '19 at 17:14
  • Apologies, `total_df` begins as an empty `df` with just a `Date` column, then I loop through to populate it – top bantz Jun 16 '19 at 17:16
  • Actually, if you want to merge columns which do **not** contain any `NaN` value, you should look for columns where `if df[Stock].isna().any() is False`. Or `if not df[Stock].isna().any()`. – Valentino Jun 16 '19 at 17:24
  • Sorry, yes, I've completely ballsed this question up. I have been trying `is False`; I changed it to True to see how it responded and mistakenly copy/pasted that. Will edit the question. – top bantz Jun 16 '19 at 17:27

2 Answers

The error is the use of `is`.

if df[Stock].isna().any() is False:

This always evaluates to False, so nothing is merged. But if you use:

if not df[Stock].isna().any():

or

if df[Stock].isna().any() == False:

it works.

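The reason is detailed in this post: `is` is not equivalent to `==`. `is` checks whether two names refer to the same object, whereas `==` compares values. The result of `.any()` here is a NumPy boolean, which equals `False` but is not the same object as Python's built-in `False`.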
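
A minimal sketch of the difference, using `np.bool_(False)` as a stand-in for what `df[Stock].isna().any()` returns on a column with no NaNs (the variable names are illustrative only):

```python
import numpy as np

# Stand-in for the result of df[Stock].isna().any() on a NaN-free column
flag = np.bool_(False)

identity_check = flag is False   # identity: a NumPy boolean is not Python's False object
equality_check = flag == False   # equality: compares the values
truthiness_check = not flag      # truthiness: the idiomatic form

print(type(flag), identity_check, equality_check, truthiness_check)
```

The `not ...` form is usually preferable, because it doesn't depend on which boolean type the library returns.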

Valentino

Without your code I can't test it, but try putting this code inside a for loop. Not sure if ['Date', 'column'] should be with or without quotes, but try both.

for column in df:
    if df[column].isna().any() is False:
        total_df = total_df.merge(df[['Date', 'column']], on='Date', how='left')
  • Thanks for the suggestion mate, but that adds nothing to the total_df at all now. I think I'm just going to have to stick all the data in and then use `dropna` with `axis=1`, although I'd love to know why a simple condition like that doesn't work. – top bantz Jun 16 '19 at 17:35
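
For reference, a hedged sketch of the loop with a truthiness check in place of `is False`, alongside the `dropna(axis=1)` route mentioned in the comment. The frame and the column names (`AAA`, `BBB`) are invented for illustration, and `total_df` is assumed to start as just the `Date` column, as described in the comments on the question:

```python
import numpy as np
import pandas as pd

# Hypothetical data: "AAA" is complete, "BBB" has a gap
df = pd.DataFrame({
    "Date": pd.to_datetime(["2019-06-12", "2019-06-13", "2019-06-14"]),
    "AAA": [10.0, 11.0, 12.5],
    "BBB": [5.0, np.nan, 6.0],
})

# Start from just the Date column
total_df = df[["Date"]].copy()

for stock in df.columns.drop("Date"):
    # Truthiness check; `is False` would never match a NumPy boolean
    if not df[stock].isna().any():
        # `stock` is the variable, not the literal string 'stock'
        total_df = total_df.merge(df[["Date", stock]], on="Date", how="left")

# Alternative from the comment: drop every column that contains a NaN
dropped = df.dropna(axis=1)

print(list(total_df.columns), list(dropped.columns))
```

On this toy data both routes keep `Date` and `AAA` and leave out `BBB`.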