Trying to determine if npy.nan is present in a pandas.Series

1. The code I've created to replicate and test what I'm trying to accomplish:

import numpy as npy
import pandas as pnd

ser = pnd.Series(['1', None, 2, npy.nan], index=['2001', '2002', '2003', '2004'])
serTest = ser.isin([npy.nan]) == True
  • Value assigned to serTest

    2001 False
    2002 False
    2003 False
    2004 True

2. Code that is showing inconsistent behavior:

(The data comes from a CSV file from The World Bank.)

I'm trying to find the values read from a CSV file that are npy.nan. To check the type of the relevant cells and to troubleshoot the problem, I'm using the following code, where data is a pandas Series containing the index (year) and a float (gross domestic product):

for flt in data:
    print('is nan {}'.format(npy.isnan(flt)))
  • slice of data

    2006 3.89552e+11
    2007 4.25065e+11
    ...
    2014 4.63903e+11
    2015 NaN

For the cell in question (the 2015 GDP), the code prints the following, which is what I expect:

is nan True

However, when I try to build a boolean Series with isin, as in the replicated code in item 1 above, I get:

2006 False  
2007 False  
... 
2014 False 
2015 False

Here 2015 should be True, given the NaN value for 2015 in the slice of data.

A final comment: pandas assigns NaN to the cell automatically when it reads the file into a pandas.DataFrame. After getting the inconsistent results, I also tried to isolate the problem by assigning npy.nan to the cell in question directly through the DataFrame. The results were the same as those just described.

Please help. :-)

  • `df['that_column'].isnull()` ? – Bharath M Shetty Dec 10 '18 at 02:52
  • Agree with @Dark. But `data.isin([np.nan]) == True` works fine for me. – jpp Dec 10 '18 at 02:53
  • @LendrickRobinson, you should provide some data which exhibits your problem, i.e. a **[mcve]**. As it stands, we are just guessing why your code might not work. – jpp Dec 10 '18 at 02:55
  • @Lendrick Robinson make sure the datatype of that NaN is float. If the `NaN` present in the dataframe is of type string then `isnull` will fail to detect that. – Bharath M Shetty Dec 10 '18 at 02:58
  • @Dark and @jpp, I modified the for loop above. The results show that all values are floats: **`is nan True and data type `** I also tried `df['that column'].isnull()`. The results are below, where index 12 is the cell I'm concerned with. It returned True: `0 False` `1 False` `2 False` `...` `12 True` `13 False` `14 False` – Lendrick Robinson Dec 10 '18 at 04:58
  • @Dark and @jpp, I figured out the issue. The series was cast as an object. I had to convert it to a float, using `data.astype(np.float64)`. I didn't bother to look at the type of the series; I was focusing on the individual elements. That is where my confusion was. I also referenced [convert to float64](https://stackoverflow.com/questions/28277137/how-to-convert-datatypeobject-to-float64-in-python). Thanks for your clues without my having a Minimal, Complete, and Verifiable example, great intuition. I'm new to Python and this is my first post. I'll get better. :-) – Lendrick Robinson Dec 11 '18 at 00:38
  • @Dark and @jpp, in addition to my last comment: instead of using `data.isin([np.nan])==True`, I dropped the for loop above that was used for testing and tried `np.isnan(data)`. It iterated over the series and returned a series with a boolean of `True` for the cell with which I was concerned. Again, thanks a bunch for the clues. – Lendrick Robinson Dec 11 '18 at 02:29
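
The resolution described in the comments can be sketched roughly as follows. This is a minimal illustration, not the asker's actual code: the Series below is a hypothetical stand-in for the GDP column (three values copied from the slice shown in the question, stored with dtype object), and it uses the conventional `np`/`pd` aliases from the comments rather than `npy`/`pnd`.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the GDP column: float values held in an
# object-dtype Series, which is the situation described in the comments.
data = pd.Series(
    [3.89552e11, 4.25065e11, 4.63903e11, float("nan")],
    index=["2006", "2007", "2014", "2015"],
    dtype=object,
)

# Check the dtype of the Series itself, not just the individual elements.
print(data.dtype)

# isna() (alias: isnull()) tests each element, so it also works on object dtype.
mask_object = data.isna()

# Converting to float64, as in the comments, lets np.isnan run on the
# whole Series at once.
as_float = data.astype(np.float64)
mask_float = np.isnan(as_float)

print(mask_float)
```

Both masks flag only the 2015 entry. Checking `data.dtype` first is the step that would have exposed the object cast before any element-by-element testing.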
