
I'm new to Pandas. I downloaded and installed Anaconda. Then I tried running the following code via the Spyder app:

import pandas as pd
import numpy as np

train = pd.read_csv('/Users/Ben/Documents/Kaggle/Titanic/train.csv')
train

Although this prints the DataFrame as I expected, it also shows these warnings:

//anaconda/lib/python3.4/site-packages/pandas/core/format.py:1969: RuntimeWarning: invalid value encountered in greater
  has_large_values = (abs_vals > 1e8).any()
//anaconda/lib/python3.4/site-packages/pandas/core/format.py:1970: RuntimeWarning: invalid value encountered in less
  has_small_values = ((abs_vals < 10 ** (-self.digits)) &
//anaconda/lib/python3.4/site-packages/pandas/core/format.py:1971: RuntimeWarning: invalid value encountered in greater
  (abs_vals > 0)).any()

Why am I getting these warnings?

EDIT: I just tested the above code in an IPython notebook and it works without errors. So, is there something wrong with my Spyder installation? Any help would be appreciated.

EDIT2: After some testing, I can read the first 5 rows of the CSV without getting the warning. So, I suspect a NaN in the 6th row for a float64 type column is triggering the warning.

cchamberlain
Ben
  • never seen this before but I use WinPython, could you try reinstalling anaconda – EdChum May 29 '15 at 07:48
  • @EdChum Reinstalled Anaconda and I'm still getting this error – Ben May 29 '15 at 13:33
  • For anyone interested, you can download the train.csv dataset [here](https://www.kaggle.com/c/titanic/data) – Ben May 29 '15 at 20:35
  • There's a discussion about this on GitHub [here](https://github.com/pydata/pandas/issues/9950). You can work around this issue by doing `pd.set_option('display.float_format', lambda x:'%f'%x)` – Ben Dec 19 '15 at 16:57
  • Can you provide a sample of the original dataset? Or link to it? – unique_beast Mar 24 '16 at 02:49
  • It is indeed the NaN values causing the error - see answer to similar question here: http://stackoverflow.com/questions/34955158/what-might-be-the-cause-of-invalid-value-encountered-in-less-equal-in-numpy – Melissa Nov 24 '16 at 11:59
  • I had a similar problem computing a percentile over data with NaN values. Changing it to `nanpercentile`, which ignores NaNs, made it work without the warning. – ihightower Nov 11 '19 at 07:16
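
To illustrate the NaN explanation from the comments without needing pandas, here is a minimal sketch in plain Python. The sample values are made up for illustration, and `has_large_values` only mirrors the name in the warning text; it is not pandas' actual implementation.

```python
import math

# Hypothetical values standing in for a float64 column with one missing entry.
values = [22.0, 38.0, float("nan"), 35.0]

# Any comparison with NaN evaluates to False, even NaN == NaN.
nan = float("nan")
nan_comparisons = (nan > 1e8, nan < 1e-6, nan == nan)

# Dropping NaN before comparing leaves only well-defined comparisons.
finite_values = [v for v in values if not math.isnan(v)]
has_large_values = any(abs(v) > 1e8 for v in finite_values)

print(nan_comparisons, finite_values, has_large_values)
```

NumPy's `nan*` functions, such as the `nanpercentile` mentioned above, apply the same idea of skipping NaN entries before computing.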

1 Answer


I get the same warning and have concluded that it is a bug. It seems to be triggered by NaN values in a DataFrame displayed in Spyder. I uninstalled and reinstalled all packages, and nothing affected it. NaN values are supported and completely valid in DataFrames, especially ones with a DateTime index.

In the end I settled for suppressing these warnings as follows:

import warnings
warnings.simplefilter(action="ignore", category=RuntimeWarning)
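
A global filter like this hides every `RuntimeWarning` for the rest of the session. If that is broader than you want, a narrower variant is to scope the filter with `warnings.catch_warnings()`. This is only a sketch: `noisy_computation` is a hypothetical stand-in for whatever code emits the warning, such as displaying a DataFrame that contains NaN.

```python
import warnings


def noisy_computation():
    # Hypothetical stand-in for code that emits a RuntimeWarning.
    warnings.warn("invalid value encountered in greater", RuntimeWarning)
    return "done"


# Ignore RuntimeWarning only inside this block; the previous filter
# settings are restored when the block exits.
with warnings.catch_warnings():
    warnings.simplefilter(action="ignore", category=RuntimeWarning)
    result = noisy_computation()

# For contrast, record the warning with the "always" filter to confirm
# that the function really does emit one.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    noisy_computation()

print(result, len(caught))
```

Scoping the filter keeps unrelated `RuntimeWarning`s visible elsewhere in the program.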
wadge