I am getting the following error from my pandas DataFrame, seemingly because a few records occasionally contain null values:

Cannot convert non-finite values (NA or inf) to integer

How can I write a handler in Python/pandas that converts these occasional N/A values to 0 when they appear, so that my script can continue?


Below is my code, including an attempt at using fillna(). Adding that call removes the 'Cannot convert non-finite values...' error.

However, the DataFrame output still shows NaT for those occasional records.

        for row in excel_data.itertuples():
            mrn = row.MRN

            if mrn in ("", " ", "N/A", None) or math.isnan(mrn):
                print(f"Invalid record: {row}")
                excel_data = excel_data.drop(excel_data.index[row.Index])
                excel_data = excel_data.fillna(0) # attempt
                continue
            else:
                num_valid_records += 1

        print(f"Processing #{num_valid_records} records")

        return self.clean_data_frame(excel_data)
Dr Upvote
  • Looking for `df.fillna(0)` ? – mad_ Mar 15 '19 at 17:11
  • You could drop the NA rows, you could find them with `isnan()` and replace them, you could use `np.nan_to_num`, you could... You get the point. Did you research this? – roganjosh Mar 15 '19 at 17:11
  • @roganjosh yes; I would like to find them and replace them with 0. – Dr Upvote Mar 15 '19 at 17:12
  • For that, you might want to look at `fillna()`; otherwise, you can create a reproducible example. Also take a look at [this](https://stackoverflow.com/questions/20109391/how-to-make-good-reproducible-pandas-examples) – anky Mar 15 '19 at 17:15

1 Answer

Here is an example of using fillna():

import numpy as np
import pandas as pd

df = pd.DataFrame([[1, 2, np.nan],
                   [5, np.nan, 7]],
                  columns=list('ABC'))
df

       A    B    C
    0  1  2.0  NaN
    1  5  NaN  7.0

df.fillna(0)

       A    B    C
    0  1  2.0  0.0
    1  5  0.0  7.0
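
For the "Cannot convert non-finite values (NA or inf) to integer" message specifically, a minimal sketch is to fill the gaps before the integer conversion. The column name `MRN` comes from the question; the values here are made up for illustration, and this assumes the error comes from an `astype(int)` call somewhere in the cleaning step.

```python
import numpy as np
import pandas as pd

# Hypothetical data: a numeric column with one missing value.
df = pd.DataFrame({"MRN": [101.0, np.nan, 103.0]})

# Converting a column that still contains NaN to int raises the error,
# so replace the missing values with 0 first and convert afterwards.
df["MRN"] = df["MRN"].fillna(0).astype(int)

print(df["MRN"].tolist())
```

Filling first and converting second keeps the column as a plain integer dtype, with 0 standing in for the missing records.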

Nathaniel
  • Thanks, looks nice; however, my records show NaT, not NaN, in the pandas DataFrame output. – Dr Upvote Mar 15 '19 at 17:35
  • @No-Spex both mean the same thing. In a date column, a missing value is shown as `NaT` instead of `NaN`. – anky Mar 15 '19 at 17:37
  • OK, I tried your suggestion in my code (added to the question above). It removes the pandas 'cannot convert' error, but the records still show as NaT in the DataFrame output. – Dr Upvote Mar 15 '19 at 17:40
  • If you can provide an example data frame with dates in it that we can use to run your code on and see the error, that will make it easier to help. – Nathaniel Mar 15 '19 at 18:09
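
Without the original data it is hard to say where the remaining NaT values come from, but the points raised in the comments can be sketched as follows. This assumes a frame with a numeric `MRN` column and a datetime column; the name `visit_date` and all values are hypothetical. It shows that `isna()` flags NaN and NaT alike, that invalid rows can be dropped in one step instead of inside the loop, and that filling only the numeric columns leaves NaT in date columns untouched.

```python
import numpy as np
import pandas as pd

# Hypothetical frame: column names and values are made up for illustration.
df = pd.DataFrame({
    "MRN": [101.0, np.nan, 103.0],
    "visit_date": pd.to_datetime(["2019-03-01", None, "2019-03-15"]),
})

# isna() treats NaN (numeric) and NaT (datetime) as missing in the same way.
missing = df.isna()

# Drop rows with a missing MRN in one vectorized step,
# rather than calling drop() while iterating over the frame.
valid = df[df["MRN"].notna()].copy()

# Fill only the numeric columns with 0; datetime columns keep their NaT.
numeric_cols = df.select_dtypes(include="number").columns
filled = df.copy()
filled[numeric_cols] = filled[numeric_cols].fillna(0)

print(filled)
```

Whether NaT in a date column should be replaced at all, and with what, depends on what the downstream code expects; 0 is not a date, so leaving NaT in place or dropping those rows may be the safer choice.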