10

My dataframe contains both NaT and NaN values

    Date/Time_entry      Entry      Date/Time_exit       Exit   
0   2015-11-11 10:52:00  19.9900    2015-11-11 11:30:00  20.350 
1   2015-11-11 11:36:00  20.4300    2015-11-11 11:38:00  20.565 
2   2015-11-11 11:44:00  21.0000    NaT                  NaN        
3   2009-04-20 10:28:00  13.7788    2009-04-20 10:46:00  13.700

I want to fill NaT with dates and NaN with numbers. Fillna(4) method replaces both NaT and NaN with 4. Is it possible to differentiate between NaT and NaN somehow?

My current workaround is to df[column].fillna()

cs95
  • 379,657
  • 97
  • 704
  • 746
Biarys
  • 1,065
  • 1
  • 10
  • 22

2 Answers2

20

Since NaTs pertain to datetime columns, you can exclude them when applying your filling operation.

u = df.select_dtypes(exclude=['datetime'])
df[u.columns] = u.fillna(4)
df

      Date/Time_entry    Entry      Date/Time_exit    Exit
0 2015-11-11 10:52:00  19.9900 2015-11-11 11:30:00  20.350
1 2015-11-11 11:36:00  20.4300 2015-11-11 11:38:00  20.565
2 2015-11-11 11:44:00  21.0000                 NaT   4.000
3 2009-04-20 10:28:00  13.7788 2009-04-20 10:46:00  13.700

Similarly, to fill NaT values only, change "exclude" to "include" in the code above.

u = df.select_dtypes(include=['datetime'])
df[u.columns] = u.fillna(pd.to_datetime('today'))
df

      Date/Time_entry    Entry             Date/Time_exit    Exit
0 2015-11-11 10:52:00  19.9900 2015-11-11 11:30:00.000000  20.350
1 2015-11-11 11:36:00  20.4300 2015-11-11 11:38:00.000000  20.565
2 2015-11-11 11:44:00  21.0000 2019-02-17 16:11:09.407466   4.000
3 2009-04-20 10:28:00  13.7788 2009-04-20 10:46:00.000000  13.700
cs95
  • 379,657
  • 97
  • 704
  • 746
2

Try something like this, using pandas.DataFrame.select_dtypes:

>>> import pandas as pd, datetime, numpy as np
>>> df = pd.DataFrame({'a': [datetime.datetime.now(), np.nan], 'b': [5, np.nan], 'c': [1, 2]})
>>> df
                           a    b  c
0 2019-02-17 18:06:15.231557  5.0  1
1                        NaT  NaN  2
>>> fill_dt = datetime.datetime.now()
>>> fill_value = 4
>>> dt_filled_df = df.select_dtypes('datetime').fillna(fill_dt)
>>> dt_filled_df
                           a
0 2019-02-17 18:06:15.231557
1 2019-02-17 18:06:36.040404
>>> value_filled_df = df.select_dtypes('int').fillna(fill_value)
>>> value_filled_df
   c
0  1
1  2
>>> dt_filled_df.columns = [col + '_notnull' for col in dt_filled_df]
>>> value_filled_df.columns = [col + '_notnull' for col in value_filled_df]
>>> df = df.join(value_filled_df)
>>> df = df.join(dt_filled_df)
>>> df
                           a    b  c  c_notnull                  a_notnull
0 2019-02-17 18:06:15.231557  5.0  1          1 2019-02-17 18:06:15.231557
1                        NaT  NaN  2          2 2019-02-17 18:06:36.040404
blacksite
  • 12,086
  • 10
  • 64
  • 109