12

I have a column in my dataset that represents a date in ms, and sometimes its value is nan (actually my column is of type str, and sometimes its value is 'nan'). I want to compute, in days, the time elapsed since each date in this column. The problem is that when taking the difference of two dates:

(pd.to_datetime('now') - pd.to_datetime(np.nan)).days

if one of them is nan, it is converted to NaT, and the difference is of type NaTType, which has no attribute days.

In my case I would like to have nan as a result.
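For a single value, the desired behaviour can be sketched with an explicit missing-value check. The helper name `days_since` and its `now` parameter are hypothetical, introduced only for illustration; this is a minimal sketch, not the only way to do it.

```python
import numpy as np
import pandas as pd


def days_since(value, now=None):
    """Return whole days between `value` and `now`, or NaN if `value` is missing."""
    # Hypothetical helper: `now` defaults to the current local time.
    now = pd.Timestamp.now() if now is None else pd.Timestamp(now)
    ts = pd.to_datetime(value)  # NaN, None and NaT all come back as NaT
    if pd.isna(ts):
        return np.nan  # avoid touching .days on NaT
    return (now - ts).days


print(days_since(np.nan))
print(days_since("2015-01-01", now="2015-08-28"))
```

The check happens before `.days` is accessed, so the NaT case never reaches the attribute lookup that fails.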

Other approaches I have tried: np.datetime64 cannot be used, since it does not accept nan as an argument. My data cannot be converted to int, since int has no nan.

Ruggero Turra
  • Why not just filter the column first: `df.loc[df['date'].notnull(), 'days'] = (pd.to_datetime('now') -df['date']).days` – EdChum Aug 28 '15 at 11:11
  • I don't want to filter, I want to have all the entries, since actually I am creating a new column of my dataset. As I said I want `days=nan` as a result in these cases. – Ruggero Turra Aug 28 '15 at 11:13

2 Answers

10

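It will just work even if you filter first. The rows where the date is NaT are skipped by the assignment, so their `days` value is left as NaN: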

In [201]:
import datetime as dt
import pandas as pd

df = pd.DataFrame({'date':[dt.datetime.now(), pd.NaT, dt.datetime(2015,1,1)]})
df

Out[201]:
                        date
0 2015-08-28 12:12:12.851729
1                        NaT
2 2015-01-01 00:00:00.000000

In [203]:
df.loc[df['date'].notnull(), 'days'] = (pd.to_datetime('now') - df['date']).dt.days
df

Out[203]:
                        date  days
0 2015-08-28 12:12:12.851729    -1
1                        NaT   NaN
2 2015-01-01 00:00:00.000000   239
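Starting from the string column described in the question, the same idea can be sketched end to end. The column name `date_ms`, the sample values, and the fixed reference time are assumptions made up for illustration; in practice `pd.Timestamp.now()` would likely replace the fixed `now`.

```python
import pandas as pd

# Hypothetical data: epoch milliseconds stored as strings, with 'nan' for missing values.
df = pd.DataFrame({"date_ms": ["1420070400000", "nan", "1440720000000"]})

# errors="coerce" guarantees that anything non-numeric ends up as NaN.
ms = pd.to_numeric(df["date_ms"], errors="coerce")

# NaN becomes NaT here instead of raising.
df["date"] = pd.to_datetime(ms, unit="ms")

# Fixed reference time so the result is reproducible (an assumption for this sketch).
now = pd.Timestamp("2015-08-28 12:00:00")

# Series arithmetic propagates NaT, and .dt.days gives NaN for those rows.
df["days"] = (now - df["date"]).dt.days
print(df)
```

Because the subtraction is done on the whole Series and the days are read through the `.dt` accessor, no per-row filtering is needed to get NaN for the missing entries.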
EdChum
2

For me, upgrading from pandas 0.19.2 to pandas 0.20.3 resolved this error.

pip install --upgrade pandas
alif