I have been working through the options listed in "Converting between datetime, Timestamp and datetime64"; however, numpy's isnat() does not seem to recognize a datetime object, or I'm missing some other kind of datetime object that the function requires as input.

Here's an overview of the dataframe:

>>> time_data.head()
     Date          Name               In AM              Out AM  \
0  2017-12-04  AUSTIN LEWIS 1900-01-01 07:03:11 1900-01-01 12:01:50   
1  2017-12-05  AUSTIN LEWIS 1900-01-01 05:24:07 1900-01-01 12:08:21   
2  2017-12-06  AUSTIN LEWIS 1900-01-01 11:58:32                 NaT   
3  2017-12-07  AUSTIN LEWIS 1900-01-01 08:31:23 1900-01-01 12:49:51   
4  2017-12-11  AUSTIN LEWIS 1900-01-01 06:55:21 1900-01-01 12:02:08   

            In PM              Out PM Sick Time  
0 1900-01-01 12:28:52 1900-01-01 17:34:53       NaT  
1 1900-01-01 12:35:12 1900-01-01 16:15:17       NaT  
2                 NaT 1900-01-01 23:59:01       NaT  
3 1900-01-01 13:18:34 1900-01-01 18:10:35       NaT  
4 1900-01-01 12:30:49 1900-01-01 17:39:54       NaT  

>>> time_data.dtypes
Date                 object
Name                 object
In AM        datetime64[ns]
Out AM       datetime64[ns]
In PM        datetime64[ns]
Out PM       datetime64[ns]
Sick Time    datetime64[ns]
dtype: object

>>> type(time_data['In AM'][3])
<class 'pandas._libs.tslib.Timestamp'>

>>> type(time_data['In AM'][3].to_datetime())
<type 'datetime.datetime'>

if np.isnat(time_data['Out AM'][row].to_datetime()) & np.isnat(time_data['In PM'][row].to_datetime()):

This throws "ValueError: ufunc 'isnat' is only defined for datetime and timedelta".

What am I missing here?!

Andrew Pederson

2 Answers

Ugh, that's a really bad error message! np.isnat ("is not-a-time") only works with numpy's own datetime types. The typical use for a ufunc is with an array of np.datetime64 or np.timedelta64 dtype:

>>> from datetime import datetime
>>> import numpy as np
>>> dt = datetime.now()
>>> np.isnat(np.array([dt], dtype=np.datetime64))
array([False])
>>> np.isnat(np.array([dt], dtype=object))
TypeError: ufunc 'isnat' is only defined for datetime and timedelta.

Refer to the docs for supported input types.
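
For a single pandas value rather than an array, one option is to hand np.isnat a numpy scalar explicitly. The following is a minimal sketch, assuming a reasonably recent pandas where Timestamp.to_datetime64() and NaT.to_datetime64() are available; the sample timestamp is made up for illustration and is not read from the question's frame.

```python
import numpy as np
import pandas as pd

# Hypothetical sample value, not taken from the question's data.
stamp = pd.Timestamp("1900-01-01 08:31:23")

# Timestamp.to_datetime64() returns a numpy.datetime64 scalar,
# which is a type np.isnat accepts.
as_numpy = stamp.to_datetime64()
print(type(as_numpy))
print(np.isnat(as_numpy))

# pd.NaT converts to numpy's own NaT the same way.
nat_as_numpy = pd.NaT.to_datetime64()
print(np.isnat(nat_as_numpy))
```

The point of the sketch is only that the conversion happens before the ufunc sees the value, so np.isnat never receives a datetime.datetime or a pandas scalar.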

wim
  • I was just making the same observation, +1: `>>> import numpy as np >>> np.isnat(np.datetime64("NaT")) True >>> nat = np.datetime64("NaT") >>> nat numpy.datetime64('NaT')` @OP Just convert them using `np.datetime64`. – F. Leone Jan 10 '18 at 19:22
  • Bizarrely, this works on everything except NaT! `>>> np.isnat(np.datetime64(time_data['Out AM'][2])) Traceback (most recent call last): File "", line 1, in np.isnat(np.datetime64(time_data['Out AM'][2])) ValueError: cannot convert float NaN to integer >>> time_data['Out AM'][2] NaT` – Andrew Pederson Jan 10 '18 at 21:03
  • `np.isnat(np.datetime64('NaT'))` returns True for me. Are you sure you have NaT in there and not NaN? – wim Jan 10 '18 at 21:06
  • Yup. Certain. Depending on how I index and what conversions I use, the field seems to jump from datetime.datetime to pandas Timestamp to numpy datetime64. Ugh. – Andrew Pederson Jan 10 '18 at 21:32

You could also convert everything up front by calling pd.to_datetime on the columns you want as datetimes:

import numpy as np
import pandas as pd

df = pd.DataFrame({
    'date' : [
        '2017-12-04',
        '2017-12-05',
        '2017-12-06',
        '2017-12-07',
        '2017-12-11'
    ],
    'name' : ['AUSTIN LEWIS'] * 5,
    'in_am' : [
        '1900-01-01 07:03:11',
        '1900-01-01 05:24:07',
        '1900-01-01 11:58:32',
        '1900-01-01 08:31:23',
        '1900-01-01 06:55:21'
    ],
    'out_am' : [
        '1900-01-01 12:01:50',
        '1900-01-01 12:08:21',
        '',
        '1900-01-01 12:49:51',
        '1900-01-01 12:02:08'
    ],
    'in_pm' : [
        '1900-01-01 12:28:52',
        '1900-01-01 12:35:12',
        '',
        '1900-01-01 13:18:34',
        '1900-01-01 12:30:49'
    ],
    'out_pm' : [
        '1900-01-01 17:34:53',
        '1900-01-01 16:15:17',
        '1900-01-01 23:59:01',
        '1900-01-01 18:10:35',
        '1900-01-01 17:39:54'
    ],
    'sick_time' : [''] * 5
})

# all dtypes should be object
df.dtypes

# convert to datetimes
for col in df.columns.drop('name').tolist():
    df[col] = pd.to_datetime(df[col])

# only name should still be object
df.dtypes

# np.isnat should now work
np.isnat(df.loc[:, df.dtypes == 'datetime64[ns]'])
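
Once the columns are real datetimes, the row-by-row `if` from the question can be expressed as a boolean mask instead. This is a rough sketch rather than the asker's actual code: the three-row frame and the names `times` and `both_missing` are made up for illustration, and it uses pandas' own Series.isna() in place of np.isnat.

```python
import pandas as pd

# Hypothetical three-row frame mirroring two of the question's columns.
times = pd.DataFrame({
    "out_am": pd.to_datetime(["1900-01-01 12:01:50", None, None]),
    "in_pm": pd.to_datetime(["1900-01-01 12:28:52", None, "1900-01-01 12:30:49"]),
})

# True only where both punches are missing (NaT) in the same row.
both_missing = times["out_am"].isna() & times["in_pm"].isna()

# Select the rows that satisfy the condition.
print(times[both_missing])
```

Only the middle row has NaT in both columns, so it is the only row selected.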

Ian Thompson
  • Ahhh, finally! Thank you @Ian-thompson! I had already used `time_data[nt] = time_data[nt].apply(pd.to_datetime, format='%H:%M:%S.%f')` to convert all the datetime columns, though for some reason np.isnat() only likes .loc indexing :shrug: `>>> np.isnat(time_data.loc[2:2, 'Out AM']) 2 True Name: Out AM, dtype: bool >>> time_data.loc[2, 'Out AM'] NaT >>> np.isnat(time_data.loc[2, 'Out AM']) Traceback (most recent call last): File "", line 1, in np.isnat(time_data.loc[2, 'Out AM']) ValueError: ufunc 'isnat' is only defined for datetime and timedelta.` – Andrew Pederson Jan 10 '18 at 21:07
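
The difference in that last comment seems to come down to what each lookup returns: a label slice gives back a Series backed by a datetime64 array, while a single label gives back a pandas scalar (pd.NaT here), which np.isnat does not accept. A small sketch of that distinction, using a made-up two-element Series named `out_am` rather than the asker's frame, and assuming pd.isna as the scalar-friendly check:

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for one datetime column with a missing value.
out_am = pd.Series(pd.to_datetime(["1900-01-01 12:01:50", None]))

scalar = out_am.loc[1]    # single label: a pandas scalar
sliced = out_am.loc[1:1]  # label slice: a one-element Series

# The scalar is pandas' NaT singleton, not a numpy.datetime64.
print(type(scalar))

# The slice's underlying datetime64 array is valid input for np.isnat.
print(np.isnat(sliced.to_numpy()))

# pd.isna accepts the pandas scalar directly.
print(pd.isna(scalar))
```

Calling .to_numpy() here is a deliberate choice to make the datetime64 input explicit; it is not required by the original comment's example.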