27

I have a dataframe which has timestamp and its datatype is object.

0    2020-07-09T04:23:50.267Z
1    2020-07-09T11:21:55.536Z
2    2020-07-09T11:23:18.015Z
3    2020-07-09T04:03:28.581Z
4    2020-07-09T04:03:33.874Z
Name: timestamp, dtype: object

I am not aware of the format of the datetime in the above dataframe. I applied pd.to_datetime to the above column where the datatype is changed as datetime64[ns, UTC].

df['timestamp'] = pd.to_datetime(df.timestamp)

Now the dataframe looks in this way,

0   2020-07-09 04:23:50.267000+00:00
1   2020-07-09 11:21:55.536000+00:00
2   2020-07-09 11:23:18.015000+00:00
3   2020-07-09 04:03:28.581000+00:00
4   2020-07-09 04:03:33.874000+00:00
Name: timestamp, dtype: datetime64[ns, UTC]

I want to convert the above datetime64[ns, UTC] format to normal datetime.

For example,
2020-07-09 04:23:50.267000+00:00  to 2020-07-09 04:23:50

Can anyone explain me what is the meaning of this 2020-07-09T04:23:50.267Z representation and also how to convert this into datetime object?

merkle
  • 1,585
  • 4
  • 18
  • 33
  • 2
    this will help https://stackoverflow.com/questions/16628819/convert-pandas-timezone-aware-datetimeindex-to-naive-timestamp-but-in-certain-t – deadshot Jul 15 '20 at 15:04
  • `2020-07-09T04:23:50.267Z` is an [ISO8601](https://en.wikipedia.org/wiki/ISO_8601) date/time string. `Z` stands for zulu time and means UTC. – FObersteiner Jul 15 '20 at 15:55

2 Answers2

53

To remove timezone, use tz_localize:

df['timestamp'] = pd.to_datetime(df.timestamp).dt.tz_localize(None)

Output:

                timestamp
0 2020-07-09 04:23:50.267
1 2020-07-09 11:21:55.536
2 2020-07-09 11:23:18.015
3 2020-07-09 04:03:28.581
4 2020-07-09 04:03:33.874
Quang Hoang
  • 146,074
  • 10
  • 56
  • 74
  • 1
    since the input already refers to UTC, I'd suggest to convert to None, not localize, see my answer [here](https://stackoverflow.com/a/62656878/10197418). – FObersteiner Jul 15 '20 at 15:58
4

Return of to_datetime depends [confusingly to me] on the type of input:

list-like: DatetimeIndex
Series: Series of datetime64 dtype
scalar: Timestamp

So the following fails

df["Time"] = pd.to_datetime(df["StringArray"])
xm = df["Time"] < pd.to_datetime("12/29/2020  9:09:37 PM")

but the following works just fine

df["Time"] = pd.to_datetime(df["StringArray"])
xm = df["Time"] < pd.to_datetime("12/29/2020  9:09:37 PM", utc=True)

This may help you avoid timezone problems. Regards,

Tunneller
  • 381
  • 2
  • 13
  • I am struggling to make a comparison using np.isnat work if df["Time"] has any values of NaT in datetime64[ns, UTC]. (if `xm = ((np.isnat(df["Time"])) | (df["Time"] < pd.to_datetime("12/29/2020 9:09:37 PM", utc=True)))` I receive `TypeError: ufunc 'isnat' is only defined for np.datetime64 and np.timedelta64.` so, tz_localize may be required? (note that pandas has a pd.NaT constant, but not a pd.isnat() or pd.Timestamp.isnat() function) – mpag Jul 07 '23 at 18:41