I would like to convert a pandas DataFrame column from datetime64 to Python datetime objects. This works for an individual value. In particular, the following works fine:

t = dt['time'].values[0]
datetime.utcfromtimestamp(t.astype(int)/1000000000)

However, when I try to do this for the entire column:

dt['datetime'] = dt['time'].apply(lambda x: datetime.utcfromtimestamp(x.astype(int)/1000000000))

I get the following error:

pandas/src/inference.pyx in pandas.lib.map_infer (pandas/lib.c:62578)()

<ipython-input-26-5950d82979b4> in <lambda>(x)
      1 print(type(dt['time'].values[0]))
      2 
----> 3 dt['datetime'] = dt['time'].apply(lambda x: datetime.utcfromtimestamp(x.astype(int)/1000000000))
      4 t = dt['time'].values[0]
      5 print(t)

AttributeError: 'Timestamp' object has no attribute 'astype'

What am I doing wrong? How can I convert my column to datetime and/or make a new column in datetime format?

Here is the info for the dataframe:

info

helloB

2 Answers

8

You can convert a Series of dtype datetime64[ns] to a NumPy array of datetime.datetime objects by calling the .dt.to_pydatetime() method:

In [75]: df.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 252 entries, 0 to 251
Data columns (total 1 columns):
time    252 non-null datetime64[ns]    <-- the `time` column has dtype `datetime64[ns]`
dtypes: datetime64[ns](1)
memory usage: 2.0 KB

In [77]: df.head()
Out[77]: 
        time
0 2009-01-02
1 2009-01-05
2 2009-01-06
3 2009-01-07
4 2009-01-08


In [76]: df['time'].dt.to_pydatetime()[:5]
Out[76]: 
array([datetime.datetime(2009, 1, 2, 0, 0),
       datetime.datetime(2009, 1, 5, 0, 0),
       datetime.datetime(2009, 1, 6, 0, 0),
       datetime.datetime(2009, 1, 7, 0, 0),
       datetime.datetime(2009, 1, 8, 0, 0)], dtype=object)

Note that NDFrames (such as Series and DataFrames) can only hold datetime-like objects as objects of dtype datetime64[ns]. The automatic conversion of all datetime-likes to a common dtype simplifies subsequent date computations, but it makes it impossible to store, say, Python datetime.datetime objects in a DataFrame column. Pandas core developer Jeff Reback explains:

"We don't allow direct conversions because its simply too complicated to keep anything other than datetime64[ns] internally (nor necessary at all)."
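To connect this back to the error in the question: `.values[0]` returns a `numpy.datetime64`, which has `.astype`, while `.apply` hands each element to the lambda as a `pandas.Timestamp`, which does not. The sketch below is only an illustration on a hypothetical two-row frame (the `df` here is made up, not the asker's data), and the resulting list of `datetime.datetime` objects is kept outside the DataFrame.

```python
from datetime import datetime

import numpy as np
import pandas as pd

# Hypothetical two-row frame standing in for the asker's data.
df = pd.DataFrame({"time": pd.to_datetime(["2009-01-02", "2009-01-05"])})

raw = df["time"].values[0]  # numpy.datetime64: this is what has .astype
ts = df["time"].iloc[0]     # pandas.Timestamp: what .apply passes to the lambda

# Element-wise conversion; iterating the Series yields Timestamp objects,
# and Timestamp.to_pydatetime() returns a plain datetime.datetime.
py_datetimes = [t.to_pydatetime() for t in df["time"]]
```

The list comprehension avoids relying on the return type of `.dt.to_pydatetime()`, which newer pandas releases may change.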

unutbu
  • Thank you. This is getting me part of the way there, but I want this datetime to be a column of the original data frame. How do I do that if you are producing an array? – helloB May 20 '16 at 20:17
  • Pandas works very hard to ensure that datetime-like objects in a DataFrame are converted to `datetime64[ns]` dtype. This has the benefit of corralling disparate datetime-like objects into a single data type which is good for computations. But it does make it (I think) impossible to store datetime-like objects in a dtype other than `datetime64[ns]` in a DataFrame column. If you need to work with Python `datetime.datetime`s, you have to keep them in a variable outside the DataFrame. – unutbu May 20 '16 at 20:28
  • The same issue arose [here](http://stackoverflow.com/a/31918181/190597) -- the OP wanted a Series of dtype `datetime64[D]` instead of `datetime.datetime`s but it was impossible for essentially the same reason. – unutbu May 20 '16 at 21:13
  • Pandas core developer, [Jeff Reback says](https://github.com/pydata/pandas/issues/6741#issuecomment-39026803), "We don't allow direct conversions because its simply too complicated to keep anything other than datetime64[ns] internally (nor necessary at all)." – unutbu May 20 '16 at 21:16
  • This answer should be accepted as the correct one. I spent A LOT of time trying to figure out why I can't change the DataFrame's column type from datetime64[ns] to datetime.datetime. Thanks for this information! @unutbu – Serendipity Apr 04 '17 at 15:40
0

Without your data set, I have to guess at some things. But you should be able to apply the same approach that you showed working to the whole column:

dt['datetime'] = datetime.utcfromtimestamp(dt['time'].values.astype(int)/1000000000)
piRSquared
  • Thanks for the suggestion, but this also produces an error: `TypeError: int() argument must be a string, a bytes-like object or a number, not 'Timestamp'` – helloB May 20 '16 at 19:45
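Whatever the exact cause of the error reported in that comment, `datetime.utcfromtimestamp` takes a single number rather than an array, so the epoch-seconds idea has to be applied one value at a time. A minimal sketch, assuming a tz-naive `time` column and a hypothetical two-row `df`; it uses `datetime.fromtimestamp(..., tz=timezone.utc)` in place of `utcfromtimestamp`, which is deprecated as of Python 3.12, and it drops any sub-second part.

```python
from datetime import datetime, timezone

import pandas as pd

# Hypothetical two-row frame standing in for the asker's data.
df = pd.DataFrame({"time": pd.to_datetime(["2009-01-02", "2009-01-05"])})

# Nanoseconds since the epoch as plain integers, then whole seconds.
ns = df["time"].values.astype("datetime64[ns]").astype("int64")
seconds = ns // 10**9

# Convert one scalar at a time; the result is a list of timezone-aware
# datetime.datetime objects in UTC.
utc_datetimes = [datetime.fromtimestamp(int(s), tz=timezone.utc) for s in seconds]
```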