5

In my pandas DataFrame, I have some date values which I converted from a timestamp to datetime, using the datetime module. Printing out the DataFrame looks good, but when I convert the DataFrame to a dictionary using to_dict(), the datetime values appear to be of the pandas Timestamp type.

My question is similar to this question, but I have no pandas Timestamp object at the beginning and I want to keep the datetime I created in the first place. The proposed solutions did not work for me either, as the output was still a pandas Timestamp, but of dtype object.

Input:

import pandas as pd
from datetime import datetime

list_in = [
    {"name": "a", "created": 1574947970000}, 
    {"name": "b", "created": 1581322065000}, 
    {"name": "c", "created": 1578929177000}
]

Then I create a DataFrame and convert each timestamp to a datetime format. (Division by 1000 is necessary, because the timestamp is in milliseconds and should be seconds)

df = pd.DataFrame(list_in)
df["created"] = [datetime.fromtimestamp(x / 1000) for x in df["created"]]
print(df)

Output:

  name             created
0    a 2019-11-28 14:32:50
1    b 2020-02-10 09:07:45
2    c 2020-01-13 16:26:17
name               object
created    datetime64[ns]
dtype: object

This output shows the correct date values as datetime objects. In my project I do some other magic on the DataFrame which is irrelevant for this topic. But at the end I want to convert the DataFrame back to a list of dictionaries like this:

list_out = df.to_dict(orient="records")

Final output:

[{'name': 'a', 'created': Timestamp('2019-11-28 14:32:50')}, 
{'name': 'b', 'created': Timestamp('2020-02-10 09:07:45')}, 
{'name': 'c', 'created': Timestamp('2020-01-13 16:26:17')}]

Final desired output:

[{'name': 'a', 'created': datetime.datetime(2019, 11, 28, 14, 32, 50)}, 
{'name': 'b', 'created': datetime.datetime(2020, 2, 10, 9, 7, 45)}, 
{'name': 'c', 'created': datetime.datetime(2020, 1, 13, 16, 26, 17)}]

So my question is actually, why does the to_dict() return the datetime values as pandas Timestamp objects? And how can I prevent that from happening?

Andreas
  • 53
  • 1
  • 4

1 Answers1

4

The why can be found here.

I don't know how to prevent it from happening but you could convert the Timestamps to datetime64 aftwards:

for rec in list_out:
    rec['created'] = rec['created'].to_datetime64()
Martin
  • 199
  • 2
  • 6
  • Thank you @Martin. I used `.to_pydatetime()` instead, as your solution returns a numpy.datetime64, instead of datetime.datetime – Andreas Oct 02 '20 at 12:52
  • Hi @Martin, I am not sure to understand how the Timestamp doc page explains "why" that happens. Any insight on that ? – pierresegonne Sep 10 '21 at 17:16
  • 1
    I think I was referring to "Timestamp is the pandas equivalent of python’s Datetime and is interchangeable with it in most cases." But that doesn't really explain why it's converted by `to_dict`. I don't really remember – Martin Sep 11 '21 at 18:49
  • Ok, thanks for the answer, your proposition does work correctly, but if someone finds a more elegant answer, I would love to hear it! – pierresegonne Sep 12 '21 at 19:38