17

Well this is embarrassing... I'm trying to create a good reproducible pandas example by giving you guys a small sample of my dataset. I thought this would be simple with df.to_dict() but to no avail.

df2 = df1[['DATE_FILLED','DAYS_SUPPLY']].head(5).copy()  # copy so the assignment below does not touch a view of df1
df2['DATE_FILLED'] = pd.to_datetime(df2['DATE_FILLED'])
diction = df2.to_dict()

output:

{'DATE_FILLED': {0: Timestamp('2016-12-28 00:00:00'),
                 1: Timestamp('2016-12-31 00:00:00'), 
                 2: Timestamp('2016-12-20 00:00:00'), 
                 3: Timestamp('2016-12-21 00:00:00'), 
                 4: Timestamp('2016-12-26 00:00:00')}, 
     'DAYS_SUPPLY': {0: 14, 1: 14, 2: 14, 3: 7, 4: 7}}

But if the community were to convert it to a dataframe by using the text:

import pandas as pd
from datetime import datetime
import time
d= pd.DataFrame({'DATE_FILLED': [Timestamp('2016-12-28 00:00:00'), Timestamp('2016-12-31 00:00:00'), Timestamp('2016-12-20 00:00:00'), Timestamp('2016-12-21 00:00:00'), Timestamp('2016-12-26 00:00:00')], 'DAYS_SUPPLY': [14, 14, 14, 7, 7]})

They would get NameError: name 'Timestamp' is not defined. I've tried importing various things and even tried playing around with the different orients in df.to_dict().

How do I either convert the Timestamps or better yet, create a DataFrame from them?

MattR

3 Answers

29

You need to import Timestamp from pandas:

>>> import pandas as pd
>>> from pandas import Timestamp
>>> d= pd.DataFrame({'DATE_FILLED': [Timestamp('2016-12-28 00:00:00'), Timestamp('2016-12-31 00:00:00'), Timestamp('2016-12-20 00:00:00'), Timestamp('2016-12-21 00:00:00'), Timestamp('2016-12-26 00:00:00')], 'DAYS_SUPPLY': [14, 14, 14, 7, 7]})
>>>
>>> d
  DATE_FILLED  DAYS_SUPPLY
0  2016-12-28           14
1  2016-12-31           14
2  2016-12-20           14
3  2016-12-21            7
4  2016-12-26            7
>>>

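In the future, you can use introspection to get a good hint about where a name lives. The module path printed below is what older pandas releases report; newer releases print a different internal path, and from pandas import Timestamp is the stable public import: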

>>> ts = d.to_dict()['DATE_FILLED'][0]
>>> type(ts)
<class 'pandas.tslib.Timestamp'>
>>> from pandas.tslib import Timestamp
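
As a version-independent variant of the introspection above, here is a minimal sketch that checks the class name rather than the internal module path. The one-row frame is a made-up stand-in for the question's data, and the variable names are only illustrative.

```python
import pandas as pd

# Hypothetical one-row stand-in for the frame in the question.
d = pd.DataFrame({"DATE_FILLED": pd.to_datetime(["2016-12-28"]),
                  "DAYS_SUPPLY": [14]})

ts = d.to_dict()["DATE_FILLED"][0]

# The class name is stable; the module path is an internal detail
# that differs between pandas versions, so it is only printed here.
class_name = type(ts).__name__
module_path = type(ts).__module__
print(class_name, module_path)

# The public alias pd.Timestamp is the same class, so there is no need
# to import from an internal module.
same_class = type(ts) is pd.Timestamp
print(same_class)
```

Whatever module path gets printed, pd.Timestamp (or from pandas import Timestamp) should be the import to reach for.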
juanpa.arrivillaga
8

You just need to import Timestamp:

import pandas as pd
from pandas import Timestamp

d = {'DATE_FILLED': {0: Timestamp('2016-12-28 00:00:00'),
                 1: Timestamp('2016-12-31 00:00:00'), 
                 2: Timestamp('2016-12-20 00:00:00'), 
                 3: Timestamp('2016-12-21 00:00:00'), 
                 4: Timestamp('2016-12-26 00:00:00')}, 
     'DAYS_SUPPLY': {0: 14, 1: 14, 2: 14, 3: 7, 4: 7}}



pd.DataFrame(d)
Out: 
  DATE_FILLED  DAYS_SUPPLY
0  2016-12-28           14
1  2016-12-31           14
2  2016-12-20           14
3  2016-12-21            7
4  2016-12-26            7
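
The question also asks how to convert the Timestamps. One possible approach, sketched here under the assumption that plain date strings are acceptable in the shared sample, is to format the column as text before calling to_dict() and let readers parse it back with pd.to_datetime. The names shareable and restored are illustrative only.

```python
import pandas as pd

# Same five rows as in the question.
df2 = pd.DataFrame({
    "DATE_FILLED": pd.to_datetime(["2016-12-28", "2016-12-31", "2016-12-20",
                                   "2016-12-21", "2016-12-26"]),
    "DAYS_SUPPLY": [14, 14, 14, 7, 7],
})

# Format the dates as plain strings so the printed dict contains no
# Timestamp(...) calls and can be pasted without extra imports.
shareable = df2.assign(
    DATE_FILLED=df2["DATE_FILLED"].dt.strftime("%Y-%m-%d")
).to_dict(orient="list")
print(shareable)

# Anyone pasting that dict can restore the datetime column afterwards.
restored = pd.DataFrame(shareable)
restored["DATE_FILLED"] = pd.to_datetime(restored["DATE_FILLED"])
```

The trade-off is that the pasted sample no longer carries the datetime dtype by itself, so the pd.to_datetime line has to be shared along with it.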
ayhan
2

import module binds only the module object itself. It doesn't enter the module's names into the global namespace, so you have to access them via module.name (here, pd.Timestamp after import pandas as pd). To enter a name into the global namespace, you need to use the from module import name syntax. In this case, use either from pandas import Timestamp, which enters Timestamp into the global namespace, or from pandas import *, which imports all of the public names in pandas into the global namespace. The wildcard form is generally discouraged because it can silently shadow names you already use.
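
The difference can be illustrated with a small sketch that evaluates the pasted text against two hand-built namespaces. The dictionaries module_only and with_name are hypothetical stand-ins for what each import style would leave in the global namespace.

```python
import pandas as pd

text = "Timestamp('2016-12-28 00:00:00')"

# Namespace holding only the module object, as after "import pandas as pd".
module_only = {"pd": pd}
try:
    eval(text, module_only)
    bare_name_found = True
except NameError:
    # The bare name Timestamp is not bound, which is the error from the question.
    bare_name_found = False

# The qualified name works with the module alone.
qualified = eval("pd." + text, module_only)

# Namespace holding the class itself, as after "from pandas import Timestamp".
with_name = {"Timestamp": pd.Timestamp}
ts = eval(text, with_name)

print(bare_name_found, qualified == ts)
```

So prefixing each Timestamp in the pasted dict with pd. is another way to make it work with only import pandas as pd.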

Denziloe