3

I have this dataframe:

7542  08/01/2015 00:09:19
7543  08/01/2015 01:01:07
7544  08/01/2015 01:05:42
7545  08/01/2015 01:53:52
7546  08/01/2015 02:28:56

I want to convert it to datetime format, but when I use pd.to_datetime I get this error:

ValueError: to assemble mappings requires at least that [year, month, day] be specified: [day,month,year] is missing

I've tried pd.to_datetime(dataf.stack()).unstack() and I got a NaT.

Rohan Pillai
  • 917
  • 3
  • 17
  • 26
Carolina
  • 45
  • 1
  • 5

3 Answers3

2

After some playing around, I figured an easy way to do it:

sample_df = pd.DataFrame({'datetime':['08/01/2015 00:09:19', '08/01/2015 01:01:07', '08/01/2015 01:05:42']})
pd.to_datetime(sample_df.datetime, infer_datetime_format=True)

Yeah, it's that simple. I think to_datetime needs the data to be in a certain format, and passing the specific column of the dataframe helps it get that format.

Jordan A.
  • 384
  • 1
  • 4
  • 17
1

In addition to infer_datetime_format=True, you can also pass a format explicitly:

>>> pd.to_datetime(sample_df['datetime'], format='%m/%d/%Y %H:%M:%S') 
0   2015-08-01 00:09:19
1   2015-08-01 01:01:07
2   2015-08-01 01:05:42
Name: datetime, dtype: datetime64[ns]

I asked a question about this a little while back and there's a great answer from @AlexRiley: Inferring date format versus passing a parser.

Brad Solomon
  • 38,521
  • 31
  • 149
  • 235
0

Error means your code select DataFrame, so pd.to_datetime looking for year, month and day columns:

print (pd.to_datetime(dataf))

ValueError: to assemble mappings requires at least that [year, month, day] be specified: [day,month,year] is missing


Solution is select column date with parameter dayfirst=True if first number is day in to_datetime:

dataf['date'] = pd.to_datetime(dataf['date'], dayfirst=True)
print (dataf)
                    date
7542 2015-01-08 00:09:19
7543 2015-01-08 01:01:07
7544 2015-01-08 01:05:42
7545 2015-01-08 01:53:52
7546 2015-01-08 02:28:56

print (dataf.dtypes)
date    datetime64[ns]
dtype: object

Setup:

d = {'date': {7542: '08/01/2015 00:09:19', 7543: '08/01/2015 01:01:07', 7544: '08/01/2015 01:05:42', 7545: '08/01/2015 01:53:52', 7546: '08/01/2015 02:28:56'}}
dataf = pd.DataFrame(d)
print (dataf)
                     date
7542  08/01/2015 00:09:19
7543  08/01/2015 01:01:07
7544  08/01/2015 01:05:42
7545  08/01/2015 01:53:52
7546  08/01/2015 02:28:56
jezrael
  • 822,522
  • 95
  • 1,334
  • 1,252