1

I have a df in Python with Dtype of df['timestamp'] as object

The actual data in the timestamp column consists of hourly date times:

01012011 0:00 
01012011 1:00  
01012011 2:00  
... 
01012011 22:00  
01012011 23:00  
01012011 24:00  
... 

etc.,

I want to convert the Dtype to datetime64 and tried df_1['timestamp'] = pd.to_datetime(df_1['timestamp']) but received the following error ParserError: month must be in 1..12: 01012011 0:00. I believe there is a simple solution but haven't managed to figure it out yet.

yungmist
  • 13
  • 5
  • 1
    You'll need to set the format for pd.to_datetime explicitly: `format='%d%m%Y %H:%S'` (make sure the order of %d and %m is correct; adjust if needed). – FObersteiner Apr 09 '21 at 17:13
  • Thank you, is it strange that ```pd.to_datetime('01012011 00:00',format='%d%m%Y %H:%M')``` works, whereas ```pd.to_datetime('06162013 19:00',format='%d%m%Y %H:%M')``` prompts ```ValueError: time data '06162013 19:00' does not match format '%d/%m/%Y %H:%M' (match)```. I think I am not understanding something. – yungmist Apr 09 '21 at 18:04
  • as I noted in parentheses: make sure the order of day and month is correct. `'06162013 19:00'` is month first, so format is `'%m%d%Y %H:%S'`. – FObersteiner Apr 09 '21 at 18:08
  • Wow, sorry about that. Thank you – yungmist Apr 09 '21 at 18:10
  • no worries, glad it helped. https://strftime.org/ is a lucid resource to get the formatting directives – FObersteiner Apr 09 '21 at 18:13

1 Answers1

0

EDIT: As suggested my MrFuppes, you can try the following:

import pandas as pd
from datetime import datetime
functools import partial
df = pd.DataFrame({'Time':['01012011 0:00 ']},index=['tst'])
to_datetime_fmt = partial(pd.to_datetime, format='%d/%m/%y %H:%M')
df['timestamp'] = df['timestamp'].apply(to_datetime_fmt)

anisoleanime
  • 409
  • 3
  • 10
  • The OP has a pandas DataFrame; you'll want to use pandas methods with that (see my comment on the question). – FObersteiner Apr 09 '21 at 17:15
  • Thank you @roshshetty, I tried both ```%H:%M``` and ```%k:%M``` for the format, and both returned ```NaT``` in the timestamp column. I believe ```%k:%M``` is the correct format since it is 24 hour without leading zeros, but I don't see why that isn't working. – yungmist Apr 09 '21 at 17:48