Fastest way to parse a column to datetime in pandas

Question

I have a dataframe like the following, with more than 400,000 rows.

import pandas as pd

df = pd.DataFrame({'date' : ['03/02/2015 23:00',
'03/02/2015 23:30',
'04/02/2015 00:00',
'04/02/2015 00:30',
'04/02/2015 01:00',
'04/02/2015 01:30',
'04/02/2015 02:00',
'04/02/2015 02:30',
'04/02/2015 03:00',
'04/02/2015 03:30',
'04/02/2015 04:00',
'04/02/2015 04:30',
'04/02/2015 05:00',
'04/02/2015 05:30',
'04/02/2015 06:00',
'04/02/2015 06:30',
'04/02/2015 07:00']})

I am trying to parse the date column of a CSV file in pandas as fast as possible. I know how to do it with read_csv, but that takes a lot of time. I have also tried the following, which works but is also very slow:

df['dateTimeFormat'] = pd.to_datetime(df['date'], dayfirst=True)

How can I parse the date column to datetime as quickly and efficiently as possible?

Thank you very much for your help,

Pierre

score 10 · Accepted Answer · answered May 24 '18 at 10:56

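You can pass an explicit format to to_datetime, using the directives listed at http://strftime.org/, so pandas does not have to guess how each string is laid out: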

df = pd.concat([df] * 1000, ignore_index=True)


%timeit df['dateTimeFormat1'] = pd.to_datetime(df['date'],dayfirst=True)
2.94 s ± 285 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

%timeit df['dateTimeFormat2'] = pd.to_datetime(df['date'],format='%d/%m/%Y %H:%M') 
55 ms ± 1.47 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
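
As a quick sanity check on the timings above, the following minimal sketch confirms that the explicit format string reads the values day-first, the same way dayfirst=True does. The three sample strings are taken from the question; the variable names parsed and inferred are only illustrative.

```python
import pandas as pd

# A few day-first strings from the question (DD/MM/YYYY HH:MM).
df = pd.DataFrame({'date': ['03/02/2015 23:00',
                            '04/02/2015 00:30',
                            '04/02/2015 07:00']})

# Explicit format: %d = day, %m = month, %Y = 4-digit year, %H:%M = 24h time.
parsed = pd.to_datetime(df['date'], format='%d/%m/%Y %H:%M')

# The slower variant from the question, kept only for comparison.
inferred = pd.to_datetime(df['date'], dayfirst=True)

# Both approaches should agree on every row.
print((parsed == inferred).all())
print(parsed.iloc[0])
```

If the format string does not match a value, to_datetime raises an error by default rather than silently guessing, which is usually what you want on a large file.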

jezrael

@Peslier53 - Good news, glad I could help! – jezrael May 24 '18 at 11:03
Is there a way to integrate this parsing method directly into the read_csv function and assign the result to the index? – Peslier53 May 24 '18 at 11:09
@Peslier53 - Please give me some time. – jezrael May 24 '18 at 11:15
@Peslier53 - I found a solution, [this](https://stackoverflow.com/a/28863254), but I am not sure about its performance. – jezrael May 24 '18 at 11:20
"%z +0000 UTC offset in the form ±HHMM[SS[.ffffff]] (empty string if the object is naive)." But my pandas-produced file has the TZ offset in the format ±HH:MM. How do I specify that? – Ben Farmer May 24 '23 at 01:45
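
On the read_csv question raised in the comments above, one simple option (not necessarily what the linked answer does) is to let read_csv load the column as plain strings and then convert it with the explicit format before setting it as the index. The sketch below assumes a hypothetical two-column CSV held in the string csv_text; the column name value and the sample rows are made up for illustration.

```python
import io
import pandas as pd

# Hypothetical CSV content standing in for the real file.
csv_text = (
    "date,value\n"
    "03/02/2015 23:00,1\n"
    "03/02/2015 23:30,2\n"
    "04/02/2015 00:00,3\n"
)

# Read the date column as ordinary strings (no parse_dates), which is cheap.
df = pd.read_csv(io.StringIO(csv_text))

# Convert once with the explicit format, then use the result as the index.
df['date'] = pd.to_datetime(df['date'], format='%d/%m/%Y %H:%M')
df = df.set_index('date')

print(df.index)
```

With a real file, io.StringIO(csv_text) would be replaced by the file path. Newer pandas releases (2.0 and later) also accept a date_format argument in read_csv; whether that is faster on a given file is something to measure rather than assume.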
