I want to generate a time/date format string from my input data. Is there an easy way to do this?
My input data looks like this:
'01.12.2016 23:30:59,123'
So my code should generate the following format string:
'%d.%m.%Y %H:%M:%S,%f'
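A minimal sketch of what such a generator could look like, using only the standard library. The helper name guess_format is hypothetical, and it assumes the fields always appear in the order day, month, year, hour, minute, second, fraction. It cannot tell day-first from month-first dates on its own:

```python
import re
from datetime import datetime

# Assumption: digit groups appear in exactly this order (day-first, 24-hour clock).
_DIRECTIVES = ["%d", "%m", "%Y", "%H", "%M", "%S", "%f"]


def guess_format(sample):
    """Replace each run of digits with the next directive, keeping separators."""
    directives = iter(_DIRECTIVES)
    pieces = []
    # The capturing group makes re.split keep the digit runs in the result.
    for part in re.split(r"(\d+)", sample):
        if part.isdigit():
            directive = next(directives, None)
            if directive is None:
                raise ValueError("more digit groups than expected: %r" % sample)
            pieces.append(directive)
        else:
            # Escape literal percent signs so strptime does not misread them.
            pieces.append(part.replace("%", "%%"))
    return "".join(pieces)


sample = "01.12.2016 23:30:59,123"
fmt = guess_format(sample)
# Round-trip check: the generated format must parse the sample it came from.
parsed = datetime.strptime(sample, fmt)
```

For the sample above this yields the format string I currently hardcode, but any input with a different field order would need its own directive list.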
Background:
I use pandas.to_datetime() to generate datetime objects for further processing. This works great, but the function gets slow with a lot of data (more than roughly 50k rows) because it uses dateutil.parser.parse here. At the moment I pass the format string above, hardcoded in my code, to speed up to_datetime(), which also works great. Now I want to generate the format string in code to be more flexible regarding the input data.
edit (because the first two answers do not address my question):
I want to generate the format string, not the datetime string.
edit2:
New approach to formulating the question: I'm reading in a file with a lot of data. Every line of data has a timestamp in the following format: '01.12.2016 23:30:59,123'. I want to convert these timestamps into datetime objects. For this I'm currently using pandas.to_datetime(). This function works perfectly, but it gets slow because some of my files contain over 50k datasets. To speed this up, I pass a format string to the function: pandas.to_datetime(format='%d.%m.%Y %H:%M:%S,%f'). This is faster but less flexible. Therefore I want to determine the format string from the first dataset only and reuse it for the remaining 50k or more datasets.
How is this possible?
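For illustration, a minimal sketch of the workflow I have in mind: determine the format once from the first timestamp, then reuse it for all the others. The candidate list and the helper name detect_format are assumptions made up for this example, and it only recognises layouts that are listed explicitly:

```python
from datetime import datetime

# Hypothetical candidate list; it would have to cover every layout the files can use.
CANDIDATE_FORMATS = [
    "%d.%m.%Y %H:%M:%S,%f",
    "%Y-%m-%d %H:%M:%S.%f",
    "%Y-%m-%d %H:%M:%S",
    "%d/%m/%Y %H:%M:%S",
]


def detect_format(sample, candidates=CANDIDATE_FORMATS):
    """Return the first candidate format that parses the sample."""
    for fmt in candidates:
        try:
            datetime.strptime(sample, fmt)
        except ValueError:
            continue  # this layout does not match, try the next one
        return fmt
    raise ValueError("no known format matches %r" % sample)


timestamps = ["01.12.2016 23:30:59,123", "02.12.2016 00:00:01,005"]

# Detect once on the first dataset, then reuse the result for every row.
fmt = detect_format(timestamps[0])
parsed = [datetime.strptime(ts, fmt) for ts in timestamps]
```

The detected fmt could then be passed on as pandas.to_datetime(column, format=fmt), so the slow per-row guessing is avoided for the rest of the file.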