Python: How can I convert string to datetime without knowing the format?

Question

I have a field that comes in as a string and represents a time. Sometimes it's in 12-hour format, sometimes in 24-hour. Possible values:

8:26
08:26am
13:27

Is there a function that will convert these to a time format by being smart about it? Option 1 doesn't have am because it's in 24-hour format, while option 2 has a leading 0 and option 3 is obviously in 24-hour format. Is there a function in Python, or a library, that does:

time = func(str_time)

related: [Converting string into datetime](http://stackoverflow.com/q/466345/4279) — jfs, Jun 26 '15 at 13:17
+1 for the specific focus on "format not known" (i.e. confusion between e.g. dd/mm and mm/dd is not a concern). If it was known, `dateutil` would be an unreliable choice. — ivan_pozdeev, Jun 26 '15 at 16:51

score 42 · Accepted Answer · answered Jun 26 '15 at 06:59

Super short answer:

from dateutil import parser

# The date part is not in the string, so it defaults to today's date.
parser.parse("8:36pm")
# datetime.datetime(2015, 6, 26, 20, 36)
parser.parse("18:36")
# datetime.datetime(2015, 6, 26, 18, 36)

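dateutil is a third-party package rather than part of the standard library, but it is often already present as a dependency of other packages and can otherwise be installed with `python -m pip install python-dateutil`; there is no need for something as large as pandas.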

If you want to extract the time from the datetime object:

t = parser.parse("18:36").time()

which will give you a time object (if that's of more help to you). Or you can extract individual fields:

dt = parser.parse("18:36")
hours = dt.hour
minute = dt.minute

Marcus Müller


Can I do this without dateutil? Problem is I'm running on Google App Engine, and using libraries outside of Python STL is an issue. – Debnath Sinha Jun 26 '15 at 07:25
@DebnathSinha: Python STL? What's that? Also, if you know there are only three types of things and don't want to use an external library (although your question **specifically** asked for that), write the parser yourself. It's really not hard with `string.split(":")` and the like. – Marcus Müller Jun 26 '15 at 07:28
You're right, my bad, I had mentioned a library was ok. STL isn't the right term for Python, I was borrowing from my C++ days; I meant the standard library. The issue is that App Engine doesn't allow us to install any library, but only use the standard library. Is dateutil all Python code (no C)? If so, I might be able to include it from source in my source code tree rather than having to install it. Think that might work. Thanks! – Debnath Sinha Jun 26 '15 at 07:56
@DebnathSinha: I don't think dateutil is pure python. BTW, STL is only nearly the right term from your C++ days :) There's a nice article http://stackoverflow.com/questions/5205491/whats-this-stl-vs-c-standard-library-fight-all-about/5205571#5205571 – Marcus Müller Jun 26 '15 at 08:13
@DebnathSinha: Might be that dateutil is python only, http://bazaar.launchpad.net/~dateutil/dateutil/trunk/files – Marcus Müller Jun 26 '15 at 08:25
@DebnathSinha: the term that is used in Python documentation is "stdlib". – jfs Jun 26 '15 at 13:16
@DebnathSinha Remember that a python module is just a set of files. If you can't install them the normal way, you can just copy them somewhere and make sure that location is in `PYTHONPATH`/`sys.path`. In fact, this was the case with Python itself before [`setuptools` reared its ugly head ;-)](https://mail.python.org/pipermail/python-dev/2008-March/077964.html) – ivan_pozdeev Jun 26 '15 at 16:22
@ivan_pozdeev: not true, there are Python modules that also consist of compiled code that links against CPython. – Marcus Müller Jun 26 '15 at 20:43
@MarcusMüller same for compiled code, just make sure you've got the version compiled against the same runtime as the one you use. Another matter is it's not true _in this particular case_ since [Google App Engine prohibits C extensions](https://cloud.google.com/appengine/docs/python/). – ivan_pozdeev Jun 28 '15 at 01:30
@ivan_pozdeev: I agree, still just a bunch of files, but "compiled against the same runtime" is not strong enough: it needs to be compiled against the same runtime, the same libc, and all the other libraries that are used on the target platform. So it's "a set of files", containing so much information about the environment it's about to be deployed in, the most likely way is installation through the same mechanism other software packages are installed on the target platform, or directly linking against all the libraries on the target. – Marcus Müller Jun 28 '15 at 20:13
Hi, I've found this answer trying to solve the same problem, but can't get it to work. After using `from dateutil import parser`, Visual Studio 2017 says "No module named 'dateutil'". Any idea how to import this module? – Lou Jun 15 '20 at 09:18
@Lou although unusual, your installation might not have the dateutil module. Can't tell you how to install it on your platform, because I don't know your platform. – Marcus Müller Jun 15 '20 at 12:56
You have to install dateutil. It does not come with your python installation. `python -m pip install python-dateutil` – Noah May May 06 '23 at 02:19
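
The standard-library-only route raised in the comments above could be sketched roughly as follows. This is a minimal sketch, not part of the original answer: the helper name `parse_time` and the list of candidate formats are assumptions chosen to cover the three sample inputs from the question, and `%p` depends on the locale's AM/PM spelling (English is assumed here).

```python
from datetime import datetime, time

# Hypothetical list of formats, tried in order; extend it for other inputs.
CANDIDATE_FORMATS = ("%I:%M%p", "%I:%M %p", "%H:%M")


def parse_time(text: str) -> time:
    """Return the first successful strptime match as a datetime.time."""
    cleaned = text.strip().upper()
    for fmt in CANDIDATE_FORMATS:
        try:
            return datetime.strptime(cleaned, fmt).time()
        except ValueError:
            continue  # this format did not match, try the next one
    raise ValueError(f"unrecognized time string: {text!r}")


for sample in ("8:26", "08:26am", "13:27"):
    print(sample, "->", parse_time(sample))
```

The 12-hour formats are tried first so that a trailing am/pm is never silently ignored; strings without a marker fall through to the 24-hour format.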

score 15 · Answer 2 · answered Jun 26 '15 at 06:58

There is such a function in pandas:

import pandas as pd
d = pd.to_datetime('<date_string>')

sachin saxena


Eric Zhang · Answer 3 · 2022-11-16T02:51:55.043

Use a regex to cut the string into ['year', 'month', 'day', 'hour', 'minutes', 'seconds'], then unpack the pieces into the datetime constructor, datetime.datetime(year, month, day, hour=0, minute=0, second=0, microsecond=0, tzinfo=None, *, fold=0). This only works when the fields appear in that year-first order and are zero-padded, but it is the fastest approach I have tested so far.

    import re
    import datetime
    import timeit

    import pandas as pd
    from dateutil import parser  # used by the speed test below

    def date2timestamp_anyformat(format_date):
        # Join all digit runs; assumes year-first order with zero-padded fields.
        numbers = ''.join(re.findall(r'\d+', format_date))
        if len(numbers) == 8:
            d = datetime.datetime(int(numbers[:4]), int(numbers[4:6]), int(numbers[6:8]))
        elif len(numbers) == 14:
            d = datetime.datetime(int(numbers[:4]), int(numbers[4:6]), int(numbers[6:8]), int(numbers[8:10]), int(numbers[10:12]), int(numbers[12:14]))
        elif len(numbers) > 14:
            # Right-pad the fractional digits to microseconds ('234' -> 234000).
            microsecond = int(numbers[14:20].ljust(6, '0'))
            d = datetime.datetime(int(numbers[:4]), int(numbers[4:6]), int(numbers[6:8]), int(numbers[8:10]), int(numbers[10:12]), int(numbers[12:14]), microsecond=microsecond)
        else:
            raise ValueError(f'length does not match: {format_date}')
        return d.timestamp()

And a speed test:

    # Note: the [:-1] slice drops the seconds field ('44.234'), so 'regex cut' ignores seconds.
    print('regex cut:\n',timeit.timeit(lambda: datetime.datetime(*map(int, re.split(r'-|:|\s', '2022-08-13 12:23:44.234')[:-1])).timestamp(), number=10000))
    print('pandas to_datetime:\n', timeit.timeit(lambda: pd.to_datetime('2022-08-13 12:23:44.234').timestamp(), number=10000))
    print('datetime with known format:\n',timeit.timeit(lambda: datetime.datetime.strptime('2022-08-13 12:23:44.234', '%Y-%m-%d %H:%M:%S.%f').timestamp(), number=10000))
    print('regex get number first:\n',timeit.timeit(lambda: date2timestamp_anyformat('2022-08-13 12:23:44.234'), number=10000))
    print('dateutil parse:\n', timeit.timeit(lambda: parser.parse('2022-08-13 12:23:44.234').timestamp(), number=10000))

Result (seconds for 10,000 calls each):

regex cut:
 0.040550945326685905
pandas to_datetime:
 0.8012433210387826
datetime with known format:
 0.09105705469846725
regex get number first:
 0.04557646345347166
dateutil parse:
 0.6404162347316742
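
The "cut and unpack" idea from this answer can also be written so that the seconds and the fractional part are kept. The following is only a sketch under stated assumptions, not the answer author's code: `parse_year_first` is a hypothetical helper, it assumes year-first field order and no timezone offset in the string, and it has not been timed against the variants above.

```python
import re
from datetime import datetime


def parse_year_first(text: str) -> datetime:
    """Split on non-digits and unpack the fields into datetime (hypothetical helper)."""
    fields = re.findall(r"\d+", text)
    if not 3 <= len(fields) <= 7:
        raise ValueError(f"expected 3 to 7 numeric fields: {text!r}")
    microsecond = 0
    if len(fields) == 7:
        # Treat the seventh field as fractional seconds: '234' -> 234000 microseconds.
        microsecond = int(fields.pop()[:6].ljust(6, "0"))
    return datetime(*map(int, fields), microsecond=microsecond)


print(parse_year_first("2022-08-13 12:23:44.234"))
print(parse_year_first("2022-8-3"))
```

Because it splits into fields instead of slicing fixed digit positions, this variant does not require zero-padded months, days, or hours.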
