28

I have a field that comes in as a string and represents a time. Sometimes it's in 12-hour format, sometimes in 24-hour format. Possible values:

  1. 8:26
  2. 08:26am
  3. 13:27

Is there a function that will convert these to a time format by being smart about it? Option 1 doesn't have "am" because it's in 24-hour format, option 2 has a leading 0, and option 3 is obviously in 24-hour format. Is there a function in Python or a library that does:

time = func(str_time)
Mazdak
Debnath Sinha
  • related: [Converting string into datetime](http://stackoverflow.com/q/466345/4279) – jfs Jun 26 '15 at 13:17
  • +1 for the specific focus on "format not known" (i.e. confusion between e.g. dd/mm and mm/dd is not a concern). If it was known, `dateutil` would be an unreliable choice. – ivan_pozdeev Jun 26 '15 at 16:51

3 Answers

42

Super short answer:

>>> from dateutil import parser
>>> parser.parse("8:36pm")
datetime.datetime(2015, 6, 26, 20, 36)
>>> parser.parse("18:36")
datetime.datetime(2015, 6, 26, 18, 36)

dateutil is not part of the standard library, but it is a small install (`python -m pip install python-dateutil`); there is no need for something as large as pandas.

If you want to extract the time from the datetime object:

t = parser.parse("18:36").time()

which will give you a time object (if that's of more help to you). Or you can extract individual fields:

dt = parser.parse("18:36")
hours = dt.hour
minute = dt.minute
Marcus Müller
  • Can I do this without dateutil? Problem is I'm running on Google App Engine, and using libraries outside of Python STL is an issue. – Debnath Sinha Jun 26 '15 at 07:25
  • @DebnathSinha: Python STL? what's that? Also, if you know there's only three types of things and don't want to use an external library (although your question **specifically** asked for that), write the parser yourself. It's really not hard with `string.split(":")` and the likes. – Marcus Müller Jun 26 '15 at 07:28
  • You're right, my bad, I had mentioned a library was OK. STL isn't the right term for Python; I was borrowing from my C++ days and meant the standard library. The issue is that App Engine doesn't allow us to install any library, only to use the standard library. Is dateutil all Python code (no C)? If so, I might be able to include it from source in my source code tree rather than having to install it. Think that might work. Thanks! – Debnath Sinha Jun 26 '15 at 07:56
  • @DebnathSinha: I don't think dateutil is pure python. BTW, STL is only nearly the right term from your C++ days :) There's a nice article http://stackoverflow.com/questions/5205491/whats-this-stl-vs-c-standard-library-fight-all-about/5205571#5205571 – Marcus Müller Jun 26 '15 at 08:13
  • @DebnathSinha: Might be that dateutil is python only, http://bazaar.launchpad.net/~dateutil/dateutil/trunk/files – Marcus Müller Jun 26 '15 at 08:25
  • @DebnathSinha: the term that is used in Python documentation is "stdlib". – jfs Jun 26 '15 at 13:16
  • @DebnathSinha Remember that a python module is just a set of files. If you can't install them the normal way, you can just copy them somewhere and make sure that location is in `PYTHONPATH`/`sys.path`. In fact, this was the case with Python itself before [`setuptools` reared its ugly head ;-)](https://mail.python.org/pipermail/python-dev/2008-March/077964.html) – ivan_pozdeev Jun 26 '15 at 16:22
  • @ivan_pozdeev: not true, there's python modules that also consist of compiled code that links against cpython. – Marcus Müller Jun 26 '15 at 20:43
  • @MarcusMüller same for compiled code, just make sure you've got the version compiled against the same runtime as the one you use. Another matter is it's not true _in this particular case_ since [Google App Engine prohibits C extensions](https://cloud.google.com/appengine/docs/python/). – ivan_pozdeev Jun 28 '15 at 01:30
  • @ivan_pozdeev: I agree, still just a bunch of files, but "compiled against the same runtime" is not strong enough: it needs to be compiled against the same runtime, the same libc, and all the other libraries that are used on the target platform. So it's "a set of files", containing so much information about the environment it's about to be deployed in, the most likely way is installation through the same mechanism other software packages are installed on the target platform, or directly linking against all the libraries on the target. – Marcus Müller Jun 28 '15 at 20:13
  • Hi, I've found this answer trying to solve the same problem, but can't get it to work. After using `from dateutil import parser`, Visual Studio 2017 says "No module named 'dateutil'". Any idea how to import this module? – Lou Jun 15 '20 at 09:18
  • @Lou although unusual, your installation might not have the dateutil module. Can't tell you how to install it on your platform, because I don't know your platform. – Marcus Müller Jun 15 '20 at 12:56
  • You have to install dateutil. It does not come with your python installation. `python -m pip install python-dateutil` – Noah May May 06 '23 at 02:19
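
For the standard-library-only route raised in the comments above, one option is to try a short list of `datetime.strptime` formats in turn. This is a minimal sketch and not part of the original answer: the function name `parse_clock_time` and the format list are assumptions chosen to cover the three inputs from the question, and `%p` matching depends on the locale (English am/pm is assumed here).

```python
from datetime import datetime, time

# Candidate formats, tried in order (hypothetical list covering the question's inputs).
# The 12-hour format goes first so an am/pm suffix is honoured when present.
_FORMATS = ("%I:%M%p", "%H:%M")


def parse_clock_time(text: str) -> time:
    """Return a datetime.time for strings such as '8:26', '08:26am' or '13:27'."""
    cleaned = text.strip()
    for fmt in _FORMATS:
        try:
            return datetime.strptime(cleaned, fmt).time()
        except ValueError:
            continue  # this format did not match; try the next one
    raise ValueError(f"unrecognised time string: {text!r}")
```

Calling `parse_clock_time("13:27")` should fall through the 12-hour format and succeed on `%H:%M`. Anything outside the listed formats raises `ValueError` rather than being guessed at, which is the main behavioural difference from `dateutil`.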
15

There is one such function in pandas:

import pandas as pd
d = pd.to_datetime('<date_string>')
sachin saxena
0

Use a regex to pull the digits out of the string, slice them into year, month, day, hour, minute, and second fields, and pass those to `datetime.datetime(year, month, day, hour=0, minute=0, second=0, microsecond=0, tzinfo=None, *, fold=0)`. This is the fastest approach I have tested so far.

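    import re
    import datetime
    import timeit

    import pandas as pd
    from dateutil import parser  # used by the dateutil benchmark

    def date2timestamp_anyformat(format_date):
        # Keep only the digits; assumes zero-padded fields in year-first order.
        numbers = ''.join(re.findall(r'\d+', format_date))
        if len(numbers) == 8:  # YYYYMMDD
            d = datetime.datetime(int(numbers[:4]), int(numbers[4:6]), int(numbers[6:8]))
        elif len(numbers) == 14:  # YYYYMMDDhhmmss
            d = datetime.datetime(int(numbers[:4]), int(numbers[4:6]), int(numbers[6:8]), int(numbers[8:10]), int(numbers[10:12]), int(numbers[12:14]))
        elif len(numbers) > 14:  # fractional seconds present
            # Pad the fraction to 6 digits (microseconds); any further digits are dropped.
            d = datetime.datetime(int(numbers[:4]), int(numbers[4:6]), int(numbers[6:8]), int(numbers[8:10]), int(numbers[10:12]), int(numbers[12:14]), microsecond=int(numbers[14:20].ljust(6, '0')))
        else:
            raise ValueError(f'unexpected digit count in {format_date!r}')
        # For a naive datetime, timestamp() interprets the value as local time.
        return d.timestamp()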
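
To see why the function above branches on the digit count, it helps to look at what the digit-extraction step produces. The sketch below is illustrative only; the helper name `digits_only` is hypothetical. It shows the key assumption behind the approach: fields must be zero-padded and ordered year, month, day, hour, minute, second.

```python
import re


def digits_only(text: str) -> str:
    """Concatenate every run of digits in text, discarding separators."""
    return "".join(re.findall(r"\d+", text))


samples = [
    "2022-08-13",               # date only
    "2022/08/13 12:23:44",      # date and time, different separator
    "2022-08-13 12:23:44.234",  # with milliseconds
    "2022-8-13",                # month not zero-padded
]
for sample in samples:
    digits = digits_only(sample)
    print(f"{sample!r:28} -> {digits} ({len(digits)} digits)")
```

The last sample yields only 7 digits, so a date without zero padding is rejected by the length check instead of being parsed. A day-first or month-first string would pass the length check and be silently misread, so this technique only suits input whose field order is known.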

Speed test:

    # Note: the [:-1] slice drops the seconds field ('44.234'), so "regex cut" parses less than the other variants.
    print('regex cut:\n', timeit.timeit(lambda: datetime.datetime(*map(int, re.split(r'-|:|\s', '2022-08-13 12:23:44.234')[:-1])).timestamp(), number=10000))
    print('pandas to_datetime:\n', timeit.timeit(lambda: pd.to_datetime('2022-08-13 12:23:44.234').timestamp(), number=10000))
    print('datetime with known format:\n', timeit.timeit(lambda: datetime.datetime.strptime('2022-08-13 12:23:44.234', '%Y-%m-%d %H:%M:%S.%f').timestamp(), number=10000))
    print('regex get number first:\n', timeit.timeit(lambda: date2timestamp_anyformat('2022-08-13 12:23:44.234'), number=10000))
    print('dateutil parse:\n', timeit.timeit(lambda: parser.parse('2022-08-13 12:23:44.234').timestamp(), number=10000))

Results (total seconds for 10,000 calls each):

regex cut:
 0.040550945326685905
pandas to_datetime:
 0.8012433210387826
datetime with known format:
 0.09105705469846725
regex get number first:
 0.04557646345347166
dateutil parse:
 0.6404162347316742
Eric Zhang