85

Right now I have:

timestamp = datetime.strptime(date_string, '%Y-%m-%d %H:%M:%S.%f')

This works great unless I'm converting a string that doesn't have the microseconds. How can I specify that the microseconds are optional (and should be considered 0 if they aren't in the string)?

inye
Digant C Kasundra

9 Answers

76

You could use a try/except block:

try:
    timestamp = datetime.strptime(date_string, '%Y-%m-%d %H:%M:%S.%f')
except ValueError:
    timestamp = datetime.strptime(date_string, '%Y-%m-%d %H:%M:%S')
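The fallback above can be wrapped in a small helper so call sites stay clean. This is a minimal sketch; the name `parse_timestamp` is hypothetical and not part of the original answer.

```python
from datetime import datetime

def parse_timestamp(date_string):
    """Parse 'YYYY-MM-DD HH:MM:SS' with an optional fractional-seconds suffix."""
    try:
        # Try the more specific format first.
        return datetime.strptime(date_string, '%Y-%m-%d %H:%M:%S.%f')
    except ValueError:
        # No fractional part: the microsecond field defaults to 0.
        return datetime.strptime(date_string, '%Y-%m-%d %H:%M:%S')

with_fraction = parse_timestamp('2021-06-08 14:36:09.500000')
without_fraction = parse_timestamp('2021-06-08 14:36:09')
```

If the string matches neither format, the second `strptime` call still raises `ValueError`, so malformed input is not silently accepted.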
Alexander
  • 2
    It's a bit sad that this relies on [exceptions as control flow](https://www.reddit.com/r/Python/comments/ixkfqd/exceptions_as_control_flow/). – Joe Sadoski Jun 15 '22 at 13:50
  • 2
    @JoeSadoski I don't believe the solution above fits what is described in your linked article. – Alexander Jun 15 '22 at 15:33
  • 1
    Though it works, it is not the correct use of try/catch - there is no exceptional circumstance happening here. – developer Jan 25 '23 at 09:35
46

What about just appending it if it doesn't exist?

if '.' not in date_string:
    date_string = date_string + '.0'

timestamp = datetime.strptime(date_string, '%Y-%m-%d %H:%M:%S.%f')
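As a self-contained sketch of the same idea (the function name `parse_with_default_fraction` is hypothetical), relying on the fact that `%f` accepts one to six digits:

```python
from datetime import datetime

def parse_with_default_fraction(date_string):
    # Append a zero fraction when the string has none, so one format fits both.
    if '.' not in date_string:
        date_string += '.0'
    return datetime.strptime(date_string, '%Y-%m-%d %H:%M:%S.%f')

padded = parse_with_default_fraction('2021-06-08 14:36:09')
unchanged = parse_with_default_fraction('2021-06-08 14:36:09.25')
```

This assumes that a `.` can only appear as the fraction separator, which holds for this format but not for formats such as `%d.%m.%Y`.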
stevieb
  • 18
    This is a good answer, but I'm disappointed that a library designed to take the headache out of transforming date and time strings into date and time objects doesn't deal with these pretty simple use cases. The whole point of such a library is to remove the need to worry about this from the user. – Auspice May 02 '18 at 20:48
  • 2
    I greatly like this answer as opposed to using the try/catch – sniperd May 29 '19 at 20:09
10

I'm late to the party but I found if you don't care about the optional bits this will lop off the .%f for you.

date_string.split('.')[0]
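Combined with `strptime`, the truncation might look like the sketch below. The helper name `parse_without_fraction` is hypothetical, and any fractional part is discarded rather than parsed.

```python
from datetime import datetime

def parse_without_fraction(date_string):
    # Keep only the text before the first '.', discarding any fraction.
    whole_seconds = date_string.split('.')[0]
    return datetime.strptime(whole_seconds, '%Y-%m-%d %H:%M:%S')

truncated = parse_without_fraction('2021-06-08 14:36:09.999999')
plain = parse_without_fraction('2021-06-08 14:36:09')
```

Both calls return the same value, because the fraction is truncated, not rounded.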
FObersteiner
user14608345
9

I prefer using regex matches instead of try and except. This allows for many fallbacks of acceptable formats.

# full timestamp with milliseconds
match = re.match(r"\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}\.\d+Z", date_string)
if match:
    return datetime.strptime(date_string, "%Y-%m-%dT%H:%M:%S.%fZ")

# timestamp missing milliseconds
match = re.match(r"\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}Z", date_string)
if match:
    return datetime.strptime(date_string, "%Y-%m-%dT%H:%M:%SZ")

# timestamp missing milliseconds & seconds
match = re.match(r"\d{4}-\d{2}-\d{2}T\d{2}:\d{2}Z", date_string)
if match:
    return datetime.strptime(date_string, "%Y-%m-%dT%H:%MZ")

# unknown timestamp format
return None

Don't forget to import "re" as well as "datetime" for this method.
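The same fallback chain can be written as a table of pattern and format pairs inside a function. This is a sketch under a few assumptions: the names `_FORMATS` and `parse_iso_like` are hypothetical, the fraction is limited to the one to six digits that `%f` accepts, and `fullmatch` is used instead of `match` so trailing text is rejected.

```python
import re
from datetime import datetime

# (compiled pattern, strptime format) pairs, tried in order.
_FORMATS = [
    (re.compile(r"\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}\.\d{1,6}Z"), "%Y-%m-%dT%H:%M:%S.%fZ"),
    (re.compile(r"\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}Z"), "%Y-%m-%dT%H:%M:%SZ"),
    (re.compile(r"\d{4}-\d{2}-\d{2}T\d{2}:\d{2}Z"), "%Y-%m-%dT%H:%MZ"),
]

def parse_iso_like(date_string):
    for pattern, fmt in _FORMATS:
        # Only call strptime once the overall shape is known to fit.
        if pattern.fullmatch(date_string):
            return datetime.strptime(date_string, fmt)
    # Unknown timestamp format.
    return None
```

The patterns check only the shape of the string, so an out-of-range value such as month 13 still raises `ValueError` from `strptime`.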

iforapsy
  • I feel this answer deserves more votes - this is good programming logic - if you want to create a timestamp parser. – developer Jan 25 '23 at 09:38
2
datetime(*map(int, re.findall(r'\d+', date_string)))

can parse both '%Y-%m-%d %H:%M:%S.%f' and '%Y-%m-%d %H:%M:%S'. It is too permissive if your input is not filtered.

It is quick-and-dirty but sometimes strptime() is too slow. It can be used if you know that the input has the expected date format.
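The one-liner reads a short fraction such as `.5` as 5 microseconds rather than 500000. One way to keep the regex approach while handling that case is to capture the fraction separately and right-pad it. This is a sketch; the names `_TS` and `fast_parse` are hypothetical.

```python
import re
from datetime import datetime

_TS = re.compile(r'(\d{4})-(\d{2})-(\d{2}) (\d{2}):(\d{2}):(\d{2})(?:\.(\d{1,6}))?')

def fast_parse(date_string):
    match = _TS.fullmatch(date_string)
    if match is None:
        raise ValueError(f'unexpected timestamp format: {date_string!r}')
    *parts, fraction = match.groups()
    # '.5' means 500000 microseconds, so pad on the right to six digits.
    microsecond = int((fraction or '').ljust(6, '0'))
    return datetime(*map(int, parts), microsecond)
```

Whether this is still faster than `strptime()` has not been measured here, so it is worth timing on real input before relying on it.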

jfs
  • This gives incorrect results if, in `date_string`, trailing zeros are omitted from microsecond part. – jez Aug 22 '16 at 19:07
  • 1
    @jez: yes, that is why I said it is "too permissive". It works only if the input has the expected format (none or 6 digits for microseconds). 2- about your edit: look at the question: `datetime` is the class here, not the module. – jfs Aug 22 '16 at 19:14
1

If you are using Pandas, you can also split the Series by format, parse each part separately, and concatenate the results. The index is aligned automatically on assignment.

import pandas as pd

# Every other row has a different format
df = pd.DataFrame({"datetime_string": ["21-06-08 14:36:09", "21-06-08 14:36:09.50", "21-06-08 14:36:10", "21-06-08 14:36:10.50"]})
df["datetime"] = pd.concat([
    pd.to_datetime(df["datetime_string"].iloc[1::2], format="%y-%m-%d %H:%M:%S.%f"),
    pd.to_datetime(df["datetime_string"].iloc[::2], format="%y-%m-%d %H:%M:%S"),
])

   datetime_string       datetime
0  21-06-08 14:36:09     2021-06-08 14:36:09
1  21-06-08 14:36:09.50  2021-06-08 14:36:09.500000
2  21-06-08 14:36:10     2021-06-08 14:36:10
3  21-06-08 14:36:10.50  2021-06-08 14:36:10.500000
JulianWgs
0

For those wanting a more elegant catch-all solution, I created a small module so you don't have to. It accepts normal format strings, with brackets marking optional parts. A simple counter algorithm captures the bracketed "optionals", and itertools.combinations builds the list of possible variants of the input format string.

Generating all the combinations is not efficient, so I would build and store the format list once before running any intensive logic, but it should be a nice catch-all solution for anyone to use.

Some things to note: if you don't balance the brackets properly, you'll get an ugly IndexError. Feel free to raise your own exception for missing closing brackets if you feel the need. I also don't know that I adequately handled all nested bracket cases, though I did test extensively, so I'm fairly sure it covers the common ones. If you want to use something other than [] brackets, the opening_char and closing_char variables are easy to change.

from datetime import datetime
from itertools import combinations

opening_char = '['
closing_char = ']'

def parse_datetime(date_string, date_formats):
    for format_string in date_formats:
        try:
            parsed_date = datetime.strptime(date_string, format_string)
            return parsed_date
        except ValueError:
            continue
    
    print(f"Unable to parse date with any given format for string: {date_string}")
    return None

def _extract_optional_components(format_string):
    # Return each bracketed span without its outer brackets. Nested brackets
    # are left in place so _generate_date_formats can match the spans verbatim.
    # (The original str.replace calls discarded their results and had no effect.)
    if opening_char in format_string:
        return _get_bracketed_strings(format_string)
    return []
                
def _get_bracketed_strings(input_string):
    sub_strings = []
    for i, char in enumerate(input_string):
        if char == opening_char:
            openpos = i
            closepos = openpos
            counter = 1
            while counter > 0:
                closepos += 1
                c = input_string[closepos]
                if c == opening_char:
                    counter += 1
                elif c == closing_char:
                    counter -= 1
            sub_strings.append(input_string[openpos + 1:closepos])
    return sub_strings

def _generate_date_formats(format_string):
    optional_components = _extract_optional_components(format_string)
    num_optionals = len(optional_components)
    
    all_combinations = []
    for r in range(num_optionals + 1):
        for combination in combinations(range(num_optionals), r):
            all_combinations.append(combination)

    output_formats = []
    for combination in all_combinations:
        new_format = format_string
        for i in range(num_optionals):
            # Use the configured bracket characters rather than a hardcoded '[' and ']'.
            bracketed = f'{opening_char}{optional_components[i]}{closing_char}'
            if i in combination:
                new_format = new_format.replace(bracketed, optional_components[i])
            else:
                new_format = new_format.replace(bracketed, '')

        output_formats.append(new_format)

    return output_formats


if __name__ == "__main__":
    # Example usage
    format_string = "%Y-%m-%d[T%H:%M:%S[.%f]][Z]"
    optional_format_list = _generate_date_formats(format_string)

    date_string1 = "2023-06-16T03:09:23.155Z"
    date_string2 = "2023-06-16T02:53:18Z"
    date_string3 = "2023-06-16"

    datetime_obj1 = parse_datetime(date_string1, optional_format_list)
    datetime_obj2 = parse_datetime(date_string2, optional_format_list)
    datetime_obj3 = parse_datetime(date_string3, optional_format_list)

    # The 'Z' is matched as a literal character, so the results are naive datetimes.
    print(datetime_obj1)  # 2023-06-16 03:09:23.155000
    print(datetime_obj2)  # 2023-06-16 02:53:18
    print(datetime_obj3)  # 2023-06-16 00:00:00
-2

Using one regular expression and a list comprehension:

time_str = "12:34.567"
# time format is [HH:]MM:SS[.FFF]
sum([a*b for a,b in zip(map(lambda x: int(x) if x else 0, re.match(r"(?:(\d{2}):)?(\d{2}):(\d{2})(?:\.(\d{3}))?", time_str).groups()), [3600, 60, 1, 1/1000])])
# result = 754.567
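The same conversion may be easier to follow when the groups are unpacked by name. This is a sketch using the same regular expression; the names `_TIME` and `time_to_seconds` are hypothetical.

```python
import re

# Time format is [HH:]MM:SS[.FFF]
_TIME = re.compile(r'(?:(\d{2}):)?(\d{2}):(\d{2})(?:\.(\d{3}))?')

def time_to_seconds(time_str):
    match = _TIME.fullmatch(time_str)
    if match is None:
        raise ValueError(f'unexpected time format: {time_str!r}')
    # Missing optional groups are None and count as 0.
    hours, minutes, seconds, millis = (int(g) if g else 0 for g in match.groups())
    return hours * 3600 + minutes * 60 + seconds + millis / 1000
```

Unlike the one-liner, this raises a clear `ValueError` on input that does not match, instead of failing on `None.groups()`.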
milahu
  • 1
    I guess it's neither readable, nor useful... You are reinventing the wheel here fully instead of using `strptime()` and solving the problem of optional microseconds.. – hnwoh Jun 02 '23 at 13:45
-3

For a similar problem in jq, I used the following filter to sort my list by time properly:

|split("Z")[0]|split(".")[0]|strptime("%Y-%m-%dT%H:%M:%S")|mktime
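For comparison, a rough Python equivalent of that jq filter could look like the sketch below. It assumes jq's `mktime` treats the broken-down time as UTC, and the function name `to_epoch_seconds` is hypothetical.

```python
import calendar
import time

def to_epoch_seconds(date_string):
    # Drop a trailing 'Z' and any fractional seconds, as the jq filter does.
    whole_seconds = date_string.split('Z')[0].split('.')[0]
    parsed = time.strptime(whole_seconds, '%Y-%m-%dT%H:%M:%S')
    # calendar.timegm interprets the struct_time as UTC.
    return calendar.timegm(parsed)

stamps = ['1970-01-02T00:00:00.500Z', '1970-01-01T00:00:10Z']
ordered = sorted(stamps, key=to_epoch_seconds)
```

Because the fraction is discarded before parsing, two timestamps within the same second compare as equal when sorting.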