I have a .txt file that contains the string "2020-08-13T20:41:15.4227628Z". What format code should I use with the strptime function in Python 3.7? I tried the following, but the '8' at the end, just before the 'Z', is not a valid weekday:

from datetime import datetime

timestamp_str = "2020-08-13T20:41:15.4227628Z"
timestamp = datetime.strptime(timestamp_str, '%Y-%m-%dT%H:%M:%S.%f%uZ')

ValueError: time data '2020-08-13T20:41:15.4227628Z' does not match format '%Y-%m-%dT%H:%M:%S.%f%uZ'

Francesco
  • `%f` represents the number of *microseconds*, but you have 7 digits. `%u` is documented as a day-of-week specifier; it's not clear why you are using it to capture the 7th digit. What you need is some sort of *nanosecond* specifier, but I don't believe one exists. – chepner Sep 15 '20 at 19:19
  • This is actually from the metadata of a proprietary database file format written by Zeiss. It is saved by the microscope software in our lab, which I guess is written in C++, so I thought it was some standard code. What I do now is strip the unconverted data that remains, but that "8Z" must encode some kind of information, no? – Francesco Sep 15 '20 at 20:11
  • `Z` is a (nonstandard?) time zone indicator, equivalent to `GMT`. The 8 is part of the seconds value, not the time zone. – chepner Sep 15 '20 at 20:18
  • Ok cool. I actually have read other people asking about 7 digits after the seconds, it would be nice to be able to set the number of digits after the seconds in strptime format codes. Something like %.7f – Francesco Sep 15 '20 at 20:21
  • related: https://stackoverflow.com/a/63447899/10197418 – FObersteiner Sep 16 '20 at 05:53

2 Answers

The 7 digits following the . are fractional seconds at 100-nanosecond resolution, one digit more than %f (microseconds, at most 6 digits) accepts. You may have a platform-specific format (defined by strftime(3)) available to use in place of %f, but if not, your best bet is to drop the trailing digit before attempting to parse the remaining string as a timestamp.

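import re

# Keep six fractional digits (microseconds), drop any extra digits, keep the "Z".
regex = r"(\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}\.\d{6})\d*(Z)"
m = re.match(regex, timestamp_str)  # plain assignment; := needs Python 3.8+
if m is not None:
    timestamp_str = "".join(m.groups())

timestamp = datetime.strptime(timestamp_str, '%Y-%m-%dT%H:%M:%S.%fZ')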
chepner
  • Thanks! For now I have a function that catches the exception "unconverted data remains: " and strips whatever was left unconverted before retrying the conversion. However, as I said above, that string comes from a file saved by the Zeiss microscope software in our lab, which I guess is written in C++, so I thought it was some standard code. My feeling is that "8Z" encodes some kind of information, maybe a timezone? I might even ask the Zeiss developers directly... – Francesco Sep 15 '20 at 20:14
  • `datetime.fromisoformat(re.sub('[0-9]Z', '+00:00', timestamp_str))` should also do fine. – FObersteiner Sep 16 '20 at 05:56

Your timestamp's format mostly conforms to ISO 8601, except for the 7-digit fractional seconds.

  • The 7th digit would be 1/10th of a microsecond; normally you'd have 3, 6, or 9 digits of resolution (milli-, micro-, or nanoseconds, respectively).
  • The Z denotes UTC.
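To make the digit arithmetic concrete, here is a small sketch that splits the fraction from the question into the part `%f` can hold and the leftover seventh digit. The variable names are made up for illustration.

```python
# Fractional digits from "2020-08-13T20:41:15.4227628Z" (between "." and "Z")
fraction_digits = "4227628"

# %f holds at most six digits, i.e. whole microseconds
microseconds = int(fraction_digits[:6])

# The seventh digit counts tenths of a microsecond (units of 100 ns)
extra_ticks = int(fraction_digits[6:])

# Same fraction expressed in nanoseconds
total_ns = microseconds * 1000 + extra_ticks * 100

print(microseconds, extra_ticks, total_ns)
```

Dropping the seventh digit therefore loses less than one microsecond, which a Python `datetime` cannot store anyway.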

In Python, you can parse this format conveniently as I show here.
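As a rough sketch of one way to do this (not necessarily the approach in the linked answer), the extra digits can be trimmed and the `Z` swapped for an explicit offset before calling `datetime.fromisoformat`. The helper name `parse_utc_timestamp` is hypothetical, and the sketch assumes the input always has at least six fractional digits and ends in `Z`.

```python
import re
from datetime import datetime

def parse_utc_timestamp(text):
    """Hypothetical helper: parse '...SS.ffffff[f...]Z' into an aware UTC datetime."""
    # Keep the first six fractional digits, drop the rest,
    # and replace the trailing "Z" with the equivalent "+00:00" offset.
    trimmed = re.sub(r"(\.\d{6})\d*Z$", r"\1+00:00", text)
    return datetime.fromisoformat(trimmed)

ts = parse_utc_timestamp("2020-08-13T20:41:15.4227628Z")
print(ts.isoformat())
```

Unlike the `strptime` format with a literal `Z`, this yields a timezone-aware datetime, so the UTC information is not lost.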

FObersteiner
  • Thanks, I had actually seen that discussion, but couldn't apply it to my case. Now I've got it, thanks! I wonder why Zeiss decided on 7-digit precision... – Francesco Sep 16 '20 at 06:55