In the format I am given, the date 2014-01-02 would be represented by "20140102". This is correctly parsed with the standard strptime:

>>> datetime.datetime.strptime("20140102", "%Y%m%d")
datetime.datetime(2014, 1, 2, 0, 0)

In this format, "201412" would not be a valid date. The docs describe the "%m" directive as "Month as a zero-padded decimal number" and give "01, 02, ..., 12" as examples. The day directive "%d" is also documented as zero-padded.

Based on this, I expected "201412" to be invalid input for this format and to raise a ValueError. Instead, it is interpreted as 2014-01-02:

>>> datetime.datetime.strptime("201412", "%Y%m%d")
datetime.datetime(2014, 1, 2, 0, 0)

The question is: is there a way to specify "no seriously zero-padded only"? Or am I misunderstanding the term "zero-padded" in this context?

Note that the question is not about how to parse dates in this format, but about understanding the behavior of strptime.

user2957943
  • This doesn't really help you, but it is somewhat related (think `/` separated fields) [Parsing non-zero padded timestamp in python](http://stackoverflow.com/questions/25279993/parsing-non-zero-padded-timestamps-in-python) – metatoaster Sep 16 '16 at 05:04
  • I think it depends on the regular expression the function uses – kiviak Sep 16 '16 at 05:29
  • Looks like some additional explanation in the Python docs would be nice. Is it guaranteed, that `strptime("2014123", "%Y%m%d")` will always give `datetime(2014, 12, 3, 0, 0)` or could it be `datetime(2014, 1, 23, 0, 0)`? – Matthias Sep 16 '16 at 06:35
  • Note that the documentation does **not** say that `%m` values must be zero-padded, it just says that it represents "month as a decimal number" and provides an example to show that it *can* be zero-padded. – Bakuriu Sep 16 '16 at 08:24

3 Answers

Look at how the regex for %m is defined in CPython's Lib/_strptime.py: https://github.com/python/cpython/blob/2d264235f6e066611b412f7c2e1603866e0f7f1b/Lib/_strptime.py#L204

'm': r"(?P<m>1[0-2]|0[1-9]|[1-9])"

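The alternation accepts 10-12, 01-09, or an unpadded 1-9 as a month, so the leading zero is optional. When the two-digit match leaves nothing for the following directive, the regex engine backtracks to the one-digit alternative.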
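To see the backtracking in isolation, here is a minimal sketch using only the `re` module. The `%m` pattern is the one quoted above; the year and day patterns are simplified stand-ins that I am assuming for illustration, not copies of what `_strptime.py` uses, and `split_fields` is a hypothetical helper name.

```python
import re

# %m pattern as quoted above from _strptime.py.
MONTH = r"(?P<m>1[0-2]|0[1-9]|[1-9])"
# Assumed stand-ins for %Y and %d, written in the same style as %m.
YEAR = r"(?P<Y>\d{4})"
DAY = r"(?P<d>3[01]|[12]\d|0[1-9]|[1-9])"

pattern = re.compile(YEAR + MONTH + DAY)


def split_fields(text):
    """Return the (year, month, day) substrings, or None if there is no match."""
    match = pattern.fullmatch(text)
    if match is None:
        return None
    return match.group("Y"), match.group("m"), match.group("d")


print(split_fields("20140102"))  # ('2014', '01', '02')
print(split_fields("201412"))    # ('2014', '1', '2')
print(split_fields("2014123"))   # ('2014', '12', '3')
```

For "201412" the month first grabs "12", the day then has nothing left to match, and the engine falls back to the single-digit month "1" with day "2". For "2014123" the two-digit month works on the first try, so the result is December 3 rather than January 23.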

postelrich

The related issue on the Python tracker uses the following example (slightly different from the one in this question, but the concept is exactly the same):

>>> datetime.datetime.strptime('20141110', '%Y%m%d').isoformat()
'2014-11-10T00:00:00'
>>> datetime.datetime.strptime('20141110', '%Y%m%d%H%M').isoformat()
'2014-01-01T01:00:00'

This behavior was determined not to be a bug. As a comment on the issue explains, strptime conforms to the Open Group strptime standard, which specifies that "leading zeros are permitted but not required."

I guess the workaround is to validate the string with a regex, or to check that it is exactly 8 characters long, before passing it to strptime.
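A minimal sketch of that workaround, assuming the input is expected to be exactly `YYYYMMDD`. The helper name `parse_strict_yyyymmdd` is hypothetical, not part of the standard library.

```python
import datetime
import re

# Exactly eight ASCII digits: four for the year, two each for month and day.
_EIGHT_DIGITS = re.compile(r"[0-9]{8}")


def parse_strict_yyyymmdd(text):
    """Parse 'YYYYMMDD'; raise ValueError unless the input is exactly 8 digits."""
    if _EIGHT_DIGITS.fullmatch(text) is None:
        raise ValueError("expected exactly 8 digits (YYYYMMDD), got %r" % (text,))
    # With exactly eight digits, %m and %d must each consume two characters,
    # so strptime cannot fall back to the unpadded alternatives.
    return datetime.datetime.strptime(text, "%Y%m%d")


print(parse_strict_yyyymmdd("20140102"))  # 2014-01-02 00:00:00

try:
    parse_strict_yyyymmdd("201412")
except ValueError as exc:
    print("rejected:", exc)
```

The length check does the real work here: strptime still validates the month and day ranges, so something like "20141301" is rejected by strptime itself.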

metatoaster

This is pretty tricky, but it looks like strptime accepts any split of the string that satisfies the format. Python's strptime follows the behavior of C's strptime, and the docs for the latter say that padding is optional:

"%m is the month number [1,12]; leading zeros are permitted but not required."

http://pubs.opengroup.org/onlinepubs/7908799/xsh/strptime.html

denvaar