I have a dataframe with a date column stored as strings in the following format: 20180406T165358.
I'm trying to parse it with to_datetime(). The format argument should therefore be format='%Y%m%dT%H%M%S', but format='%Y-%m-%dT%H:%M:%S' also works. So my question is: what role do the '-' and ':' characters play in parsing the date?

Alex
`format='%Y-%m-%dT%H:%M:%S'` is actually wrong for your input; pandas is just lenient enough to ignore the mismatch. The '-' and ':' have no special meaning. – FObersteiner Feb 05 '21 at 09:36
1 Answer
It's an interesting find. According to the docs, pandas uses strptime() and strftime() behind the scenes, but its behaviour is strangely different from that of the datetime module.
While pandas happily accepts both formats, the actual datetime.datetime.strptime() fails on the second one.
import datetime as dt
import pandas as pd

z = '20180406T165358'
dt.datetime.strptime(z, '%Y%m%dT%H%M%S')
Out[39]: datetime.datetime(2018, 4, 6, 16, 53, 58)
dt.datetime.strptime(z, '%Y-%m-%dT%H:%M:%S')
Traceback (most recent call last):
  File "<ipython-input-40-c37caf368af3>", line 1, in <module>
    dt.datetime.strptime(z, '%Y-%m-%dT%H:%M:%S')
  File "C:\ProgramData\Anaconda3\lib\_strptime.py", line 565, in _strptime_datetime
    tt, fraction = _strptime(data_string, format)
  File "C:\ProgramData\Anaconda3\lib\_strptime.py", line 362, in _strptime
    (data_string, format))
ValueError: time data '20180406T165358' does not match format '%Y-%m-%dT%H:%M:%S'
pd.to_datetime([z], format='%Y%m%dT%H%M%S')
Out[41]: DatetimeIndex(['2018-04-06 16:53:58'], dtype='datetime64[ns]', freq=None)
pd.to_datetime([z], format='%Y-%m-%dT%H:%M:%S')
Out[42]: DatetimeIndex(['2018-04-06 16:53:58'], dtype='datetime64[ns]', freq=None)
So it appears that pandas simply ignores the '-' and ':' separators in the format string, at least in this case.
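To make the role of '-' and ':' concrete: in plain strptime() they are literal characters that must appear in the input exactly where the format puts them. Below is a minimal sketch using only the standard library. The helper `try_parse` and the variable names are hypothetical, introduced just for illustration, and pandas' own leniency is not shown here because it may differ between pandas versions.

```python
from datetime import datetime


def try_parse(text, fmt):
    """Return the parsed datetime, or None if text does not match fmt."""
    try:
        return datetime.strptime(text, fmt)
    except ValueError:
        return None


compact = '20180406T165358'
separated = '2018-04-06T16:53:58'

# Format and input agree on the literal characters, so both parse.
a = try_parse(compact, '%Y%m%dT%H%M%S')
b = try_parse(separated, '%Y-%m-%dT%H:%M:%S')

# The format expects '-' and ':' that the compact input lacks: no match.
c = try_parse(compact, '%Y-%m-%dT%H:%M:%S')

print(a, b, c)
```

If you want the result to be predictable regardless of how forgiving the parser is, pass the format that literally matches your data, i.e. '%Y%m%dT%H%M%S' for the strings in the question.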

NotAName
pandas just ignores the wrong format directive (`'%Y-%m-%dT%H:%M:%S'` != `'%Y%m%dT%H%M%S'`), which `strptime` doesn't. Seems like "convenient" vs. "deterministic"... By the way, for the format '%Y-%m-%dT%H:%M:%S', use [fromisoformat](https://docs.python.org/3/library/datetime.html#datetime.datetime.fromisoformat); it is [more efficient](https://stackoverflow.com/a/61710371/10197418) ;-) – FObersteiner Feb 05 '21 at 09:34
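As a rough illustration of the fromisoformat suggestion, here is a small sketch with a separator-style string (the variable names are made up for the example). It assumes Python 3.7 or later. As far as I know, the compact form from the question (20180406T165358) is only accepted by fromisoformat from Python 3.11 onwards, so it is left out here.

```python
from datetime import datetime

separated = '2018-04-06T16:53:58'

# fromisoformat needs no format string for ISO 8601 style input.
via_iso = datetime.fromisoformat(separated)

# The equivalent strptime call, with the separators spelled out as literals.
via_strptime = datetime.strptime(separated, '%Y-%m-%dT%H:%M:%S')

print(via_iso == via_strptime)
```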