I read in a CSV file containing dates. Some dates may be formatted incorrectly and I want to find those. With the following approach I would expect the 2nd row to be NaT. But pandas seems to ignore the specified format, regardless of whether I set infer_datetime_format or exact.
import pandas as pd
from io import StringIO
DATA = StringIO("""date
2019 10 07
2018 10
""")
df = pd.read_csv(DATA)
df['date'] = pd.to_datetime(df['date'], format="%Y %m %d", errors='coerce', exact=True)
results in
date
0 2019-10-07
1 2018-10-01
The pandas.to_datetime documentation refers to strftime() and strptime() Behavior, but when I test the same format with plain Python, it is enforced as I expect:
import datetime
datetime.datetime.strptime(' 2018 10', '%Y %m %d')
I get the expected ValueError:
ValueError: time data ' 2018 10' does not match format '%Y %m %d'
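For reference, here is a minimal sketch of the behavior I am after, using only plain Python's strptime on the same sample data. The helper name is_strict_match is my own and is not part of pandas; it assumes the values can be checked one string at a time.

```python
import datetime
from io import StringIO

# Same sample data as above; the second row lacks the day component.
DATA = StringIO("""date
2019 10 07
2018 10
""")

def is_strict_match(text, fmt="%Y %m %d"):
    """Hypothetical helper: True if text fully matches fmt under strptime."""
    try:
        datetime.datetime.strptime(text, fmt)
        return True
    except ValueError:
        return False

rows = DATA.read().splitlines()[1:]  # skip the header line
# Collect (row index, raw text) for every value that fails strict parsing.
bad_rows = [(i, row) for i, row in enumerate(rows) if not is_strict_match(row)]
print(bad_rows)  # [(1, '2018 10')]
```

This flags the second row, which is what I expected errors='coerce' to do by turning it into NaT.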
What am I missing?
FYI: The question "pandas to_datetime not working" seems to be related, but it is different and seems to be fixed by now. It works with my pandas version, 0.25.2.