You didn't provide enough detail in your question, but if your date
column is already a date value rather than a string value, I'm guessing the problem arose during the initial load of the dates.
If you loaded your data via pandas pd.read_csv()
then there are a lot of options for loading dates, including options that will try to detect the date format automatically. Several (but not all) rows in your sample data would confuse this automatic detection, because it can't tell which part is the month and which is the day.
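To illustrate the ambiguity, here is a minimal sketch. The sample string and the inline CSV text are made up for illustration and are not taken from your data. It shows how the same string parses differently depending on dayfirst, and how parse_dates works when the dates on disk are unambiguous.

```python
import io

import pandas as pd

# Hypothetical ambiguous value: is this March 4 or April 3?
ambiguous = '03/04/2021'
month_first = pd.to_datetime(ambiguous)                # default: month comes first
day_first = pd.to_datetime(ambiguous, dayfirst=True)   # tell pandas the day comes first

# Hypothetical file contents with unambiguous ISO dates
csv_text = "id,date\n1,2020-12-25\n2,2021-04-03\n"
df = pd.read_csv(io.StringIO(csv_text), parse_dates=['date'])
```

If you know the layout in advance, passing an explicit format (or dayfirst) is safer than relying on detection.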
If the date
column is already a date value, then
df['date'] = pd.to_datetime(df['date'].astype(str), format='%Y-%m-%d')
will not do anything useful.
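A quick way to check which case you are in is to look at the column's dtype. This is only a sketch: the frame below is invented, and the 'raw' column is a hypothetical stand-in for dates that were loaded as plain strings.

```python
import pandas as pd

# Hypothetical frame: 'date' is already datetime, 'raw' is still text
df = pd.DataFrame({
    'date': pd.to_datetime(['2020-01-15', '2021-06-30']),
    'raw': ['2020-01-15', '2021-06-30'],
})

# True only for columns that already hold datetime values
date_is_datetime = pd.api.types.is_datetime64_any_dtype(df['date'])
raw_is_datetime = pd.api.types.is_datetime64_any_dtype(df['raw'])
```

If the check returns True for your column, the values were already converted at load time, so that is where to look for the problem.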
If you did use pd.read_csv()
and the dates are YYYY-MM-DD on disk, try using this instead:
import pandas as pd
import numpy as np
from datetime import datetime

# Map blanks, 'NULL', and the 9999-12-31 sentinel to NaT; otherwise parse the first 10 characters
myDateLoader = lambda d: np.datetime64('NaT') if d == '' or d == 'NULL' or d.startswith('9999-12-31') else np.datetime64(datetime.strptime(d[:10], '%Y-%m-%d'))
df = pd.read_csv('file.csv', converters={'date': myDateLoader})
If they are not YYYY-MM-DD on disk, then adjust the format string above (and the d[:10] slice, if the date part is not 10 characters long) as needed.
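For example, assuming the dates were stored as DD/MM/YYYY, a converter could look like the sketch below. The load_dmy name, the inline CSV text, and the choice of missing-value markers are all hypothetical; swap in whatever matches your file.

```python
import io
from datetime import datetime

import pandas as pd

# Hypothetical file contents; 'NULL' and empty fields stand for missing dates
csv_text = "id,date\n1,25/12/2020\n2,NULL\n3,\n4,03/04/2021\n"

def load_dmy(d):
    # Converters receive the raw field text, so handle missing markers first
    if d in ('', 'NULL'):
        return pd.NaT
    return pd.Timestamp(datetime.strptime(d[:10], '%d/%m/%Y'))

df = pd.read_csv(io.StringIO(csv_text), converters={'date': load_dmy})
# Make sure the column ends up with a datetime dtype rather than generic objects
df['date'] = pd.to_datetime(df['date'])
```

A named function is used here instead of a lambda only to leave room for the comment; the behavior is the same idea as the one-liner.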