I want to convert the date column of a DataFrame, which contains several different date formats, to Python datetime. pd.to_datetime(df['date'], infer_datetime_format=True) stops working for far-future dates such as the year 3031.

Please find the example below:

import pandas as pd

#Create the pandas DataFrame
data = [['A', '2021-08-08'], ['B', '2021/08/08'], ['C', '3031-08-08']]
df = pd.DataFrame(data, columns = ['Name', 'Date'])
Name Date
A 2021-08-08
B 2021/08/08
C 3031-08-08
  1. pd.to_datetime(df['Date']) # raises an error

OutOfBoundsDatetime: Out of bounds nanosecond timestamp: 3031-08-08 00:00:00

  2. pd.to_datetime(df['Date'], errors = 'coerce') # converts the out-of-range year to NaT
Name Date
A 2021-08-08
B 2021-08-08
C NaT

Any solutions?

1 Answer

Unfortunately, a nanosecond-resolution pandas Timestamp cannot represent dates past 2262-04-11 23:47:16.854775807, which is the value of pandas.Timestamp.max.
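To see where that limit sits relative to the dates in the question, here is a minimal sketch. The helper name `midnight_fits_ns_range` is made up for illustration and is not part of pandas; it only compares calendar days against `pd.Timestamp.min` and `pd.Timestamp.max`, so it applies to date-only values like the ones in the example.

```python
import datetime

import pandas as pd

# Calendar days on which the nanosecond Timestamp range starts and ends.
MAX_DAY = pd.Timestamp.max.date()
MIN_DAY = pd.Timestamp.min.date()


def midnight_fits_ns_range(day: datetime.date) -> bool:
    """Hypothetical helper: True if midnight on `day` is inside the range."""
    # Timestamp.min falls shortly after midnight on MIN_DAY, so midnight of
    # that day is already out of range; hence the strict comparison.
    return MIN_DAY < day <= MAX_DAY


print(midnight_fits_ns_range(datetime.date(2021, 8, 8)))  # True
print(midnight_fits_ns_range(datetime.date(3031, 8, 8)))  # False
```

A check like this could be used to decide which rows are safe to pass to pd.to_datetime and which need another representation.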

ilmarinen
  • Thanks. Is there an alternative way to achieve this? I could have used a mapping if there were only one date format, but that is not the case. – richa bharwal Aug 19 '21 at 11:46
  • I think your best bet is to store a different type of object in the column: instead of pandas timestamps, use standard library datetimes, as shown in this answer from the suggested duplicate: https://stackoverflow.com/a/63671618/1024114 `df['F'].apply(lambda x: datetime.datetime.strptime(x, '%m/%d/%Y %I:%M:%S') if isinstance(x, str) else np.nan)` – ilmarinen Aug 19 '21 at 13:38
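Adapting that suggestion to the mixed formats in the question, a rough sketch might look like the following. The `parse_date` helper and the `KNOWN_FORMATS` tuple are assumptions made for illustration, covering only the two formats that appear in the sample data; the result is an object-dtype column of standard library datetimes, not a datetime64 column.

```python
import datetime

import pandas as pd

# Assumed list of formats, taken from the question's sample data; extend as needed.
KNOWN_FORMATS = ("%Y-%m-%d", "%Y/%m/%d")


def parse_date(value):
    """Hypothetical helper: try each known format, return None if none match."""
    if not isinstance(value, str):
        return None
    for fmt in KNOWN_FORMATS:
        try:
            return datetime.datetime.strptime(value, fmt)
        except ValueError:
            continue  # try the next format
    return None


data = [["A", "2021-08-08"], ["B", "2021/08/08"], ["C", "3031-08-08"]]
df = pd.DataFrame(data, columns=["Name", "Date"])

# dtype=object keeps plain datetime.datetime values, so pandas does not try
# to convert them to its own bounded timestamp type.
df["Parsed"] = pd.Series(
    [parse_date(v) for v in df["Date"]], index=df.index, dtype=object
)
print(df)
```

The trade-off is that an object column does not support the vectorized `.dt` accessor, so date operations have to go through `apply` or plain Python.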