I have a date column that pandas read_csv loads with the object dtype. I'm trying to convert it to a pandas datetime, but pd.to_datetime() returns the wrong date.

For example, my dates are in the format 01-06-68, where 01 is the day, 06 is the month and 68 is the year. Applying pd.to_datetime() to this string returns 2068-06-01, but it should return 1968-06-01.

I have tried pd.to_datetime(), Python's datetime and the pendulum library, but I still get the wrong result. How can I solve this problem?

Okroshiashvili
  • 1
    Possible duplicate of [Python strptime parsing year without century: assume prior to this year?](https://stackoverflow.com/questions/33407127/python-strptime-parsing-year-without-century-assume-prior-to-this-year) – LocoGris Apr 15 '19 at 07:21
  • 2
    So it seems that anything with the %y year below 69 will be attributed a century of 2000, and 69 upwards get 1900 – Frenchy Apr 15 '19 at 07:26
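
The pivot described in the comment above can be checked with the standard library alone. The following is a minimal sketch, not a definitive fix: `parse_two_digit_year` and its `latest` argument are hypothetical names introduced here, and the rule that no date may fall after the cutoff is an assumption about the data, not something the question states.

```python
from datetime import datetime

# %y follows the POSIX convention: 00-68 -> 2000-2068, 69-99 -> 1969-1999.
print(datetime.strptime("01-06-68", "%d-%m-%y").date())  # 2068-06-01
print(datetime.strptime("01-06-69", "%d-%m-%y").date())  # 1969-06-01


def parse_two_digit_year(text, latest, fmt="%d-%m-%y"):
    """Hypothetical helper: parse `text`, then move the result back one
    century if it falls after `latest` (assumed upper bound for the data)."""
    parsed = datetime.strptime(text, fmt)
    if parsed > latest:
        # Note: replace() would raise for 29 Feb if the target year is not
        # a leap year (e.g. 2000 -> 1900); that case is not handled here.
        parsed = parsed.replace(year=parsed.year - 100)
    return parsed


cutoff = datetime(2019, 4, 15)  # assumed: no record is newer than this
print(parse_two_digit_year("01-06-68", cutoff).date())  # 1968-06-01
print(parse_two_digit_year("01-06-18", cutoff).date())  # 2018-06-01
```

Using a cutoff date instead of a hardcoded century keeps genuinely recent dates (such as 18 for 2018) in the 2000s.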

2 Answers

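pd.to_datetime has a format= parameter.
You can try df['my_col'] = pd.to_datetime(df['my_col'], format='%d-%m-%y'). Note the lowercase %y, which matches a two-digit year; %Y expects four digits and will not match 68. On its own this still parses 68 as 2068, so the century has to be corrected afterwards.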

For another way to do it, see: Convert Pandas Column to DateTime
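
One way to combine an explicit format with a century correction is sketched below. It is an illustration under stated assumptions, not the answer author's method: the sample values and the `cutoff` date are made up here, and the sketch assumes that no date in the column is later than the cutoff.

```python
import pandas as pd

# Illustrative data; 'my_col' is the column name used in the answer above.
df = pd.DataFrame({"my_col": ["01-06-70", "01-06-69", "01-06-68", "01-06-18"]})

# %y (lowercase) matches a two-digit year; 68 is read as 2068 at this point.
parsed = pd.to_datetime(df["my_col"], format="%d-%m-%y")

# Assumption: nothing in the column is later than this date.
cutoff = pd.Timestamp("2019-04-15")

# Keep dates up to the cutoff; move the rest back by 100 years.
df["my_col"] = parsed.where(parsed <= cutoff, parsed - pd.DateOffset(years=100))
print(df)
```

With this cutoff, 68 becomes 1968 while 18 stays 2018.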

ggrelet

If every year belongs to the 1900s, insert the century and parse day-first:

df['date'] = pd.to_datetime(df['date'].str[:-2] + '19' + df['date'].str[-2:], format='%d-%m-%Y')

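Another solution with replace (regex=True is needed on pandas 2.0 and later, where str.replace treats the pattern as a literal string by default):

df['date'] = pd.to_datetime(df['date'].str.replace(r'-(\d+)$', r'-19\1', regex=True), format='%d-%m-%Y')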

Sample:

print (df)
       date
0  01-06-70
1  01-06-69
2  01-06-68
3  01-06-67

df['date'] = pd.to_datetime(df['date'].str.replace(r'-(\d+)$', r'-19\1', regex=True), format='%d-%m-%Y')
print (df)
        date
0 1970-06-01
1 1969-06-01
2 1968-06-01
3 1967-06-01
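
If the column mixes both centuries, the hardcoded '19' can be replaced by a pivot. The following is a rough sketch of that variation, not part of the original answer: the `pivot` value and the sample data are assumptions chosen for illustration.

```python
import pandas as pd

# Illustrative data with years from both centuries.
df = pd.DataFrame({"date": ["01-06-70", "01-06-68", "01-06-18", "01-06-05"]})

# Assumption: two-digit years above this pivot belong to the 1900s,
# the rest to the 2000s. Pick the pivot to suit the data.
pivot = 19

yy = df["date"].str[-2:].astype(int)
century = yy.gt(pivot).map({True: "19", False: "20"})

# Insert the century before the two-digit year, then parse day-first.
full = df["date"].str[:-2] + century + df["date"].str[-2:]
df["date"] = pd.to_datetime(full, format="%d-%m-%Y")
print(df)
```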
jezrael