0

I need to convert a manually inputted field to date. It looks like:

Jul, 4th
Aug, 29th

I honestly don't know how I can remove the ordinal piece (st, nd, th) and convert to date. I tried this approach:

import pandas as pd
import datetime

dt = pd.DataFrame(data=[['July 5th'],['Aug 29th'], ['Sep 2nd']], columns=['Date_temp'])

dt['month'] = dt['Date_temp'].str[0:3]
dt['day_temp'] = dt['Date_temp'].str[4:6]
dt['day'] =  dt['day_temp'].str[0]

dt['year'] = datetime.datetime.now().year

dt['Date'] = pd.to_datetime(dt[['month','day','year']])

I got this error:

 ValueError: Unable to parse string "Jul" at position 0
  • If you just want a datetime column. Pandas is pretty smart at handling the conversions... `dt['date'] = pd.to_datetime(dt['Date_temp'] + f' {pd.Timestamp.now().year}')` Extraction would be trivial after that `dt['month'] = dt['date'].dt.month` and `dt['day'] = dt['date'].dt.day` – Henry Ecker Aug 05 '22 at 03:14

1 Answers1

1
import pandas as pd
import datetime
import re

year = datetime.datetime.now().year
dates = pd.Series(['July 5th', 'Aug 29th', 'Sep 2nd'])
dates = dates.apply(lambda d: f"{year}-{d[:3]}-{re.sub('[^0-9]', '', d)}")
dates = pd.to_datetime(dates, format='%Y-%b-%d')

prints

0   2022-07-05
1   2022-08-29
2   2022-09-02
dtype: datetime64[ns]
Michael Hodel
  • 2,845
  • 1
  • 5
  • 10