0

I have a string which is similar to this

02032021..........SW01SW05..........

I have already extracted everything I need from it except for the date which is the first block of digits. the dates are between Jan 21 and YTD.

I used this code to get only the digits df.loc[x, "Date Worked"] = df["Worked In Cost Number"].str.slice(0,10).astype(str) x is subset which contains strings with the same format.

and then I tried to convert it to date df["Date Worked"] = pd.to_datetime(df["Date Worked"].str.slice(0,2) + df["Date Worked"].str.slice(2,4)+ df["Date Worked"].str.slice(4,8), format = '%m%d%Y')

but when i do that i get this error

OutOfBoundsDatetime: Out of bounds nanosecond timestamp: 3202-03-12 00:00:00

I the issue is that when I slice it, it turns into a float and dismisses the leading zero.

eltoukhy
  • 3
  • 2
  • `df["Date Worked"] = pd.to_datetime(df["Worked In Cost Number"].str[:8], format='%m%d%Y')` – BeRT2me Aug 02 '22 at 17:33
  • Maybe your column contains string like this: "03123202..........SW01SW05.........." and that is the reason why you can't convert it to date? Here is the similar problem: https://stackoverflow.com/questions/32888124/pandas-out-of-bounds-nanosecond-timestamp-after-offset-rollforward-plus-adding-a – MaryRa Aug 02 '22 at 17:49

1 Answers1

0

Here's an example of how you can handle those digits, with a mock "date" string. You'll have to separate the date from the string you provided to make this work, probably with slicing as you have already done.

Code:

from datetime import datetime
example_date = "02032021"

print(datetime.strptime(example_date, "%m%d%Y"))

Output:

2021-02-03 00:00:00
Ryan
  • 1,081
  • 6
  • 14