0

I am trying to convert a raw csv file that has a value of 12/31/99 in a column in pandas. I want to convert it to 12/31/2099. But my code converts it to 12/31/1999 instead.

If possible, I don't want to resort to hacks (like doing string manipulation and add 20 before the year number)

I am using these formatting placeholders: %m-%d-%y

Is there a way to control how %y is converted? Below shows a simple program that illustrates my conversion pseudocode to check how conversion works on 00-99 year. Basically, I want to control how it behaves 70+.

for _ in range(60 , 100, 10):
    t = "03-01-{0}".format(str(_).zfill(2))
    x = datetime.strptime(t, '%m-%d-%y')
    print(t," ",x)

Example:

input :

03-01-60
03-01-70
03-01-80
03-01-90

output : (notice 1970, 1980, 1990)

03-01-60   2060-03-01 00:00:00
03-01-70   1970-03-01 00:00:00
03-01-80   1980-03-01 00:00:00
03-01-90   1990-03-01 00:00:00

expected output :

03-01-60   2060-03-01 00:00:00
03-01-70   2070-03-01 00:00:00
03-01-80   2080-03-01 00:00:00
03-01-90   2090-03-01 00:00:00
Earl
  • 420
  • 5
  • 16
  • " I want to convert it to 12/31/2099" Okay, how do you *know* that this is the result you want? You're adding information that isn't there. Which is exactly why dates of this sort are problematic. – Karl Knechtel Mar 23 '21 at 03:01
  • the date column is maturity date, normally when a loan matures. so as an assumption, the data that are coming in are new loans, and they mature 2022 onwards. – Earl Mar 23 '21 at 03:02
  • 4
    It's not a hack to adjust it yourself. The rule is that two digits years run from 1970 to 2069. If your data doesn't fit the rule, then you need to fix it. It is hard for me to imagine that ANYONE in the past 30 years has recorded dates using two digits years. – Tim Roberts Mar 23 '21 at 03:05
  • 1
    [Here is a well-received answer for implementing a similar check](https://stackoverflow.com/a/16600636/13386979). I'm not aware of any `datetime` parameter/setting for specifying the century, but this approach is simple enough. – Tom Mar 23 '21 at 03:09
  • got it. thanks. looks like there is no direct way. it is such that there are a lot of transformations already happening across all columns in our code. to add a special rule for a specified file is problematic when the code caters to lots of files. It is just that our python code would be the first entry point of raw data but we are task to make it type safe already - ie. convert it to yyyy-MM-dd HH:mm:ss format already. – Earl Mar 23 '21 at 03:10

0 Answers0