
I have a date column in YYYY-MM-DD format and want to convert it to a consecutive integer day count, where 1 = Jan 1, 2000. So the date 2000-01-31 converts to 31, and the date 2020-01-31 converts to 7336 (365*20 + 5 leap days + 31), etc.

Is this possible to do in pandas?

I looked at "Pandas: convert date 'object' to int", but that solution converts the date to an 8-digit int.

frank

1 Answer


First subtract a Timestamp from the column, then convert the resulting timedeltas to days with Series.dt.days, and finally add 1:

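import pandas as pd

df = pd.DataFrame({"Date": ["2000-01-29", "2000-01-01", "2014-03-31"]})
d = '2000-01-01'  # origin date, which maps to 1
# Timedelta since the origin -> whole days -> shift so the origin is day 1
df["new"] = pd.to_datetime(df["Date"]).sub(pd.Timestamp(d)).dt.days + 1
print(df)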
         Date   new
0  2000-01-29    29
1  2000-01-01     1
2  2014-03-31  5204
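
The same steps can be wrapped in a small helper and checked against the dates from the question. This is only a sketch: the function name days_since and its origin parameter are made up for illustration, and it assumes the column has no missing or unparseable dates.

```python
import pandas as pd

def days_since(dates, origin="2000-01-01"):
    """Return 1-based day numbers relative to `origin` (hypothetical helper)."""
    # Parse the strings to datetimes and subtract the origin to get timedeltas.
    delta = pd.to_datetime(dates) - pd.Timestamp(origin)
    # Take whole days as integers and shift so the origin itself maps to 1.
    return delta.dt.days + 1

df = pd.DataFrame({"Date": ["2000-01-01", "2000-01-31", "2020-01-31"]})
df["new"] = days_since(df["Date"])
print(df["new"].tolist())  # [1, 31, 7336]
```

The last value matches the arithmetic in the question: 365*20 + 5 leap days + 31 = 7336.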
jezrael
Love it. I had thought of subtracting the date 2000-01-01, but the result is in days (a timedelta), and I did not know how to convert that to an int. Your suggestion worked perfectly! – frank Aug 28 '19 at 10:03