1

I have a dataframe with multiple columns and want to convert one of those columns which is a date of floats (excel date format - DDDDD.ttt) to datetime.

At the moment the value of the columns are like this:

42411.0
42754.0

So I want to convert them to:

2016-02-11
2017-01-19
cs95
  • 379,657
  • 97
  • 704
  • 746
Spedo
  • 355
  • 3
  • 13
  • 5
    What does "42411.0" represent (what unit, what format)? – cs95 Dec 26 '18 at 11:38
  • @coldspeed my best guess is Excel format which, IIRC, is days from a specific date. But it's cold and boxing day so I cba searching for something that should be included :P – roganjosh Dec 26 '18 at 11:40
  • 1
    @roganjosh (considering I do not celebrate christmas so I ended up searching),... yes, these are excel format DDDDD.tttt dates counting from 1900-01-00... – cs95 Dec 26 '18 at 11:42
  • If you convert those dates manually, don't forget the [Excel date bug](https://en.m.wikipedia.org/wiki/Year_1900_problem#Microsoft_Excel). – PM 2Ring Dec 26 '18 at 11:46

1 Answers1

1

Given

# s = df['date']
s

0    42411.0
1    42754.0
Name: 0, dtype: float64

Convert from Excel to datetime using:

s_int = s.astype(int)
# Correcting Excel Leap Year bug.
days = pd.to_timedelta(np.where(s_int > 59, s_int - 1, s_int), unit='D')
secs = pd.to_timedelta(
    ((s - s_int) * 86400.0).round().astype(int), unit='s')

pd.to_datetime('1899/12/31') + days + secs

0   2016-02-11
1   2017-01-19
dtype: datetime64[ns]

Reference.

cs95
  • 379,657
  • 97
  • 704
  • 746