33

I have a Value 38142 I need to convert it into date format using python. if use this number in excel and do a right click and format cell at that time the value will be converted to 04/06/2004 and I need the same result using python. How can I achieve this

jfs
  • 399,953
  • 195
  • 994
  • 1,670
sudeep Krishnan
  • 668
  • 1
  • 6
  • 23
  • That's a weird ordinal; are you sure 04/06/2004 is correct? If the value 38142 stands for *days* then that'd be an offset from either 1993/12/25 or 1993/10/27 depending on what you interpret as the month. – Martijn Pieters Apr 01 '15 at 09:29
  • [Formula to convert date to number](http://stackoverflow.com/q/19721416) suggests it should be a number of days since 1900/01/01, which is what `date.fromordinal()` does. But that number is missing a digit then. – Martijn Pieters Apr 01 '15 at 09:30
  • My file have the value I don't know its ordinal or not my client says its ordinal and told me that "if you want find the actual date just do format cell in excel for the given value at that time I am getting this value" @MartijnPieters – sudeep Krishnan Apr 01 '15 at 09:39
  • yeah, it is indeed an ordinal, but there's a bug in Excel which caused me to discount my initial theory. – Martijn Pieters Apr 01 '15 at 09:51
  • related, older question: [How to convert a python datetime.datetime to excel serial date number](https://stackoverflow.com/q/9574793/10197418) – FObersteiner Apr 16 '21 at 06:17

6 Answers6

48

The offset in Excel is the number of days since 1900/01/01, with 1 being the first of January 1900, so add the number of days as a timedelta to 1899/12/31:

from datetime import datetime, timedelta

def from_excel_ordinal(ordinal: float, _epoch0=datetime(1899, 12, 31)) -> datetime:
    if ordinal >= 60:
        ordinal -= 1  # Excel leap year bug, 1900 is not a leap year!
    return (_epoch0 + timedelta(days=ordinal)).replace(microsecond=0)

You have to adjust the ordinal by one day for any date after 1900/02/28; Excel has inherited a leap year bug from Lotus 1-2-3 and treats 1900 as a leap year. The code above returns datetime(1900, 2, 28, 0, 0) for both 59 and 60 to correct for this, with fractional values in the range [59.0 - 61.0) all being a time between 00:00:00.0 and 23:59:59.999999 on that day.

The above also supports serials with a fraction to represent time, but since Excel doesn't support microseconds those are dropped.

Martijn Pieters
  • 1,048,767
  • 296
  • 4,058
  • 3,343
  • 1
    @Krish: the bug is popularized by Joel Spolsky: [My First BillG Review](http://www.joelonsoftware.com/items/2006/06/16.html) – jfs Apr 02 '15 at 20:07
  • Are you sure the epoch is not December 31, 1899? `datetime(1899, 12, 31) + timedelta(ordinal - (ordinal > 59))` – jfs Apr 02 '15 at 20:20
  • @J.F.Sebastian I stuck to the documentation for Excel here; it makes little difference here to subtract one relative to 1900-01-01. – Martijn Pieters Apr 02 '15 at 21:00
  • makes no sense to have _epoch as a parameter if we hard code the ordinal check for being > 59. – FinanceGuyThatCantCode Oct 29 '18 at 17:36
  • 2
    @FinanceGuyThatCantCode: The `_epoch` parameter is there to cache the value as a *local variable*, nothing more. This helps avoid having to create it for each call, or to have to look up a global (slightly slower). – Martijn Pieters Oct 29 '18 at 17:40
  • @MartijnPieters - I see - hence the leading _ in the _epoch parameter. I have not done that before. I guess it is a known PEP thingy as a function parameter then to be thought of sort of like a static local variable in C++ or something. – FinanceGuyThatCantCode Oct 29 '18 at 17:58
7
from datetime import datetime, timedelta

def from_excel_ordinal(ordinal, epoch=datetime(1900, 1, 1)):
    # Adapted from above, thanks to @Martijn Pieters 

    if ordinal > 59:
        ordinal -= 1  # Excel leap year bug, 1900 is not a leap year!
    inDays = int(ordinal)
    frac = ordinal - inDays
    inSecs = int(round(frac * 86400.0))

    return epoch + timedelta(days=inDays - 1, seconds=inSecs) # epoch is day 1

excelDT = 42548.75001           # Float representation of 27/06/2016  6:00:01 PM in Excel format  
pyDT = from_excel_ordinal(excelDT)

The above answer is fine for just a date value, but here I extend the above solution to include time and return a datetime values as well.

David
  • 83
  • 1
  • 5
2

I would recomment the following:

import pandas as pd

def convert_excel_time(excel_time):

    return pd.to_datetime('1900-01-01') + pd.to_timedelta(excel_time,'D')

Or

import datetime

def xldate_to_datetime(xldate):
    temp = datetime.datetime(1900, 1, 1)
    delta = datetime.timedelta(days=xldate)
    return temp+delta

Is taken from https://gist.github.com/oag335/9959241

Michael D
  • 1,711
  • 4
  • 23
  • 38
  • xldate_to_datetime(44000) gives 2020-06-20 where as the answer is 2020-06-18 – Poorna Prudhvi Oct 05 '20 at 15:58
  • 1
    @PoornaPrudhvi is correct; the base date should be 1899-12-30. One day offset because we should be adding to Dec 31 and another day offset b/c of the leap year bug mention in the accepted answer. – JJL Oct 14 '21 at 19:11
1

I came to this question when trying to do the same above, but for entire columns within a df. I made this function, which did it for me:

import pandas as pd    
from datetime import datetime, timedelta
import copy as cp

def xlDateConv(df, *cols):      
    tempDt = []
    fin = cp.deepcopy(df)
    for col in [*cols]:
        for i in range(len(fin[col])):
            tempDate = datetime(1900, 1, 1)
            delta = timedelta(float(fin[col][i]))
            tempDt.append(pd.to_datetime(tempDate+delta))

        fin[col] = tempDt
        tempDt = []
    return fin

Note that you need to type each column, quoted (as string), as one parameter, which can most likely be improved (list of columns as input, for instance). Also, it returns a copy of the original df (doesn't change the original).

Btw, partly inspired by this (https://gist.github.com/oag335/9959241).

Alan
  • 75
  • 1
  • 7
1

If you are working with Pandas this could be useful

    import xlrd
    import datetime as dt
    
    def from_excel_datetime(x):
        return dt.datetime(*xlrd.xldate_as_tuple(x, datemode=0))
    
    df['date'] = df.excel_date.map(from_excel_datetime)

If the date seems to be 4 years delayed, maybe you can try with datemode 1.

:param datemode: 0: 1900-based, 1: 1904-based.

Mariana
  • 126
  • 1
  • 5
0

I had the same problem and then I used this function: (source: https://gist.github.com/OmarArain/9959241)

import datetime
def xldate_to_datetime(xldate):
    xldate = int(xldate)
    temp = datetime.datetime(1900, 1, 1)
    delta = datetime.timedelta(days=xldate)
    return temp+delta

And then I applied it to my dataframe:

df['column_date'] = df['column_date'].apply(lambda x: xldate_to_datetime(x))
chaima
  • 1
  • 1