0

I have an Excel .xlsb sheet with data, some columns have number as output data, other columns should have dates as output. After uploading the data in Python, some columns have a number in stead of date. How can I convert the number in that specific column to a date?

I tried to find an answer, but didn't succeed

Columns Datafile Date of Birth, Age, Gender Rows 1 integer must be date integer String 2 integer must be date integer String etc. 3 4 5

1 Answers1

0

In Excel the date is represented as the integer part of the number, and counts the number of days since 1899-12-31. For example 1 is 1900-01-01.

However you need to be aware there is a "bug" in Excel, it wrongly considers 1900 to be a leap year, i.e. it counts the date 1900-02-29 as a valid date. This bug was intentionally added to Excel so it would reproduce the wrong behaviour of Lotus, that was the most popular Windows spreadsheet software at the time Excel was created. So assuming you do not care about dates before 1900-03-01, then you could instead assume the number is a count of the number of days since 1899-12-30.

import datetime

def xldate2date(xl):
  # valid for dates from 1900-03-01
  basedate = datetime.date(1899,12,30)
  d = basedate + datetime.timedelta(days=xl)
  return d

# Example:
# >>> print(xldate2date(44948))
# 2023-01-22
joe90
  • 51
  • 4
  • I understand the problem in Excel. Thank you for the correct code. I've run the code, but nothing changed. This is my first experience with Python, so I struggle a bit. I would expect that I have to select the specific column that should have dates as outcome (instead of excel integers) – Marry Beuker Jan 23 '23 at 19:14
  • @MarryBeuker it might be helpful to give an example of the data you extracted from Excel. For example the Excel serial date 44948 is the calendar date 2023-01-22. – joe90 Jan 23 '23 at 19:27
  • Hi Joe, the Printfunction works. What I also needed, is to format a whole kolom from "integer" to dates. – Marry Beuker Jan 24 '23 at 17:25
  • the format of the table is: Date of Birth, Age, Gender (these are the cols). There are more than 500.000 rows. So I want to select the full column and change the format from integer (excel date) to real dates. – Marry Beuker Jan 24 '23 at 17:28