2

I have a pandas DataFrame with date values. I need to convert them to the numbers Excel shows in its General format (not to date strings) so that they match the primary key values in SQL, which are unfortunately recorded in General format. Is it possible to do this in Python, or is converting the column to General format in Excel the only way?

Here is what the dataframe's column looks like:

ID          Desired Output
1/1/2022    44562
7/21/2024   45494
1/1/1931    11324
FObersteiner
Camilla
  • What does the output represent? – Psidom Dec 30 '21 at 20:53
  • Does [this](https://stackoverflow.com/questions/9574793/how-to-convert-a-python-datetime-datetime-to-excel-serial-date-number#comment49502311_9574948) answer your question? – BrokenBenchmark Dec 30 '21 at 20:55
  • They are these dates in General format. If you change the date column's format in Excel from Short Date to General, it shows this output. – Camilla Dec 30 '21 at 20:55

2 Answers

4

Yes, it's possible. Excel's General format shows a date as its serial number, a count of days in which 1900-01-01 is day 1.

You can calculate a time delta between the dates in ID and 1900-1-1.

Inspired by this post you could do...

import pandas as pd

data = pd.DataFrame({'ID': ['1/1/2022','7/21/2024','1/1/1931']})
# Days elapsed since 1900-01-01, plus Excel's two offsets (explained below)
data['General format'] = (
    pd.to_datetime(data["ID"]) - pd.Timestamp("1900-01-01")
    ).dt.days + 2
print(data)
          ID  General format
0   1/1/2022           44562
1  7/21/2024           45494
2   1/1/1931           11324

The +2 is because:

  1. Excel starts counting from 1 instead of 0, so 1900-01-01 is serial number 1
  2. Excel incorrectly considers 1900 a leap year, so every date from 1900-03-01 onward is shifted by one more day
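The two offsets can also be folded into the reference date: subtracting 1899-12-30 instead of 1900-01-01 gives the same numbers without the explicit + 2. The sketch below does that and then casts the result to strings, on the assumption that the SQL keys are stored as text. The helper name `to_excel_serial`, the `key` column, and the explicit month/day/year format are illustrative choices, and the shortcut only holds for dates on or after 1900-03-01.

```python
import pandas as pd

# 1900-01-01 minus the two-day offset; only valid for dates >= 1900-03-01
EXCEL_EPOCH = pd.Timestamp("1899-12-30")

def to_excel_serial(dates: pd.Series) -> pd.Series:
    """Hypothetical helper: month/day/year strings -> Excel serial numbers."""
    parsed = pd.to_datetime(dates, format="%m/%d/%Y")
    return (parsed - EXCEL_EPOCH).dt.days

data = pd.DataFrame({"ID": ["1/1/2022", "7/21/2024", "1/1/1931"]})
# Assumption: the SQL primary keys are text, so cast the integers to strings
data["key"] = to_excel_serial(data["ID"]).astype(str)
print(data)
```

If the SQL keys turn out to be integers rather than text, dropping the `.astype(str)` call should be enough.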
wjandrea
RSale
  • See [my answer here](https://stackoverflow.com/a/65460255/10197418) and follow the links if you want some more background info. – FObersteiner Dec 31 '21 at 12:33
0

Excel stores dates as sequential serial numbers so that they can be used in calculations. By default, January 1, 1900 is serial number 1, and January 1, 2008 is serial number 39448 because it is 39,447 days after January 1, 1900.
-Microsoft's documentation

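So you can calculate (difference between your date and January 1, 1900) + 1. For dates on or after March 1, 1900, add one more day, because Excel also counts a February 29, 1900 that never existed.

See How to calculate number of days between two given dates.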
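A sketch of that calculation using only the standard library; `excel_serial` is a hypothetical name. It adds one extra day from 1900-03-01 onward, since Excel's calendar includes a February 29, 1900 that never existed, which is how January 1, 2008 comes out as 39448.

```python
from datetime import date

EXCEL_DAY_ONE = date(1900, 1, 1)          # serial number 1 in Excel
AFTER_PHANTOM_LEAP_DAY = date(1900, 3, 1)

def excel_serial(d: date) -> int:
    """Hypothetical helper: Excel 1900-system serial for a date >= 1900-01-01."""
    if d < EXCEL_DAY_ONE:
        raise ValueError("the 1900 date system has no serials before 1900-01-01")
    serial = (d - EXCEL_DAY_ONE).days + 1  # 1900-01-01 is day 1, not day 0
    if d >= AFTER_PHANTOM_LEAP_DAY:
        serial += 1                        # account for Excel's nonexistent 1900-02-29
    return serial

print(excel_serial(date(2008, 1, 1)))  # 39448
```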

tzman