1

I'm having a hard time converting what is supposed to be a datetime column from an excel file. When opening it with pandas I get 44710.37680 instead of 5/29/2022 9:02:36. I tried this peace of code to convert it.

df = pd.read_excel(file,'Raw')
df.to_csv(finalfile, index = False)
df = pd.read_csv(finalfile)
df['First LogonTime'] = df['First LogonTime'].apply(lambda x: pd.Timestamp(x).strftime('%Y-%m-%d %H:%M:%S'))

print(df)

And the result I get is 1970-01-01 00:00:00 :c

Don't know if this helps but its an .xlsb file that I'm working with.

FObersteiner
  • 22,500
  • 8
  • 42
  • 72
Eestrada
  • 27
  • 5
  • Why are you converting to csv, you already made pandas load the excel file and so have access to its data without needing to convert anything? – Mike 'Pomax' Kamermans Jul 14 '22 at 22:05
  • I'm extracting data from that file and creating another – Eestrada Jul 14 '22 at 22:13
  • That's not what your code is doing. It reads in an excel sheet, converts it to CSV, throws away the loaded excel file and loads that CSV file instead, _then_ does data work. Change that order: read in the excel file, do your data work, then export as CSV, so you can rule out the CSV conversion as being a source of bugs. – Mike 'Pomax' Kamermans Jul 14 '22 at 23:36
  • I tried converting it without creating a csv file and I get the same result: 1970-01-01 00:00:00 – Eestrada Jul 15 '22 at 05:29

1 Answers1

1

You can use unit='d' (for days) and substract 70 years:

pd.to_datetime(44710.37680, unit='d') - pd.DateOffset(years=70)

Result:

Timestamp('2022-05-30 09:02:35.520000')

For dataframes use:

import pandas as pd
df = pd.DataFrame({'First LogonTime':[44710.37680, 44757.00000]})
df['First LogonTime'] = pd.to_datetime(df['First LogonTime'], unit='d') - pd.DateOffset(years=70)

Or:

import pandas as pd
df = pd.DataFrame({'First LogonTime':[44710.37680, 44757.00000]})
df['First LogonTime'] = df['First LogonTime'].apply(lambda x: pd.to_datetime(x, unit='d') - pd.DateOffset(years=70))

Result:

          First LogonTime
0 2022-05-30 09:02:35.520
1 2022-07-16 00:00:00.000
René
  • 4,594
  • 5
  • 23
  • 52
  • Thank you, that worked, is there a similar way to convert time column values? For example, when reading the excel file, it reads 07:01:54 as 0.292986111. I can't find a way to convert that either. – Eestrada Jul 20 '22 at 22:48
  • Yes, can be done the same way: `pd.to_datetime(0.292986111, unit='d').strftime('%H:%M:%S')`. Since you only need time, there's no need to offset years. – René Jul 21 '22 at 04:49