I'm loading the excel(xlsx) file exported from dashboard in OpenPyxl and some of date values are defaulting into 44774.375 and it should be converted to 01-08-2022 09:00:00 to load into SF table.
Also In dashboard Users are giving 4/21/2023 10:00:00 PM this format but I should convert into 21-04-2023 10:00:00
not sure how to apply this both formatting condition in columns FE RETA, RETA.
I'm currently using the below code,
import pandas as pd
from openpyxl import load_workbook
wb = load_workbook(filename= "file.xlsx", data_only = True)
sheet_names = wb.get_sheet_names()
name = sheet_names[0]
sheet_ranges = wb[name]
df = pd.DataFrame(sheet_ranges.values, index = none)
print(df)
#To replace (4/21/2023 10:00:00 PM) format to (21-04-2023 10:00:00)
df['FE RETA'] = pd.to_datetime(df['FE RETA'], format = '%m/%d/%Y %I:%M:%S %p', errors='coerce')
df['FE RETA'] =df['FE RETA'].dt.strftime('%d-%m-%Y %H:%m:%S')
#To replace all remaining number format to (21-04-2023 10:00:00)
#but its only working if the entire column is in number format
df['FE RETA'] = pd.to_datetime(df['FE RETA'],unit='d') + dt.datetime(1899,12,30)
print(df)
but however my date columns having also proper datetime format values too. So it throwing me the error.
TypeError: Invalid type for timedelta scalar: <class 'datetime.datetime'>
Source file:
Tht| FE RETA | RETA | TPM SLACycletime
---------------------------------------------------------------------------
US | 4/21/2023 10:00:00 PM |4/21/2023 10:30:00 AM | Invalid Data
US | 4/22/2023 11:45:00 PM |44781.54167 | 558:19:30
US | 4/21/2023 10:30:00 AM |10-09-2022 18:03:00 | 111:44:26
US | 01-08-2022 10:00:00 |44778.41667 | 15:44:26
US | 44774.375 |44775.52083 | Invalid Data
Desired Output:
Tht| FE RETA | RETA | TPM SLACycletime
---------------------------------------------------------------------------
US | 21-04-2023 10:00:00 |21-04-2023 10:30:00 | Invalid Data
US | 22-04-2023 11:45:00 |08-08-2022 13:00:00 | 558:19:30
US | 21-04-2023 10:30:00 |10-09-2022 18:03:00 | 111:44:26
US | 01-08-2022 10:00:00 |05-08-2022 10:00:00 | 15:44:26
US | 01-08-2022 09:00:00 |02-08-2022 12:30:00 | Invalid Data
User is not agreeing to correct it while uploading, he's expecting to clean the date format in automation itself. Please help on this.
Thanks