0

i've combined many dateframes but the date is not match as it's look like (datetime & int) as below , it's contains float number and datetime date.

Date
2022-05-31 00:00:00
2022-05-31 00:00:00
2022-05-31 00:00:00
2022-05-31 00:00:00
44713.0
44713.0
44713.0
44713.0
44713.0
44713.0
44713.0

i'm tried to use the below codes but i found error messages (ValueError: mixed datetimes and integers in passed array) or i found this error elso('<' not supported between instances of 'Timestamp' and 'int')

So how can i change the date formatting to be all like dd-mm-yyyy

The full code:-

import pandas as pd
import datetime as dt
import xlrd
from datetime import datetime


    IN_df=pd.concat([IN_df1, IN_df2,IN_df3,IN_df4,IN_df5,IN_df6,IN_df7,IN_df8]).fillna("")
    IN_df=IN_df[(IN_df['Status']=='No Show')&(IN_df['Site']== 'Cairo')]
    IN_df=IN_df.filter(items=['Date','SF ID','Name','Direct Manager','Department','Shift','Status','Reporting Feedback']).reset_index(drop=True).convert_dtypes()
    
    IN_df['Date']=IN_df['Date'].apply(xlrd.xldate_as_datetime, args=(0,))
    IN_df['Date']=pd.to_datetime(IN_df['Date']).dt.date
    
    IN_df.tail()
Mahmoud Badr
  • 61
  • 2
  • 8
  • You need to look at the data in that Series/column and figure out why some of it doesn't conform to the datetime format you are expecting. Then you might need to preprocess the data before you feed it to pd.DataFrame. Unfortunately most of your example code is useless for diagnosing the problem, please read [mre]. [How to make good reproducible pandas examples](https://stackoverflow.com/questions/20109391/how-to-make-good-reproducible-pandas-examples). – wwii Aug 06 '22 at 13:31

1 Answers1

3

Check Below code:

import pandas as pd 


df = pd.DataFrame({'date':['2022-05-31 00:00:00','2022-05-31 00:00:00','44713','2022-05-31 00:00:00','44713']})

df['new_date'] = df.apply(lambda x:  pd.to_datetime(int(x.date), unit='D', origin='1899-12-30') if x.date.isdigit() else pd.to_datetime(x.date) , axis=1)

df

Output:

enter image description here

Updating code as per OP comment below:

df = pd.DataFrame({'date':['2022-05-31 00:00:00','2022-05-31 00:00:00','44743.0','2022-05-31 00:00:00','44713']})

df['new_date'] = df.apply(lambda x:  pd.to_datetime(float(x.date), unit='D', origin='1899-12-30') if x.date[:-2].isdigit() else pd.to_datetime(x.date) , axis=1)

df

Output:

enter image description here

Abhishek
  • 1,585
  • 2
  • 12
  • 15