I'm reading data from a CSV file and parsing dates, but only one of the date columns is converted to datetime; the other one remains an object.
dtype = {'col1':'category','col2':'category','Start Date':'str','End Date':'str'}
dates = ['col3','col4']
df = pd.read_csv(filepath,dtype=dtype,parse_dates=dates,dayfirst=False)
Both date columns have the same format.
When I call df.info(), I get the following:
df.info() output
I tried the dayfirst argument and a date formatter, but neither helped.
I expected both columns in the list to be parsed as datetime, but for some reason only one of them is.
Update: I tried to build a minimal reproducible example with the code below, but it behaves as expected: both the Start Date and End Date columns come out as datetime.
import pandas as pd

df = pd.DataFrame({
    'col1': ['ABC', 'ABC', 'DCF', 'DCF'],
    'Start Date': ['12-31-2022', '12-31-2022', '12-31-2022', '12-31-2022'],
    'End Date': ['12-31-2023', '12-31-2023', '12-31-2023', '12-31-2023'],
})
df.to_csv('test.csv', index=False)
df2 = pd.read_csv(
    'test.csv',
    dtype={'col1': 'category', 'Start Date': 'str', 'End Date': 'str'},
    parse_dates=['Start Date', 'End Date'],
    dayfirst=False,
)
df2.info()
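Since the clean sample parses fine, one possible explanation is that the real file has at least one value in the second date column that is not a valid date. The pandas documentation notes that read_csv returns a column unaltered when it cannot be represented as datetime. Below is a minimal sketch of how that could be checked. The sample data, the 'TBD' value, and the '%m-%d-%Y' format are assumptions for illustration, not taken from my real file.

```python
import io

import pandas as pd

# Hypothetical sample: one 'End Date' value ('TBD') is not a date.
csv_text = (
    "col1,Start Date,End Date\n"
    "ABC,12-31-2022,12-31-2023\n"
    "DCF,12-31-2022,TBD\n"
)

df = pd.read_csv(io.StringIO(csv_text), parse_dates=["Start Date", "End Date"])

# Check which of the requested columns actually became datetime.
start_is_datetime = pd.api.types.is_datetime64_any_dtype(df["Start Date"])
end_is_datetime = pd.api.types.is_datetime64_any_dtype(df["End Date"])
print(start_is_datetime, end_is_datetime)

# Convert explicitly; errors="coerce" turns unparsable strings into NaT,
# so the rows that were non-null before but NaT after are the offenders.
parsed = pd.to_datetime(df["End Date"], format="%m-%d-%Y", errors="coerce")
bad_values = df.loc[parsed.isna() & df["End Date"].notna(), "End Date"]
print(bad_values.tolist())
```

Running the same check on the real column (with the real format string) should list any values that stop the whole column from being converted.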