One of the columns in my CSV contains dates in the following formats.

5/2/2010
12/2/2010
19-02-2010
26-02-2010
5/3/2010
12/3/2010
19-03-2010
26-03-2010
2/4/2010
9/4/2010

When I read the CSV file and print the data frame, I get the expected output, as shown below.

file_path = r'Store_sales.csv'
date_series_data = pd.read_csv(file_path)
date_series_data.head()

output

05-02-2010      
12-02-2010      
19-02-2010  
26-02-2010  
05-03-2010

When I print the data type, it shows the object dtype, so I am not able to set the column as the index. I therefore used pd.to_datetime(date_series_data) to convert object to datetime64[ns], but the date format of the first two elements changed, as shown below.

2010-05-02  
2010-12-02  
2010-02-19  
2010-02-26      
2010-03-05

Because of this, my various calculations go wrong. Is there an effective way to convert the column and get a consistent format?

thangaraj1980
  • Could you post the calculations that are going wrong with this? It sounds like your later calculations might be designed to handle strings with a particular format, rather than datetime objects. If that's the case, you probably want to either convert your dates back to strings (`df['col'] = df['col'].astype(str)`) or even better, change those calculations to handle datetime objects. – Jacob Sep 23 '20 at 19:00
  • try `pd.to_datetime(date_series_data, dayfirst=True)` - by default, it is assumed that the month comes first, which is not the case in your input. – FObersteiner Sep 24 '20 at 05:57
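
To illustrate the `dayfirst` point from the comment above, here is a minimal sketch on a single string copied from the question. The variable names are made up for the example. The question's own output shows that values such as 19-02-2010 were parsed as intended, which fits the idea that only the ambiguous rows are affected.

```python
import pandas as pd

# '5/2/2010' is ambiguous: by default pandas reads it month-first.
month_first = pd.to_datetime('5/2/2010')
# With dayfirst=True the same string is read as 5 February.
day_first = pd.to_datetime('5/2/2010', dayfirst=True)

print(month_first.date())  # 2010-05-02
print(day_first.date())    # 2010-02-05
```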

1 Answer

Try:

import pandas as pd

# Read the CSV file (file name taken from the question)
file_path = r'Store_sales.csv'
date_series_data = pd.read_csv(file_path)

# 'Date' is the column name which has date entries.
# The column mixes two day-first layouts, so parse each one separately;
# errors='coerce' turns rows that do not match the given format into NaT.
date_type_one = pd.to_datetime(date_series_data['Date'], errors='coerce', format='%d/%m/%Y')
date_type_two = pd.to_datetime(date_series_data['Date'], errors='coerce', format='%d-%m-%Y')
# Combine both passes; the column stays a datetime column, so it can be set as the index.
# Apply .dt.strftime(...) afterwards only if a particular string layout is needed.
date_series_data['Date'] = date_type_one.fillna(date_type_two)

date_series_data.head()
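
As an alternative to the two-pass approach above, the separators could be normalised first so that a single explicit format covers every row. This is only a sketch: the sample values are copied from the question, while the `Date` column name and the `value` column are assumptions made for the example.

```python
import pandas as pd

# Sample rows from the question; 'value' is a hypothetical placeholder column.
df = pd.DataFrame({
    'Date': ['5/2/2010', '12/2/2010', '19-02-2010', '26-02-2010', '5/3/2010'],
    'value': [1, 2, 3, 4, 5],
})

# Replace '/' with '-' so every row uses the same day-first layout,
# then parse with one explicit format instead of relying on inference.
normalised = df['Date'].str.replace('/', '-', regex=False)
df['Date'] = pd.to_datetime(normalised, format='%d-%m-%Y')

# The column is now a datetime column and can be used as the index.
df = df.set_index('Date')
print(df.index)
```

Passing an explicit `format` avoids the month-first guess entirely, so the result should not depend on how a given pandas version infers mixed formats.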

References: Working with mixed datetime formats in pandas

Abhay