0

I need to read dates from an xlsx file. At first, I tried to load the file in that way:

df = pd.read_excel('file.xls')

And the column "Date" is wrong (i will put an example):

output:
0     2017-03-05
1     2017-03-05
2     2017-03-05
3     2017-03-05
4     2017-03-05
5     2017-03-05
6     2017-03-05
7     2017-03-05
8     2017-03-05
9     2017-03-05
10    2017-03-05
11    2017-03-05
12    2017-03-05
13    2017-05-17
14    2017-05-18
15    2017-05-18
16    2017-05-22
17    2017-05-22

The problem is that from 0 to 12 the date isn't correct because instead of '2017-03-05' (5th of March) it could be '2017-05-03' (3th of May). From 13 to 17 it is in the correct format '2017-05-17' (YYYY-MM-DD) and so on.

So, I tryed to specify conversion during excel load in that way:

df = pd.read_excel('file.xls', converters={'Date':str})

And python printed this values of Date column:

output:
0      2017-03-05 00:00:00
1      2017-03-05 00:00:00
2      2017-03-05 00:00:00
3      2017-03-05 00:00:00
4      2017-03-05 00:00:00
5      2017-03-05 00:00:00
6      2017-03-05 00:00:00
7      2017-03-05 00:00:00
8      2017-03-05 00:00:00
9      2017-03-05 00:00:00
10     2017-03-05 00:00:00
11     2017-03-05 00:00:00
12     2017-03-05 00:00:00
13              05/17/2017
14              05/18/2017
15              05/18/2017
16              05/22/2017
17              05/22/2017

So Python read different format values from column Date.

How can I convert those different format in one unique format "DD/MM/YYYY"?

Thank you in advance.

josè
  • 35
  • 6

1 Answers1

0

Ok, i have done in this way:

df['DATE'] = df['DATE'].apply(lambda x: datetime.datetime.strptime(x, '%Y-%d-%m %H:%M:%S').strftime('%d/%m/%Y') if x.find(':')!=-1 else datetime.datetime.strptime(x, '%m/%d/%Y').strftime('%d/%m/%Y'))

Now it works!

josè
  • 35
  • 6