1

I am trying to read Excel that has date values in one column. They are however in 2 different formats:

03.07.2017
03.07.2017
30/06/2017
30/06/2017
30/06/2017
03.07.2017
03.07.2017

07 and 06 are month numbers.

I import excel with:

denik=pd.read_excel('obchodnidenik.xlsx',converters={'Vstup - datum':str})

However the dates are converted differently.

I get two date formats that have switched dates/months:

'30/06/2017'
'2017-03-07 00:00:00'

What is the best way to convert all values into correct datetime?

Fabio Lamanna
  • 20,504
  • 24
  • 90
  • 122
PeterL
  • 465
  • 1
  • 6
  • 14
  • 2
    Don't parse the dates while reading, replace `.` with `/` in strings, and then `pd.to_datetime`? – Zero Jul 11 '17 at 15:12
  • But the parsing is automatic - I even tried to force "str" format, but it does not help. – PeterL Jul 11 '17 at 15:16
  • Forcing converter type does work, are you doing it on right columns with right syntax https://stackoverflow.com/a/41117204? – Zero Jul 11 '17 at 15:18
  • U use syntax: denik=pd.read_excel('obchodnidenik.xlsx',converters={'Vstup - datum':str}) And the column 'Vstup - datum' is correct. – PeterL Jul 11 '17 at 15:20
  • have you tried openpyxl? gives more flexibity of editing and importing... – Dr.Raghnar Jul 11 '17 at 15:30

1 Answers1

1

You can use Series.replace after reading the excel and then convert it into proper datetime and set dayfirst = True for proper months example:

n = ['03.07.2017','03.07.2017','30/06/2017','30/06/2017','30/06/2017','03.07.2017','03.07.2017']
df = pd.DataFrame(n)
df[0]=df[0].replace('[/\/.]','-',regex=True)
df[0] = pd.to_datetime(df[0],dayfirst=True)

Output:

0   2017-07-03
1   2017-07-03
2   2017-06-30
3   2017-06-30
4   2017-06-30
5   2017-07-03
6   2017-07-03
Bharath M Shetty
  • 30,075
  • 6
  • 57
  • 108