0

I have a dataframe with columns of start and finish dates in dot format (31.07.2018) but I can't seem to correctly convert this into the dash format (2018-07-31).

I've tried parsing the dates as the CSV file is loaded in, pd.to_datetime, and df[col].dt.strftime() but none of these methods have given me the result I need.

How can I get the time format I need? And how can I subtract the dates to get the period of time in days?

EDIT:

An example date column:

0     31.07.2018
1     07.08.2018
2     10.08.2018
3     02.08.2018
4     14.07.2018
5     03.08.2018
6     31.07.2018
7     11.08.2018
8     08.08.2018
9     08.08.2018
10    10.08.2018
11    20.07.2018
12    09.08.2018
13    20.07.2018
14    07.08.2018
15    07.08.2018
16    11.08.2018
17    11.08.2018
18    12.08.2018

When I try pd.to_datetime(column), I get the following:

0    2018-07-31
1    2018-07-08
2    2018-10-08
3    2018-02-08
4    2018-07-14
5    2018-03-08
6    2018-07-31
7    2018-11-08
8    2018-08-08
9    2018-08-08
10   2018-10-08
11   2018-07-20
12   2018-09-08
13   2018-07-20
14   2018-07-08
15   2018-07-08
16   2018-11-08
17   2018-11-08
18   2018-12-08

As you can see, the months and days are getting mixed around because I know the months can only be 07 and 08. How can I fix this?

kynnemall
  • 855
  • 9
  • 26

1 Answers1

1

you can convert them with to_datetime

import pandas as pd

df = pd.DataFrame({'D' : ['31.07.2018','30.03.2017']})

df['New_date'] = pd.to_datetime(df.D)
df

edit : if the first 2 digits are the day, you can add the the dayfirst option:

df['New_date'] = pd.to_datetime(df.D,dayfirst=True)

`

steboc
  • 1,161
  • 1
  • 7
  • 17
  • I've tried this approach but some of the rows have the months and days swapped around and I don't know why the function is doing this – kynnemall Aug 14 '18 at 08:42
  • I had to manually re-arrange the numbers of each date to make your function work. Thanks for the help =) – kynnemall Aug 14 '18 at 14:37