I have one column called Date which is of object type having both date and time in the form '2019/10/07,12:44:58'. I have tried slicing out the date part from this Date column then convert this to the proper date format. I want to apply this function on the date column to create a new column called date1 without using for loop.
Asked
Active
Viewed 46 times
-1
-
Use `data['date1'] = pd.to_datetime(data['Date'], format= '%Y/%m/%d').dt.date` – Erfan Oct 25 '19 at 12:12
-
Its showing error as "unknown string format" – anadi Oct 25 '19 at 12:16
-
edited my comment, you have to specify the `format` – Erfan Oct 25 '19 at 12:17
-
again error is coming as "ValueError: time data 2019/10/07,12:45:03 doesn't match format specified" – anadi Oct 25 '19 at 12:19
-
Please just look at the error, it shows what the problem is.. `format='%Y/%m/%s,%H:%M:%S'` – Erfan Oct 25 '19 at 12:21
-
In my dataset there is a comma in between the date and time, I guess thats why error is coming. Can you solve this error? – anadi Oct 25 '19 at 12:24
-
Yes man, I have seen that, As i have said my dataset has a comma in between thats why this code is not working . Its a different format Can you see the comma ? '%Y/%m/%s,%H:%M:%S' so can you provide solution for this one? – anadi Oct 25 '19 at 12:35
-
Your data has different formats, the comma is not the problem. Look at the first two rows, the data is different from the rest. – Erfan Oct 25 '19 at 13:05
1 Answers
0
As we can see, the first two rows have a different date format. So we convert the rest to datetime first with errors='coerce'
. Then we convert the first two rows and use fillna
, to get both dates together:
date1 = pd.to_datetime(data['Date'], format='%Y/%m/%d,%H:%M:%S', errors='coerce')
date2 = pd.to_datetime(data.loc[date1.isna(), 'Date'], format='%d-%m-%Y,%H:%M:%S')
data['Date'] = date1.fillna(date2)
Date Open High Low Close Qty Value(Lk) \
0 2019-10-07 12:45:17 1208.65 1208.85 1208.40 1208.85 1125 13.60
1 2019-10-07 12:45:00 1208.70 1209.10 1208.40 1209.10 9344 112.95
2 2019-10-07 12:43:58 1208.80 1209.40 1208.35 1208.65 7342 88.75
3 2019-10-07 12:42:58 1208.70 1209.20 1208.40 1209.00 9355 113.08
4 2019-10-07 12:41:57 1208.75 1209.00 1207.80 1208.35 5890 71.17
Trades BS
0 4
1 15
2 13
3 15
4 13
Original data:
Date Open High Low Close Qty Value(Lk) \
0 07-10-2019,12:45:17 1208.65 1208.85 1208.40 1208.85 1125 13.60
1 07-10-2019,12:45:00 1208.70 1209.10 1208.40 1209.10 9344 112.95
2 2019/10/07,12:43:58 1208.80 1209.40 1208.35 1208.65 7342 88.75
3 2019/10/07,12:42:58 1208.70 1209.20 1208.40 1209.00 9355 113.08
4 2019/10/07,12:41:57 1208.75 1209.00 1207.80 1208.35 5890 71.17
Trades BS
0 4
1 15
2 13
3 15
4 13

Erfan
- 40,971
- 8
- 66
- 78
-
thanks for the solution, can you explain me the significance of pd.to_datetime(data.loc[date1.isna(), 'Date']) – anadi Oct 25 '19 at 14:19