-1

I have one column called Date which is of object type having both date and time in the form '2019/10/07,12:44:58'. I have tried slicing out the date part from this Date column then convert this to the proper date format. I want to apply this function on the date column to create a new column called date1 without using for loop.

anadi
  • 63
  • 1
  • 6
  • Use `data['date1'] = pd.to_datetime(data['Date'], format= '%Y/%m/%d').dt.date` – Erfan Oct 25 '19 at 12:12
  • Its showing error as "unknown string format" – anadi Oct 25 '19 at 12:16
  • edited my comment, you have to specify the `format` – Erfan Oct 25 '19 at 12:17
  • again error is coming as "ValueError: time data 2019/10/07,12:45:03 doesn't match format specified" – anadi Oct 25 '19 at 12:19
  • Please just look at the error, it shows what the problem is.. `format='%Y/%m/%s,%H:%M:%S'` – Erfan Oct 25 '19 at 12:21
  • In my dataset there is a comma in between the date and time, I guess thats why error is coming. Can you solve this error? – anadi Oct 25 '19 at 12:24
  • Yes man, I have seen that, As i have said my dataset has a comma in between thats why this code is not working . Its a different format Can you see the comma ? '%Y/%m/%s,%H:%M:%S' so can you provide solution for this one? – anadi Oct 25 '19 at 12:35
  • Your data has different formats, the comma is not the problem. Look at the first two rows, the data is different from the rest. – Erfan Oct 25 '19 at 13:05

1 Answers1

0

As we can see, the first two rows have a different date format. So we convert the rest to datetime first with errors='coerce'. Then we convert the first two rows and use fillna, to get both dates together:

date1 = pd.to_datetime(data['Date'], format='%Y/%m/%d,%H:%M:%S', errors='coerce')
date2 = pd.to_datetime(data.loc[date1.isna(), 'Date'], format='%d-%m-%Y,%H:%M:%S')

data['Date'] = date1.fillna(date2)

                 Date     Open     High      Low    Close   Qty  Value(Lk)  \
0 2019-10-07 12:45:17  1208.65  1208.85  1208.40  1208.85  1125      13.60   
1 2019-10-07 12:45:00  1208.70  1209.10  1208.40  1209.10  9344     112.95   
2 2019-10-07 12:43:58  1208.80  1209.40  1208.35  1208.65  7342      88.75   
3 2019-10-07 12:42:58  1208.70  1209.20  1208.40  1209.00  9355     113.08   
4 2019-10-07 12:41:57  1208.75  1209.00  1207.80  1208.35  5890      71.17   

   Trades BS  
0       4     
1      15     
2      13     
3      15     
4      13     

Original data:

                  Date     Open     High      Low    Close   Qty  Value(Lk)  \
0  07-10-2019,12:45:17  1208.65  1208.85  1208.40  1208.85  1125      13.60   
1  07-10-2019,12:45:00  1208.70  1209.10  1208.40  1209.10  9344     112.95   
2  2019/10/07,12:43:58  1208.80  1209.40  1208.35  1208.65  7342      88.75   
3  2019/10/07,12:42:58  1208.70  1209.20  1208.40  1209.00  9355     113.08   
4  2019/10/07,12:41:57  1208.75  1209.00  1207.80  1208.35  5890      71.17   

   Trades BS  
0       4     
1      15     
2      13     
3      15     
4      13     
Erfan
  • 40,971
  • 8
  • 66
  • 78
  • thanks for the solution, can you explain me the significance of pd.to_datetime(data.loc[date1.isna(), 'Date']) – anadi Oct 25 '19 at 14:19