I have a very large data set (CSV) with information about bicycle counts from a bike share system. The information I'm working with is the time at which bicycles were taken out of the racks (departure time) and also the total travel time. What I want to do is to add them so I can get the arrival time at the arrival station. The departure time variable is FECHA_HORA_RETIRO and the travel time variable is TIEMPO_USO. The former, which is read by R as factor object, is in the following format: "23/01/2017 19:55:16". On the other hand, TIEMPO_USO is read by R as a character and it's in the following format: "0:17:46".
> head(viajes_ecobici_2017$FECHA_HORA_RETIRO)
[1] 28/01/2017 13:51 17/01/2017 16:24 12/01/2017 16:38 25/01/2017 10:31
> head(viajes_ecobici_2017$TIEMPO_USO)
[1] "1:35:37" "0:11:17" "0:32:51" "0:31:29" "1:31:59" "0:21:43" "0:5:43"
I first used strptime to get everything in the desired format
> viajes_ecobici_2017$FECHA_HORA_RETIRO =format(strptime(viajes_ecobici_2017$FECHA_HORA_RETIRO,format = "%d/%m/%Y %H:%M"),format = "%d/%m/%Y %H:%M:%S")
> viajes_ecobici_2017$TIEMPO_USO = format(strptime(viajes_ecobici_2017$TIEMPO_USO, format="%H:%M:%S"), format="%H:%M:%S")
This works with most observations. However, several observations became NA values after running this code. I went back to the original data to see why this was happening and created a variable with just the observations that became NA. When I looked closer at this observations I saw they have this format "\t\t01/06/2017 00:01". How can I get rid of the "\t\t" while preserving the rest of the information?
Thanks in advance for your help.