I have raw data in which the unique identifier for each unit is mixed into the same column as the timings. To summarise the data I need to copy each group's uniqueID down the column onto the rows that belong to it. My loop first trims off the blurb above the data, then runs an if/else that checks each row for text; when it finds text it uses strsplit to obtain the uniqueID, then pastes that ID into the following rows until it encounters the next text string, and repeats.
It works, but it is incredibly slow and I need to repeat it over a lot of raw data (and I don't have access to the originating software to change the shape of the output file).
Reading through the forums, I have found solutions for replacing values with a single fixed value, but I need a method that extracts the value from a row in the df.
Example df:
time dist v3 v4
1: 2 10.2 ... ....
2: 3 10.2 ... ....
3: Veh: 123
4: 1 10.2 ... ....
5: 2 10.2 ... ....
6: 3 10.2 ... ....
7: Veh: 456
8: 1 10.2 ... ....
9: 2 10.2 ... ....
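v <- "0001"  # quoted: a bare 0001 is the number 1 and would be pasted as "1"
for (m in seq_along(k2$time)) {
  if (grepl("Veh", k2$time[m])) {
    # header row: keep the part after the colon, minus the leading space
    v <- trimws(strsplit(k2$time[m], split = ":")[[1]][2])
  } else {
    k2$time[m] <- v  # data row: overwrite the timing with the current ID
  }
}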
Because it runs as a loop, I know it works down the column, pasting the ID until it encounters another text string. The desired result looks like this:
time dist v3 v4
1: 0001 10.2 ... ....
2: 0001 10.2 ... ....
3: Veh: 123
4: 123 10.2 ... ....
5: 123 10.2 ... ....
6: 123 10.2 ... ....
7: Veh: 456
8: 456 10.2 ... ....
9: 456 10.2 ... ....
I then have another line that runs through the whole data.frame and removes the rows containing text so that I can summarise.
Is anyone aware of a faster solution, perhaps using dplyr or data.table? I gave it 15 minutes before aborting a run over 922,000 rows of data, and I need it to run over several million.
I'm running out of search combinations on Stack Overflow.
Using data.table-1.9.7 and dplyr-0.5.0 on R-3.3.1
EDIT: Apologies, reproducible example:
time <- c(1,2,"Veh: 123", 1:3,"Veh: 456", 1:3)
dist <- c(1:2,"",4:6,"",8:10)
v3 <- c(1:2,"",4:6,"",8:10)
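k <- data.frame(time, dist, v3)
k$time <- as.character(k$time)
v <- "0001"  # quoted so the leading zeros survive
for (m in seq_along(k$time)) {
  if (grepl("Veh", k$time[m])) {
    v <- trimws(strsplit(k$time[m], split = ":")[[1]][2])  # ID after the colon
  } else {
    k$time[m] <- v  # overwrite the timing with the current ID
  }
}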
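For comparison, here is a rough vectorised sketch of the same fill-down in base R. It has not been timed on the real data. It assumes every identifier row contains the text "Veh", that the ID is whatever follows the colon, and that rows before the first identifier should get the placeholder "0001". The names is_header, ids, grp, filled and k_clean are made up for illustration.

```r
# Rebuild the example data; stringsAsFactors = FALSE keeps 'time' as character
k <- data.frame(
  time = c(1, 2, "Veh: 123", 1:3, "Veh: 456", 1:3),
  dist = c(1:2, "", 4:6, "", 8:10),
  v3   = c(1:2, "", 4:6, "", 8:10),
  stringsAsFactors = FALSE
)

is_header <- grepl("Veh", k$time, fixed = TRUE)    # TRUE on the "Veh: ..." rows
ids <- trimws(sub("^.*:", "", k$time[is_header]))  # text after the colon, trimmed
grp <- cumsum(is_header)                           # 0 before the first header, then 1, 2, ...
filled <- c("0001", ids)[grp + 1L]                 # ID that applies to each row

k$time[!is_header] <- filled[!is_header]           # header rows keep their text
k_clean <- k[!is_header, ]                         # drop header rows before summarising
```

The idea is that cumsum over the logical header flag numbers each block of rows, so the IDs can be looked up for the whole column at once instead of row by row.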