
I have a data frame that looks like this:

Time                 DT5.0_Prediction
20:10:36.051 IST            3
20:10:36.150 IST            3
20:10:36.251 IST            3
20:10:36.350 IST            3
20:10:36.450 IST            3
20:10:36.551 IST            1
20:10:36.651 IST            1
20:10:36.750 IST            1
20:10:36.851 IST            3
20:10:36.952 IST            1
20:10:37.051 IST            1
20:10:37.151 IST            1
20:10:37.252 IST            1
20:10:37.351 IST            3
20:10:37.452 IST            1
20:10:37.551 IST            1
20:10:37.652 IST            1
20:10:37.752 IST            3
20:10:37.853 IST            1
20:10:37.953 IST            1
20:10:38.053 IST            1
20:10:38.152 IST            1
20:10:38.252 IST            1
20:10:38.352 IST            1
20:10:38.453 IST            1
20:10:38.554 IST            1

I want to use a window size of 10 rows and get the data into this form:

  Starting Time        Ending time      Mode
20:10:36.051 IST    20:10:36.952 IST     3
20:10:37.051 IST    20:10:37.953 IST     1
20:10:38.053 IST    20:10:38.955 IST     1

and so on.

In the Mode column of the table above, "3" is the value that occurs most often in the first window, and "1" is the value that occurs most often in the next consecutive window.

I used the following code:

a <- 1

for (i in 1:length(mydata[, 2])) {

  b <- a + 99
  mydata$StartTime[i] <- mydata$Time[a]
  mydata$EndTime[i] <- mydata$Time[b]
  mydata$mode1234567[i] <- ifelse(b <= nrow(mydata),
                                  count(mydata[a:min(b, nrow(mydata)), 2]),
                                  NA)

  a <- b + 1
}

Using frequency and count gives the wrong result.

Thanks in advance.

Sotos
Kumar

1 Answer


One way is to split every 10 rows, and create a data frame based on each element, i.e.

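# Group rows into consecutive blocks of 10 and summarise each block.
# seq_len(nrow(df)) - 1 yields exactly one group index per row.
do.call(rbind, 
        lapply(split(df, (seq_len(nrow(df)) - 1) %/% 10), function(i)
                                      data.frame(Starting_Time = i[1,1], 
                                                 Ending_Time = i[nrow(i),1], 
                                                 mode = Mode(i[[2]]))))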

which gives,

     Starting_Time      Ending_Time mode
0 20:10:36.051_IST 20:10:36.952_IST    3
1 20:10:37.051_IST 20:10:37.953_IST    1
2 20:10:38.053_IST 20:10:38.554_IST    1
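
To see why the last window ends at 20:10:38.554 rather than covering a full 10 rows, it can help to look at the grouping index on its own. The sketch below is only an illustration: `n` and `grp` are made-up names, and 26 is the number of rows in the sample shown in the question. An index built as `0:n` would have `n + 1` elements, while `seq_len(n) - 1` has exactly one element per row.

```r
n <- 26                          # number of rows in the question's sample data

# Integer division of the zero-based row number by the window size:
# rows 1-10 get index 0, rows 11-20 get index 1, rows 21-26 get index 2.
grp <- (seq_len(n) - 1) %/% 10

# Number of rows that fall into each window; the last one is shorter.
window_sizes <- table(grp)
print(window_sizes)
```

The final window holds only the 6 remaining rows, so its ending time is simply the last timestamp available.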

Where Mode is simply a custom function to calculate the mode, taken from this answer.

Mode <- function(x) {
  ux <- unique(x)
  ux[which.max(tabulate(match(x, ux)))]
}
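
A quick check of how Mode behaves, as a minimal sketch: `w1` holds the ten predictions of the first window in the question, and `m1` and `m_tie` are illustrative names. Note that on a tie `which.max()` returns the first maximum, so the value that appears first in the input wins.

```r
Mode <- function(x) {
  ux <- unique(x)
  ux[which.max(tabulate(match(x, ux)))]
}

# First window of the question's data: 3 occurs six times, 1 occurs four times.
w1 <- c(3, 3, 3, 3, 3, 1, 1, 1, 3, 1)
m1 <- Mode(w1)

# Tie between 1 and 3 (two each): the value seen first in the input is returned.
m_tie <- Mode(c(1, 3, 3, 1))

print(m1)
print(m_tie)
```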
Sotos
  • So, if the mode is the same for consecutive rows, how can I combine those rows and get the corresponding start time and end time? – Kumar Jan 23 '18 at 11:29
  • I am not sure what you mean. – Sotos Jan 23 '18 at 13:15
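
One possible reading of the follow-up comment is that consecutive windows sharing the same mode should be merged into a single row. A minimal sketch under that assumption, using base R's `rle()`: the names `res`, `runs`, `first`, `last` and `merged` are illustrative, and `res` simply re-enters the three rows of output shown in the answer.

```r
# Per-window summary, copied from the output shown in the answer.
res <- data.frame(
  Starting_Time = c("20:10:36.051_IST", "20:10:37.051_IST", "20:10:38.053_IST"),
  Ending_Time   = c("20:10:36.952_IST", "20:10:37.953_IST", "20:10:38.554_IST"),
  mode          = c(3, 1, 1),
  stringsAsFactors = FALSE
)

# rle() describes runs of identical consecutive values in the mode column.
runs  <- rle(res$mode)
last  <- cumsum(runs$lengths)       # row index of the last window in each run
first <- last - runs$lengths + 1    # row index of the first window in each run

# One row per run: start of its first window, end of its last window.
merged <- data.frame(
  Starting_Time = res$Starting_Time[first],
  Ending_Time   = res$Ending_Time[last],
  mode          = runs$values,
  stringsAsFactors = FALSE
)
print(merged)
```

With this input the two windows whose mode is 1 collapse into one row running from 20:10:37.051 to 20:10:38.554, while the first window stays as it is.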