I have a time series of GPS data that needs to be segmented into smaller parts based on gaps in the timestamps.

As an example, I want to add a segment number that labels each 'chunk' of timestamps, effectively splitting the data each time there is a gap in the time series of at least 30 seconds.

The resulting data.frame would look something like this:

   timestamp segment
1          1       1
2          3       1
3          5       1
4         10       1
5         42       2
6         45       2
7         92       3
8        156       4
9        160       4
10       162       4
11       163       4
12       164       4
13       200       5
14       203       5

Is there an efficient way of doing this? The data.frame is a grouped tbl_df (dplyr package) with several distinct time series and can be quite large.

Jaap
Hejlesen
  • Why in the world would you provide an HTML table when your question is about R? Please provide a `data.frame` object using the `dput` function. – tblznbits Jan 29 '16 at 14:33
  • Made it a bit more readable, thank you for the suggestion. – Hejlesen Jan 29 '16 at 14:43

2 Answers

Your example data

t <- c(1, 3, 5, 10, 42, 45, 92, 156, 160, 162, 163, 164, 200, 203)

Segment numbers

s <- cumsum(c(TRUE, diff(t) >= 30))
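
To see why the one-liner works, it may help to split it into its three steps. This is only an illustrative sketch on the first seven timestamps of the example; the intermediate names `timestamps`, `gaps` and `starts_new` are made up here and are not part of the answer's code.

```r
# First seven timestamps from the example (illustrative subset)
timestamps <- c(1, 3, 5, 10, 42, 45, 92)

# Step 1: time difference between each row and the previous one
gaps <- diff(timestamps)             # 2 2 5 32 3 47

# Step 2: flag rows that open a new segment; the first row has no
# predecessor, so it is flagged explicitly with TRUE
starts_new <- c(TRUE, gaps >= 30)    # TRUE FALSE FALSE FALSE TRUE FALSE TRUE

# Step 3: a running count of the flags gives the segment number
segment <- cumsum(starts_new)        # 1 1 1 1 2 2 3
```

Because `TRUE` counts as 1 and `FALSE` as 0, the cumulative sum only increases on rows where the gap is at least 30 seconds.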

Output

data.frame(timestamp=t,segment=s)
   timestamp segment
1          1       1
2          3       1
3          5       1
4         10       1
5         42       2
6         45       2
7         92       3
8        156       4
9        160       4
10       162       4
11       163       4
12       164       4
13       200       5
14       203       5
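
For the grouped case mentioned in the question (several distinct time series in one tbl_df), the same expression can probably be used inside `mutate()` so that segment numbering restarts within each series. The following is a minimal sketch under assumptions: the grouping column `id` and the small data set are hypothetical, since the question does not show the real column names.

```r
library(dplyr)

# Hypothetical data: two series, identified by an assumed column `id`
df <- data.frame(
  id        = c("a", "a", "a", "b", "b", "b"),
  timestamp = c(1, 3, 42, 5, 50, 52)
)

segmented <- df %>%
  group_by(id) %>%
  # Within each series, start a new segment when the gap is >= 30 seconds
  mutate(segment = cumsum(c(TRUE, diff(timestamp) >= 30))) %>%
  ungroup()
```

This assumes the rows are already sorted by timestamp within each series; if they are not, an `arrange(id, timestamp)` step before `group_by()` would be needed.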
A. Webb

If the name of your data.frame is "df"

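df$segment <- 1  # create the column; every row starts in segment 1

# seq_len(...)[-1] is empty for a one-row data.frame, unlike 2:nrow(df)
for (i in seq_len(nrow(df))[-1]) {
    if (df$timestamp[i] < (df$timestamp[i-1] + 30)) {
        # gap under 30 seconds: stay in the current segment
        df$segment[i] <- df$segment[i-1]
    } else {
        # gap of at least 30 seconds: start a new segment
        df$segment[i] <- (df$segment[i-1] + 1)
    }
}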
mshum