Split a vector by its sequences

Question

The following vector x contains the two sequences 1:4 and 6:7, along with other values that are not part of any sequence.

x <- c(7, 1:4, 6:7, 9)

I'd like to split x by its sequences, so that the result is a list like the following.

# [[1]]
# [1] 7
#
# [[2]]
# [1] 1 2 3 4
#
# [[3]]
# [1] 6 7
#
# [[4]]
# [1] 9

Is there a quick and simple way to do this?

I've tried

split(x, c(0, diff(x)))

which gets close, but I don't feel like appending 0 to the differenced vector is the right way to go. Using findInterval didn't work either.
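For context, here is a small sketch of what that attempt actually returns. `split()` treats its second argument as a grouping factor, so elements are grouped by the *value* of their difference, not by position; the variable name `attempt` is only for illustration.

```r
x <- c(7, 1:4, 6:7, 9)

# The grouping vector is the raw differences, with 0 prepended:
# 0 -6 1 1 1 2 1 2
attempt <- split(x, c(0, diff(x)))

# Every element whose difference from its predecessor is 1 lands in the
# same group, even when the elements belong to different runs.
attempt
# $`-6`
# [1] 1
#
# $`0`
# [1] 7
#
# $`1`
# [1] 2 3 4 7
#
# $`2`
# [1] 6 9
```

The 7 from 6:7 ends up alongside 2, 3, 4 because all four follow their predecessor by exactly 1, which is why the grouping needs a per-run identifier and not the differences themselves.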

score 16 · Accepted Answer · answered Sep 11 '14 at 17:58

split(x, cumsum(c(TRUE, diff(x)!=1)))
#$`1`
#[1] 7
#
#$`2`
#[1] 1 2 3 4
#
#$`3`
#[1] 6 7
#
#$`4`
#[1] 9

Roland

Can you explain how the diff() function works and what it is doing in this solution? The official documentation on the diff() function did not help me understand it. – OnlyDean Jun 21 '18 at 14:36
The function simply calculates all differences between consecutive vector elements. E.g., compare `print(x <- (1:5)^2)` with `diff(x)`. Since OP defined sequences as values having a difference of exactly one, I check for differences different from one. Check out (with OP's data) `diff(x); diff(x)!=1; cumsum(c(TRUE, diff(x)!=1))`. – Roland Jun 21 '18 at 14:45
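The intermediate steps mentioned in the comment above can be spelled out as a short sketch using OP's data. The names `d`, `brk`, `grp` and `res` are only for illustration and do not appear in the original answer.

```r
x <- c(7, 1:4, 6:7, 9)

d <- diff(x)            # differences between neighbours: -6 1 1 1 2 1 2
brk <- c(TRUE, d != 1)  # TRUE where a new run starts; the first element always starts one
grp <- cumsum(brk)      # running count of starts = run id per element: 1 2 2 2 2 3 3 4

res <- split(x, grp)    # same result as the one-liner in the answer

# split() names the pieces after the group ids; unname() drops them,
# so the list prints with [[1]], [[2]], ... as in the question.
unname(res)
```

Prepending `TRUE` (not `0`) is what keeps the grouping vector the same length as `x`: `diff()` returns one element fewer than its input, and the first element has no predecessor, so it is simply declared the start of a run.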

score 1 · Answer 2 · edited May 23 '17 at 12:32

Just for fun, you can make use of Carl Witthoft's seqle function from his "cgwtools" package. (It's not going to be anywhere near as efficient as Roland's answer.)

library(cgwtools)

## Here's what seqle does...
## It's like rle, but for sequences
seqle(x)
# Run Length Encoding
#   lengths: int [1:4] 1 4 2 1
#   values : num [1:4] 7 1 6 9

y <- seqle(x)
split(x, rep(seq_along(y$lengths), y$lengths))
# $`1`
# [1] 7
# 
# $`2`
# [1] 1 2 3 4
# 
# $`3`
# [1] 6 7
# 
# $`4`
# [1] 9
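If installing cgwtools is not an option, the same run lengths and start values shown in the seqle output can be recovered in base R from the grouping vector used in the accepted answer. This is only a sketch for runs with a step of exactly 1; `run_lengths` and `run_starts` are illustrative names, not part of any package.

```r
x <- c(7, 1:4, 6:7, 9)

# Run id per element, as in the accepted answer: 1 2 2 2 2 3 3 4
grp <- cumsum(c(TRUE, diff(x) != 1))

run_lengths <- tabulate(grp)        # how many elements each run has: 1 4 2 1
run_starts  <- x[!duplicated(grp)]  # first value of each run: 7 1 6 9

# Rebuilding the grouping from the lengths mirrors the seqle-based split above
by_lengths <- split(x, rep(seq_along(run_lengths), run_lengths))
```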
