
Given a vector such as c(1,3,4,5,7,8,9,10), how can I split it into chunks so that each run of consecutive numbers forms one cluster? The desired result is c(1), c(3,4,5), c(7,8,9,10).

Rules: split the numbers into clusters; each cluster contains only consecutive numbers in increasing order.

The code below finds the clusters and how many numbers are in each one, but how can I split the vector to get the chunks themselves? Are there other methods? Any help would be appreciated.

library(tidyverse)

num <- c(1,3,4,5,7,8,9,10)

num_seq <- seq(min(num), max(num))

chunks <- num_seq %in% num %>% 
  as.character() %>% 
  paste(collapse = " ") %>% 
  str_split("FALSE") %>%
  unlist() %>% 
  as.list() %>% 
  map(.f = ~str_count(., "TRUE"))

[[1]]
[1] 1

[[2]]
[1] 3

[[3]]
[1] 4
Darren Tsai
Lee Jim

2 Answers


A base R solution: use diff() to find where a run of consecutive numbers breaks, then cumsum() to turn those breaks into group IDs that split() can use.

num <- c(1,3,4,5,7,8,9,10)

split(num, cumsum(c(TRUE, diff(num) != 1)))

# $`1`
# [1] 1
# 
# $`2`
# [1] 3 4 5
# 
# $`3`
# [1]  7  8  9 10
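
To make the one-liner easier to follow, here is a sketch that unpacks it into its intermediate steps. The variable names gaps, starts, grp and chunks are only illustrative and are not part of the original answer.

```r
num <- c(1, 3, 4, 5, 7, 8, 9, 10)

# Difference between each element and the one before it
gaps <- diff(num)               # 2 1 1 2 1 1 1

# TRUE marks the first element of each run; the very first element
# always starts a run, hence the leading TRUE
starts <- c(TRUE, gaps != 1)

# The running count of run starts gives one group ID per element
grp <- cumsum(starts)           # 1 2 2 2 3 3 3 3

# Split the vector by group ID
chunks <- split(num, grp)
```

This relies on num being sorted in increasing order without duplicates; if that is not guaranteed, something like sort(unique(num)) could be applied first.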
Darren Tsai

This data.table solution works, but it is generally not as fast as Darren's base split solution:

library(data.table)
num <- c(1,3,4,5,7,8,9,10)
data.table(num)[, .(chunks = .(num)), cumsum(num - shift(num, fill = -Inf) > 1)]$chunks
#> [[1]]
#> [1] 1
#> 
#> [[2]]
#> [1] 3 4 5
#> 
#> [[3]]
#> [1]  7  8  9 10
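
The grouping expression above can be read in steps, as in the following sketch. The names prev, grp and dt are illustrative only. Here shift() with fill = -Inf plays the role of diff(): the first difference is Inf, so the first element always opens a new group.

```r
library(data.table)

num <- c(1, 3, 4, 5, 7, 8, 9, 10)

# Lagged copy of num; -Inf stands in for the missing predecessor
prev <- shift(num, fill = -Inf)   # -Inf 1 3 4 5 7 8 9

# A gap larger than 1 starts a new group; cumsum() numbers the groups
grp <- cumsum(num - prev > 1)     # 1 2 2 2 3 3 3 3

# Collect each group's values into a list column and extract it
dt <- data.table(num, grp)
chunks <- dt[, .(chunks = .(num)), by = grp]$chunks
```

The group IDs are the same as those produced by cumsum(c(TRUE, diff(num) != 1)) for this input, so both answers split the vector at the same places.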
jblood94