6

I have a vector like the following:

example <- c(1, 2, 3, 8, 10, 11)

I am trying to write a function that returns output like the following:

desired_output <- list(first_sequence = c(1, 2, 3), 
                       second_sequence = 8, 
                       third_sequence = c(10, 11)
                       )

What I actually want is to count how many such sequences there are in my vector, and the length of each one. A list like the one in "desired_output" would be sufficient.

The end goal is to construct another vector, let's call it "b", that contains the following:

b <- c(3, 3, 3, 1, 2, 2)

The real-world problem behind this is measuring the height of 3D objects contained in a 3D point cloud.

I've tried to write both a function that returns the list in "desired_output" and a recursive function that directly outputs vector "b", but succeeded at neither.

Does anyone have any ideas? Thank you very much.

  • https://stackoverflow.com/questions/71594430/how-to-find-where-the-interval-of-continuous-numbers-starts-and-ends – rawr Apr 26 '22 at 18:19
  • Canonical for the `diff` - `cumsum` idiom: [Create grouping variable for consecutive sequences and split vector](https://stackoverflow.com/questions/5222061/create-grouping-variable-for-consecutive-sequences-and-split-vector) – Henrik Apr 26 '22 at 18:21
  • Does this answer your question? [Create grouping variable for consecutive sequences and split vector](https://stackoverflow.com/questions/5222061/create-grouping-variable-for-consecutive-sequences-and-split-vector) – DaveArmstrong Apr 26 '22 at 23:13

4 Answers

7

We can split the vector into a list by building a grouping variable from the differences between adjacent elements:

out <- split(example, cumsum(c(TRUE, abs(diff(example)) != 1)))

Then we get the length of each group and repeat each length that many times:

unname(rep(lengths(out), lengths(out)))
# [1] 3 3 3 1 2 2
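
To see why this grouping works, it can help to look at the intermediate values one step at a time. The following is a minimal sketch using only base R; the names `jumps`, `grp`, and `b` are illustrative and not part of the answer above.

```r
example <- c(1, 2, 3, 8, 10, 11)

# TRUE wherever a new run starts: the first element, plus every element
# whose distance to its predecessor is not exactly 1.
# Because of abs(), a step of -1 also counts as continuing the run.
jumps <- c(TRUE, abs(diff(example)) != 1)

# A running count of run starts gives one integer id per run
grp <- cumsum(jumps)

# One list element per run
out <- split(example, grp)

# Repeat each run length once for every element of that run
b <- unname(rep(lengths(out), lengths(out)))
```

Here `grp` holds one id per element of `example`, so elements with the same id end up in the same list element of `out`.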
akrun
5

You could do:

out <- split(example, example - seq_along(example))

To get the lengths:

ln <- unname(lengths(out))
rep(ln, ln)
# [1] 3 3 3 1 2 2
jay.sf
Onyambu
  • Nope, this fails, try `ex <- c(1, 2, 11, 8, 3, 10);split(ex, ex - seq_along(ex))`. – jay.sf Apr 26 '22 at 19:02
  • @jay.sf your example is not sequential. You need an ordered sequence for this to work. The example given deals with an ordered sequence of positive integers. If it is not ordered, this will fail. – Onyambu Apr 26 '22 at 19:55
  • I'm not sure if heights in a 3D point cloud are always monotonically increasing, though. If so, your code is of course great. – jay.sf Apr 26 '22 at 20:10
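
The difference discussed in these comments can be checked directly. This is a small sketch, assuming the unsorted vector from the first comment; `by_offset` and `by_diff` are illustrative names. It compares the value-minus-position grouping with a `diff`/`cumsum` grouping that only ever looks at neighbouring elements.

```r
ex <- c(1, 2, 11, 8, 3, 10)

# Value minus position: 8 and 10 share the offset 4, so they land in the
# same group even though they are neither adjacent nor consecutive
by_offset <- split(ex, ex - seq_along(ex))

# Difference-based ids start a new group whenever the step to the
# previous element is not exactly 1
by_diff <- split(ex, cumsum(c(TRUE, diff(ex) != 1)))

lengths(by_offset)  # four groups, one of them holding 8 and 10
lengths(by_diff)    # five groups: (1, 2), 11, 8, 3, 10
```

For input that is sorted in increasing order without duplicates, both groupings agree, which is why either works on the vector in the question.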
4

Here is one more. Not elegant but a different approach:

  1. Create a data frame from the example vector
  2. Assign the elements to groups
  3. Aggregate with `tapply`

example_df <- data.frame(example = example)

example_df$group <- cumsum(ifelse(c(1, diff(example) - 1), 1, 0))

tapply(example_df$example, example_df$group, function(x) x)
$`1`
[1] 1 2 3

$`2`
[1] 8

$`3`
[1] 10 11
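
Since the question asks for the number of runs and the length of each one, the group ids can also be summarised directly with base R's `rle()`, without building the list first. A small sketch, assuming the same `example` vector; `grp`, `runs`, `n_runs`, and `run_len` are illustrative names.

```r
example <- c(1, 2, 3, 8, 10, 11)

# One integer id per run of consecutive values
grp <- cumsum(c(TRUE, diff(example) != 1))

# rle() compresses the id vector into run values and run lengths
runs <- rle(grp)

n_runs  <- length(runs$lengths)  # how many runs there are
run_len <- runs$lengths          # length of each run, in order

# Repeat each run length once for every element of that run
b <- rep(run_len, run_len)
```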
TarJae
3

One other option is to use ave:

ave(example, cumsum(c(1, diff(example) != 1)), FUN = length)
# [1] 3 3 3 1 2 2

# or just
ave(example, example - seq_along(example), FUN = length)
Maël