How to subset values with a specific difference in R?

Question

I'm trying to split a vector into groups of values that differ from each other by a specific amount. Given the following vector, I want to separate it into several vectors whose elements differ by 1. For example, the input is:

a <- c(1, 1.2, 1.6, 2, 2.2, 2.6, 3, 3.2, 3.6, 4, 4.2, 4.6, 5, 5.2, 5.6, 6, 7, 8, 9, 10)

The expected result is:

 b <- c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10)
 c <- c(1.2, 2.2, 3.2, 4.2, 5.2)
 d <- c(1.6, 2.6, 3.6, 4.6, 5.6)

I tried writing a for loop, but I don't think it's efficient, and there is probably a better way to solve this problem.

You could try `split(a, round(a %% 1, 1))` but I don't think it will be very reliable. Generally, computers are not good at precisely matching numbers. See http://stackoverflow.com/questions/9508518/why-are-these-numbers-not-equal — Frank, Sep 19 '16 at 02:45
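To make the comment above concrete, here is a minimal sketch of the rounding approach. It assumes the values carry at most one decimal place, which is why rounding the fractional part to one digit is enough; with other data that assumption would need revisiting.

```r
a <- c(1, 1.2, 1.6, 2, 2.2, 2.6, 3, 3.2, 3.6, 4, 4.2, 4.6,
       5, 5.2, 5.6, 6, 7, 8, 9, 10)

# The fractional parts are only approximately 0, 0.2 and 0.6, so comparing
# them with == is fragile; printing more digits shows the noise.
print(a %% 1, digits = 17)

# Rounding to one decimal place (an assumption about the data's precision)
# collapses the noisy fractional parts into clean group keys.
groups <- split(a, round(a %% 1, 1))
print(groups)
```

The list is keyed by the rounded fractional part, so `groups[["0.2"]]` holds the values ending in .2.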

Psidom · Accepted Answer · 2016-09-19T03:05:13.247

An alternative, recursive solution: at each step, extract the values whose offset from the current minimum is a whole number, then pass the remaining values to the next recursive call:

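my_split <- function(vec, tol) {
    if (length(vec) == 0) {
        list()
    } else {
        # fractional part of each value's offset from the current minimum
        mod1 <- (vec - min(vec)) %% 1

        # check both abs(mod1) and abs(mod1 - 1) against the tolerance: for
        # example, (4.6 - 3.6) %% 1 comes out just below 1 rather than 0,
        # because 4.6 - 3.6 is slightly less than 1 in floating point
        splits <- split(vec, abs(mod1) < tol | abs(mod1 - 1) < tol)

        # keep the matching values and recurse on the rest
        c(list(splits$`TRUE`), my_split(splits$`FALSE`, tol))
    }
}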

my_split(a, 0.001)     # use a tolerance because most decimal fractions
                       # cannot be represented exactly in floating point

# [[1]]
# [1]  1  2  3  4  5  6  7  8  9 10

# [[2]]
# [1] 1.2 2.2 3.2 4.2 5.2

# [[3]]
# [1] 1.6 2.6 3.6 4.6 5.6
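The reason the function tests both `abs(mod1)` and `abs(mod1 - 1)` can be seen on a two-element example. This is only an illustrative sketch; the variable names `near_zero` and `near_zero_or_one` are made up for the demonstration.

```r
x   <- c(3.6, 4.6)
tol <- 0.001

# offset from the minimum, reduced modulo 1
mod1 <- (x - min(x)) %% 1
print(mod1, digits = 17)

# testing only for "close to 0" misses 4.6, whose offset lands just below 1
near_zero <- abs(mod1) < tol

# also accepting "close to 1" puts 4.6 in the same group as 3.6
near_zero_or_one <- abs(mod1) < tol | abs(mod1 - 1) < tol

print(near_zero)
print(near_zero_or_one)
```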

score 1 · Answer 2 · answered Sep 19 '16 at 02:46

Here you go:

a <- c(1, 1.2, 1.6, 2, 2.2, 2.6, 3, 3.2, 3.6, 4, 4.2, 4.6, 5, 5.2, 5.6, 6, 7, 8, 9, 10)

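# smallest and largest values (assumes a is sorted in increasing order)
a_min <- a[1]
a_max <- a[length(a)]

# starting values: every element less than the minimum plus 1
h <- a[a < (a_min + 1)]

# for each start, step by 1 up to a_max and keep only the values found in a
d <- lapply(h, function(x) {
    s <- seq(x, a_max)
    s[s %in% a]
})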

What does this do?
h stores all the elements from the first one up to (but not including) the first one plus 1.
For each of these, create a sequence from it to the last element of a, in steps of 1, and keep only the values that are in a.

The result is a list that contains each sequence:

> d
[[1]]
 [1]  1  2  3  4  5  6  7  8  9 10

[[2]]
[1] 1.2 2.2 3.2 4.2 5.2

[[3]]
[1] 1.6 2.6 3.6 4.6 5.6
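Note that `%in%` compares doubles exactly. If that turns out to be fragile for other data, one possible variation matches within a tolerance instead. This is a sketch and not part of the original answer; the tolerance `tol`, the helper `near_any` and the result name `d_tol` are all assumptions made for illustration.

```r
a <- c(1, 1.2, 1.6, 2, 2.2, 2.6, 3, 3.2, 3.6, 4, 4.2, 4.6,
       5, 5.2, 5.6, 6, 7, 8, 9, 10)

tol <- 1e-8  # assumed tolerance; adjust to the precision of your data

# hypothetical helper: for each element of x, is it within tol of
# at least one element of y?
near_any <- function(x, y, tol) {
    vapply(x, function(v) any(abs(y - v) < tol), logical(1))
}

# starting values, as in the answer above (a is assumed to be sorted)
h <- a[a < a[1] + 1]

# for each start, build the step-1 sequence and keep the elements of a
# that lie within tol of some value in that sequence
d_tol <- lapply(h, function(x) {
    s <- seq(x, a[length(a)])
    a[near_any(a, s, tol)]
})

print(d_tol)
```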
