23

I have a vector in R,

a = c(2,3,4,9,10,2,4,19)

Let us say I want to efficiently insert the following vectors, b and d,

b = c(2,1)
d = c(0,1)

right after the 3rd and 7th positions (the "4" entries), resulting in,

e = c(2,3,4,2,1,9,10,2,4,0,1,19)

How would I do this efficiently in R, without repeatedly calling cbind or similar?

I found a package, R.basic, but it's not on CRAN, so I would prefer a supported alternative.

pglpm
user2805568

6 Answers

17

Try this:

result <- vector("list",5)
result[c(TRUE,FALSE)] <- split(a, cumsum(seq_along(a) %in% (c(3,7)+1)))
result[c(FALSE,TRUE)] <- list(b,d)
f <- unlist(result)

identical(f, e)
#[1] TRUE
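
To make the split step (and the "5") easier to follow, here is a minimal sketch of the intermediate values, assuming the same a and insertion positions as in the question. The names pos, starts, groups and pieces are only illustrative.

```r
a <- c(2, 3, 4, 9, 10, 2, 4, 19)
pos <- c(3, 7)

# TRUE at the first index of every piece that follows an insertion point
starts <- seq_along(a) %in% (pos + 1)

# Running count of piece starts: one group id per element of a
groups <- cumsum(starts)

# One piece of a per group id; the insertions go between these pieces
pieces <- split(a, groups)
```

With two insertion points, a is cut into three pieces, so the interleaved list holds 3 + 2 = 5 elements.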

EDIT: generalization to arbitrary number of insertions is straightforward:

insert.at <- function(a, pos, ...){
    dots <- list(...)
    stopifnot(length(dots)==length(pos))
    result <- vector("list",2*length(pos)+1)
    result[c(TRUE,FALSE)] <- split(a, cumsum(seq_along(a) %in% (pos+1)))
    result[c(FALSE,TRUE)] <- dots
    unlist(result)
}


> insert.at(a, c(3,7), b, d)
 [1]  2  3  4  2  1  9 10  2  4  0  1 19

> insert.at(1:10, c(4,7,9), 11, 12, 13)
 [1]  1  2  3  4 11  5  6  7 12  8  9 13 10

> insert.at(1:10, c(4,7,9), 11, 12)
Error: length(dots) == length(pos) is not TRUE

Note the bonus error checking if the number of positions and insertions do not match.

Ferdinand.kraft
  • Could you explain the logic of the 5? I am trying to generalize it to any combination, but I am having a hard time and keep getting many "multiple of" type errors. Thanks. – user2805568 Sep 23 '13 at 03:44
  • @user2805568 It's the number of pieces put together: 5 = (number of insertions)*2 + 1. – Frank Sep 23 '13 at 03:54
  • @user2805568 I'm curious why you accepted an answer that didn't agree with the desired result? – IRTFM Sep 23 '13 at 04:13
12

You can use the following function,

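ins(a, list(b, d), pos=c(3, 7))
# [1]  2  3  4  2  1  9 10  2  4  0  1 19

where:

ins <- function(a, to.insert=list(), pos=c()) {
  # Handles exactly two insertions: to.insert[[k]] goes right after a[pos[k]]
  c(a[seq(pos[1])], 
    to.insert[[1]], 
    a[seq(pos[1]+1, pos[2])], 
    to.insert[[2]], 
    a[seq(pos[2]+1, length(a))]   # start at pos[2]+1 so a[pos[2]] is not repeated
    )
}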
Ricardo Saporta
6

Here's another function, using Ricardo's syntax, Ferdinand's split and @Arun's interleaving trick from another question:

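ins2 <- function(a,bs,pos){
    # split a into pieces that start right after each insertion position
    # (seq_along is safe even when a has length 1, unlike seq)
    as <- split(a,cumsum(seq_along(a)%in%(pos+1)))
    # interleave: piece 1, insertion 1, piece 2, insertion 2, ...
    idx <- order(c(seq_along(as),seq_along(bs)))
    unlist(c(as,bs)[idx])
}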

The advantage is that this should extend to more insertions. However, it may produce weird output when passed invalid arguments, e.g., with any(pos > length(a)) or length(bs)!=length(pos).

You can change the last line to unname(unlist(... if you don't want a's items named.
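
As a sketch of how the invalid-argument cases could be rejected up front, the variant below checks its inputs and then places the insertions by sorting on fractional keys. Note that this swaps the split step for a different technique, and that the name ins_checked and the choice of checks are assumptions for illustration, not part of the answer above.

```r
# Hypothetical variant: validates arguments, then orders by position keys.
ins_checked <- function(a, bs, pos) {
  stopifnot(is.list(bs),
            length(bs) == length(pos),
            all(pos >= 0),
            all(pos <= length(a)))
  vals <- c(a, unlist(bs))
  # Elements of a keep their index as key; every element of bs[[k]]
  # gets pos[k] + 0.5, so it sorts right after a[pos[k]].
  keys <- c(seq_along(a), rep(pos + 0.5, lengths(bs)))
  # order() leaves ties in their original order, so each inserted
  # vector keeps its internal order.
  vals[order(keys)]
}

a <- c(2, 3, 4, 9, 10, 2, 4, 19)
res <- ins_checked(a, list(c(2, 1), c(0, 1)), pos = c(3, 7))
```

Under these assumptions, pos = 0 inserts before the first element and pos = length(a) appends at the end.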

Community
Frank
5

The straightforward approach:

b.pos <- 3
d.pos <- 7
c(a[1:b.pos],b,a[(b.pos+1):d.pos],d,a[(d.pos+1):length(a)])
[1]  2  3  4  2  1  9 10  2  4  0  1 19

Note the importance of the parentheses around the bounds of the : operator, since : binds more tightly than +.
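
A quick check of what happens without them, using the same b.pos and d.pos (the names wrong and right are only illustrative):

```r
b.pos <- 3
d.pos <- 7

# Without parentheses this is parsed as b.pos + (1:d.pos),
# i.e. the sequence 1..7 shifted up by 3
wrong <- b.pos+1:d.pos

# With parentheses this is the intended index range, from b.pos+1 up to d.pos
right <- (b.pos+1):d.pos
```

Here wrong has seven elements (4 through 10), while right is the four indices 4 through 7.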

Itamar
  • Turn this into a function with: `insert_vec <- function(old, new, loc) c(old[1:loc], new, old[-c(1:loc)])` – s_baldur Sep 20 '18 at 16:07
3

After using Ferdinand's function, I tried to write my own and, surprisingly, it is far more efficient.
Here's mine:

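insertElems = function(vect, pos, elems) {
  # Inserts elems[k] before original position pos[k]; pos must be sorted ascending
  l = length(vect)   # original length, before any insertion
  j = 0              # number of elements inserted so far
  for (i in seq_along(pos)) {
    if (pos[i] == 1)
      vect = c(elems[j+1], vect)
    else if (pos[i] == l+1)   # append after the last original element
      vect = c(vect, elems[j+1])
    else
      vect = c(vect[1:(pos[i]-1+j)], elems[j+1], vect[(pos[i]+j):(l+j)])
    j = j+1
  }
  return(vect)
}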

tmp = 1:5
insertElems(tmp, c(2,4,5), c(NA,NA,NA))
# [1]  1 NA  2  3 NA  4 NA  5

insert.at(tmp, c(2,4,5), c(NA,NA,NA))
# [1]  1 NA  2  3 NA  4 NA  5

And here is the benchmark result:

> microbenchmark(insertElems(tmp, c(2,4,5), c(NA,NA,NA)), insert.at(tmp, c(2,4,5), c(NA,NA,NA)), times = 10000)
Unit: microseconds
                                        expr    min     lq     mean median     uq      max neval
 insertElems(tmp, c(2, 4, 5), c(NA, NA, NA))  9.660 11.472 13.44247  12.68 13.585 1630.421 10000
   insert.at(tmp, c(2, 4, 5), c(NA, NA, NA)) 58.866 62.791 70.36281  64.30 67.923 2475.366 10000

My code also handles some cases better, for example an insertion at position 1:

> insert.at(tmp, c(1,4,5), c(NA,NA,NA))
# [1]  1  2  3 NA  4 NA  5 NA  1  2  3
# Warning message:
# In result[c(TRUE, FALSE)] <- split(a, cumsum(seq_along(a) %in% (pos))) :
#   number of items to replace is not a multiple of replacement length

> insertElems(tmp, c(1,4,5), c(NA,NA,NA))
# [1] NA  1  2  3 NA  4 NA  5
2

Here's an alternative that uses append. It's fine for small vectors, but I can't imagine it being efficient for large vectors, since a new vector is created on each iteration of the loop. The trick is to process the insertions in decreasing order of position, so that each call to append leaves the earlier positions of the original vector unchanged.

a = c(2,3,4,9,10,2,4,19)
b = c(2,1)
d = c(0,1)

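pos <- c(3, 7)
z <- setNames(list(b, d), pos)
# Sort by numeric position; the names are character, so sorting them
# as strings would rank "3" above "10"
z <- z[order(as.numeric(names(z)), decreasing=TRUE)]

for (i in seq_along(z)) {
  a <- append(a, z[[i]], after = as.numeric(names(z)[[i]]))
}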

a
#  [1]  2  3  4  2  1  9 10  2  4  0  1 19
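
To illustrate why the order matters, here is a small sketch comparing both orders on the vectors from the question. The names ascending and descending are only illustrative.

```r
a <- c(2, 3, 4, 9, 10, 2, 4, 19)
b <- c(2, 1)
d <- c(0, 1)

# Ascending order: inserting b first shifts everything after position 3,
# so "after = 7" no longer points at the original 7th element.
ascending <- append(append(a, b, after = 3), d, after = 7)

# Descending order: inserting d first leaves positions 1..7 untouched,
# so "after = 3" still refers to the original 3rd element.
descending <- append(append(a, d, after = 7), b, after = 3)
```

Only descending matches the desired e. In ascending, d lands two positions too early, because b has already lengthened the vector.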
A5C1D2H2I1M1N2O1R2T1