what is time complexity of append (vector / list)?

Question

https://www.programmingr.com/fast-r-append-list/

I don't find the time complexity of the appending operation on a vector or a list.

Is this documented anywhere?

Suppose that I have n elements, that I have to append one by one. Is there a data structure that has a time complexity of n to append the n elements?

If you know you *"have n elements,that you have to append one by one"*, why append one by one? Create a vector of `n` elements of the same type beforehand. — Rui Barradas, Aug 05 '21 at 14:59
See context here. https://stackoverflow.com/questions/68669493/read-a-tsv-file-group-by-group-efficiently — user1424739, Aug 05 '21 at 15:31

score 2 · Answer 1 · edited Feb 01 '23 at 19:56

Here is a function to test the running times of three different ways of assigning a value to a vector.
The first two extend the vector by appending the new element to its end. The last one allocates space by creating a vector with n elements and then populates the vector with the new elements.
Times are measured with package microbenchmark and medians are plotted with ggplot2 graphics.

Atomic vector timings

fun <- function(N = 1:10, times = 100){
  if(!require(microbenchmark)){
    stop("Package 'microbenchmark' is not installed")
  }
  out <- lapply(2^N, function(n){
    mb <- microbenchmark(
      c = {
        x <- NULL
        for(i in seq_len(n)) x <- c(x, i)
      },
      append = {
        x <- NULL
        for(i in seq_len(n)) x <- append(x, i)
      },
      allocate = {
        x <- numeric(n)
        for(i in seq_len(n)) x[i] <- i
      },
      times = times
    )
    mb$n <- n
    aggregate(time ~ expr + n, mb, median)
  })
  out <- do.call(rbind, out)
  row.names(out) <- NULL
  out
}

library(ggplot2)

times <- fun(1:12)

ggplot(times, aes(n, time, color = expr)) +
  geom_line() +
  geom_point() +
  theme_bw()

`list` timings

This function is nearly identical to the function above, but creating and extending objects of class "list".

Surprisingly, now to extend the list by assigning new elements after the current last is faster than append.

funList <- function(N = 1:10, times = 100){
  if(!require(microbenchmark)){
    stop("Package 'microbenchmark' is not installed")
  }
  out <- lapply(2^N, function(n){
    mb <- microbenchmark(
      List = {
        x <- list()
        for(i in seq_len(n)) x[[i]] <- i
      },
      append = {
        x <- list()
        for(i in seq_len(n)) x <- append(x, i)
      },
      allocate = {
        x <- vector("list", n)
        for(i in seq_len(n)) x[[i]] <- i
      },
      times = times
    )
    mb$n <- n
    aggregate(time ~ expr + n, mb, median)
  })
  out <- do.call(rbind, out)
  row.names(out) <- NULL
  out
}

library(ggplot2)

times <- funList(1:12)

ggplot(times, aes(n, time, color = expr)) +
  geom_line() +
  geom_point() +
  theme_bw()

@user1424739 See also [this SO post](https://stackoverflow.com/questions/2436688/append-an-object-to-a-list-in-r-in-amortized-constant-time-o1?rq=1). — Rui Barradas, Aug 05 '21 at 16:55
Do I understand that post correctly? `a <- list(a, list(i))` is the fastest for list? But it does not make sense as it involves rewriting the whole list. Intuitively, assigning by index should be the fastest. Why rewriting the original list could be faster? — user1424739, Aug 05 '21 at 17:00
@user1424739 The post is right, that way is the fastest, it's half the time of `append` but the result is not the same, it is a nested lists list, see [this comment](https://stackoverflow.com/questions/2436688/append-an-object-to-a-list-in-r-in-amortized-constant-time-o1?rq=1#comment57695097_29870770) to Dirk's answer.. — Rui Barradas, Aug 05 '21 at 17:03
@user1424739 To get an `identical()` result, it would have to be `a <- c(a, list(i))` and performance degrades, as seen in my answer. — Rui Barradas, Aug 05 '21 at 17:09

what is time complexity of append (vector / list)?

1 Answers1

Atomic vector timings

`list` timings

Linked

what is time complexity of append (vector / list)?

1 Answers1

Atomic vector timings

list timings

Linked

`list` timings