4

In the R programming language, how do I get a dynamic array (as described on Wikipedia) or equivalent data structure? I want something with the following attributes:

  1. O(1) indexing.

  2. Amortized O(1) appending.

  3. O(N) or less wasted space.

  4. Type parametric, i.e. can hold lists, user-defined objects, functions, matrices, etc., not just numbers.

  5. Appending without naming should be supported. Therefore, using an environment won't cut it.

From what I can tell, using a list doesn't work because appending to one the following way takes O(N) time, not amortized O(1):

foo <- list()
foo[[length(foo) + 1]] <- 1
Community
  • 1
  • 1
dsimcha
  • 67,514
  • 53
  • 213
  • 334
  • 1
    As far as I know, there's not something like this built into base R. Is it for a specific use case? If so, perhaps there's a more R-ish solution that would meet your needs. – Aaron left Stack Overflow Nov 22 '11 at 02:47
  • @Aaron: No specific use case. I asked because I'm trying to understand R's capabilities as a general-purpose high-level language. I often do projects that require some statistics/machine learning and some general purpose programming. I usually do these in a general purpose language with a plain old library for statistics/machine learning instead of R. Every time I've tried to use it, R has felt too domain specific. I'm trying to see if I'm missing something and R might work well enough for the general purpose stuff to use in some of my more statistics heavy projects. – dsimcha Nov 22 '11 at 04:29
  • From the perspective of a general programmer, your assessment is probably correct. R was originally built for statisticians and while it's been getting more like a general-purpose language as it matures, I suspect CS folks will still feel it's missing things. I will be curious to see what happens to it in the next few years; it's already getting a little more like a regular programming language than I'm comfortable with (I'm thinking specifically of S4 methods). – Aaron left Stack Overflow Nov 22 '11 at 14:38
  • 1
    Also, I see you have some C++ experience; if you haven't already, you should check out [Dirk Eddelbuettel](http://stackoverflow.com/users/143305/dirk-eddelbuettel)'s [Rcpp](http://dirk.eddelbuettel.com/code/rcpp.html) package, which makes integrating C++ code in R much easier. – Aaron left Stack Overflow Nov 22 '11 at 14:40

1 Answers1

3

Instead of appending to the list each time, preallocate it with a fixed length. Then when the list is full, double it, as per the description on the Wikipedia article. This should give you the performance you're after.

foo <- vector("list", 1000)

# populate the list, with N >> 1000...
for(i in seq(N))
{
    foo[[i]] <- ...

    # if the list is full, extend it
    if(i == length(foo))
        foo <- c(foo, vector("list", length(foo)))
}
Hong Ooi
  • 56,353
  • 13
  • 134
  • 187
  • 1
    Makes sense. I thought of doing it manually after I posted, but I'm just surprised there's no obvious, builtin way to do something like this. Why the heck is the O(N) behavior the default? – dsimcha Nov 22 '11 at 04:21