What exactly happens when looping "result_vector = c(result_vector, new_value)"

Question

I'm new to programming and using R these days, and to concatenate new values to a result vector, I have been using

values = sample(letters, 1e4, replace=TRUE)
result_vector = NULL
for (i in 1:length(values)) result_vector = c(result_vector, values[i])

and recently I was dismayed when I timed the above,

result_vector = NULL
system.time( for (i in 1:length(values)) result_vector = c(result_vector, values[i]))

which gave me

   user  system elapsed 
  0.288   0.016   0.333

against an alternative,

result_vector = character(length(values))
system.time( for (i in 1:length(values)) result_vector[i] = values[i])

which gave me

   user  system elapsed 
  0.004   0.000   0.011

To learn from this enlightenment, I would like to ask what exactly happens when

result_vector = c(result_vector, new_value)

Is it reallocating new space for result_vector on every iteration, and is that what takes so much time?

This question has some good insight for you: https://stackoverflow.com/questions/7142767/why-are-loops-slow-in-r — Chase, Mar 30 '19 at 03:36

score 1 · Accepted Answer · answered Mar 30 '19 at 03:44

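On each iteration, the function c() allocates a new vector and copies its arguments into it, one after another. In your example it allocates a vector one element longer than result_vector, copies the existing contents of result_vector followed by new_value into it, and that new vector is then assigned back to result_vector.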
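To get a feel for the cost, here is a rough back-of-the-envelope sketch. It only counts element copies and ignores allocation and garbage-collection overhead; the variable names are made up for illustration.

```r
# Count the element copies implied by growing a vector one value at a time.
n <- 1e4

# Before the i-th append the vector already holds i - 1 elements,
# and c() copies all of them into the newly allocated result.
copies_when_growing <- sum(seq_len(n) - 1)

# With preallocation each element is written exactly once.
writes_when_preallocated <- n

# Ratio of the two counts; equals (n - 1) / 2, i.e. 4999.5 for n = 1e4.
copies_when_growing / writes_when_preallocated
```

The copy count grows roughly with the square of the length, which is consistent with the large gap between the two timings in the question.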

In general, try to avoid this pattern, because it is not good practice. There are, however, some cases where it is the only option.
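As a minimal sketch of what avoiding it can look like when the final length is known up front, the following compares the growing loop with a preallocated loop and with vapply(). The names grown, preallocated and applied are only for illustration, and the identity function stands in for whatever per-element work the real loop does.

```r
values <- sample(letters, 1e4, replace = TRUE)

# Pattern from the question, kept for comparison: a new vector on every pass.
grown <- NULL
for (i in seq_along(values)) grown <- c(grown, values[i])

# Preallocate the full result once, then fill it in place.
preallocated <- character(length(values))
for (i in seq_along(values)) preallocated[i] <- values[i]

# Let vapply() allocate the result; character(1) declares that each
# call returns a single string.
applied <- vapply(values, function(v) v, character(1), USE.NAMES = FALSE)

# All three approaches produce the same vector.
stopifnot(identical(grown, preallocated), identical(grown, applied))
```

seq_along(values) is used instead of 1:length(values) because the latter yields c(1, 0) when values is empty, so the loop body would still run.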
