From previous questions I've learned that pairlist
is a type of linked-list in R (see here).
So it is true in the sense that getting the length of a pairlist
takes MUCH longer than getting the length of a list
. Hypothetically, it should take longer to insert or remove an element in a list than in pairlist (since pairlist is a linked list, is it not?). But this is also slower.
For example:
list1 <- as.list(1:1e4)
pairlist1 <- as.pairlist(list1)
library(microbenchmark)
microbenchmark(length(list1), length(pairlist1)) # this is indeed a linked list! (it is much slower)
microbenchmark(list1[-1], pairlist1[-1]) # why is list faster?
And the output is:
> microbenchmark(length(list1), length(pairlist1)) # this is indeed a linked list! (it is much slower)
Unit: nanoseconds
expr min lq mean median uq max neval cld
length(list1) 0 0 46.85 0 1 3105 100 a
length(pairlist1) 62388 62389 70808.86 62698 68130 190888 100 b
Warning message:
In microbenchmark(length(list1), length(pairlist1)) :
Could not measure a positive execution time for 21 evaluations.
> microbenchmark(list1[-1], pairlist1[-1]) # why is list faster?
Unit: microseconds
expr min lq mean median uq max neval cld
list1[-1] 76.045 84.1155 111.0971 88.1500 96.9965 1882.807 100 a
pairlist1[-1] 238.998 257.7765 356.8178 266.6225 299.6790 3281.719 100 b
The same is true about the size of the object. For example:
library(pryr)
sizes <- sapply(0:50, function(n) object_size(as.pairlist(seq_len(n))))
plot(0:50, sizes, xlab = "Length", ylab = "Size (bytes)", type = "s")
#sizes2 <- sapply(0:50, function(n) object_size((seq_len(n))))
sizes2 <- sapply(0:50, function(n) object_size(as.list(seq_len(n))))
lines(0:50, sizes2, xlab = "Length", ylab = "Size (bytes)", type = "s", col = 2)
legend("topleft", fill = c(1,2), legend = c("pairlist", "list") )
So my questions are: 1. why is it slower to insert/remove an element? (/ is there a way to make it faster in pairlist than in a list?) 2. Are there cases in which an operation on a pairlist will outperform a list?