Rcpp R vector size limit (negative length vectors are not allowed)

Question

As per https://stackoverflow.com/a/48676389/3846213 and https://stackoverflow.com/a/5234293/3846213 R vectors are limited 2^31 - 1 items. However, I've been able to trigger the "negative length vectors are not allowed" error (which is supposed to be a sign of trying to allocate an overly large vector) at half that number via Rcpp. This all comes from my trying to debug an Rcpp-based R package (https://github.com/tpq/propr/issues/13).

library(Rcpp)

cppFunction("
IntegerVector test(int size) {
    int veclen = size * (size - 1) / 2;
    IntegerVector vec(veclen);
    return vec;
}
")

vec <- test(47000)
Error in test(47000) : negative length vectors are not allowed

47000^2 / 2 is nearly half of 2^31. I've got no such problems in pure R, that is vec <- 1:(47000*(47000-1)/2) runs fine, so there should be something special with Rcpp.

No, @duckmayr is right. I can replicate your problem. But it is not just an overflow. We have to keep looking. — Jan, Jun 27 '20 at 18:36
@Jan it actually is an overflow, but just due to order of operations; see my answer below — duckmayr, Jun 27 '20 at 18:42

duckmayr · Accepted Answer · 2020-06-29T11:51:47.753

The problem is multiplication overflow. When you do

size * (size - 1) / 2

order of operations bites you, because

size * (size - 1)

can overflow even if the overall expression doesn't. We can see this by adding a printing statement:

IntegerVector test(int size) {
    int veclen = size * (size - 1) / 2;
    Rcpp::Rcout << veclen << std::endl;
    IntegerVector vec(veclen);
    return vec;
}

vec <- test(47000)
# -1043007148

So, we can fix it by changing up how we do that operation:

IntegerVector test(int size) {
    int veclen = (size / 2) * (size - 1);
    Rcpp::Rcout << veclen << std::endl;
    IntegerVector vec(veclen);
    return vec;
}

which gives no issue

vec <- test(47000)
# 1104476500
str(vec)
# int [1:1104476500] 0 0 0 0 0 0 0 0 0 0 ...

Update: The problem with odd numbers

Eli Korvigo brings up an excellent point in the comments about integer division behavior with odd numbers. To illustrate consider calling the function with the even number 4 and the odd number 5

even <- 4
odd  <- 5

even * (even - 1) / 2
# [1] 6
odd  * (odd  - 1) / 2
# [1] 10

It should create vectors of length 6 and 10 respectively. But, what happens?

test(4)
# 6
# [1] 0 0 0 0 0 0
test(5)
# 8
# [1] 0 0 0 0 0 0 0 0

Oh no! 5 / 2 in integer division is 2, not 2.5, so this does not quite do what we want in the odd case. However, luckily we can easily address this with a simple flow control:

IntegerVector test2(int size) {
    int veclen;
    if ( size % 2 == 0 ) {
        veclen = (size / 2) * (size - 1);
    } else {
        veclen = size * ((size - 1) / 2);
    }
    Rcpp::Rcout << veclen << std::endl;
    IntegerVector vec(veclen);
    return vec;
}

We can see this handles the odd and even cases both just fine:

test2(4)
# 6
# [1] 0 0 0 0 0 0
test2(5)
# 10
# [1] 0 0 0 0 0 0 0 0 0 0

Thanks for reaffirming the doubts I had about that arithmetic expression. — Eli Korvigo, Jun 27 '20 at 19:56
Unfortunately, the reordering gets you into another sort of trouble: integer division. `veclen = size * (size - 1) / 2;` evaluates the same for odd and even integers. However, `(size / 2) * (size - 1)` behaves differently for odd and even numbers. — Eli Korvigo, Jun 28 '20 at 20:48
@EliKorvigo Excellent point! I'm sorry I overlooked this initially. I've addressed it in an update to the answer. — duckmayr, Jun 29 '20 at 11:52

Rcpp R vector size limit (negative length vectors are not allowed)

1 Answers1

Update: The problem with odd numbers