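The problem is integer overflow in the multiplication. When you compute
size * (size - 1) / 2
the order of operations bites you, because the intermediate product
size * (size - 1)
can overflow a 32-bit int even when the final result fits. For size = 47000 the product is 2,208,953,000, which is larger than the int maximum of 2,147,483,647.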
We can see this by adding a print statement:
IntegerVector test(int size) {
    int veclen = size * (size - 1) / 2;
    Rcpp::Rcout << veclen << std::endl;
    IntegerVector vec(veclen);
    return vec;
}
vec <- test(47000)
# -1043007148
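To see where the negative number printed above comes from, here is a small standalone sketch (plain C++, no Rcpp) that redoes the arithmetic in 64 bits. The helper names `wide_product` and `wrapped_product` are made up for illustration, and the sketch assumes `int` is 32 bits wide. Signed overflow is formally undefined behaviour in C++, so `wrapped_product` only models the two's-complement wraparound that is typically observed; it is not a guarantee of what a compiler will do.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Hypothetical helper: the intermediate product size * (size - 1),
// computed in 64 bits so it cannot overflow for a non-negative 32-bit size.
std::int64_t wide_product(int size) {
    assert(size >= 0);
    return static_cast<std::int64_t>(size) * (size - 1);
}

// Hypothetical helper: the value left over when that product is reduced to
// 32 bits and reinterpreted as a signed two's-complement number.
std::int64_t wrapped_product(int size) {
    // Conversion to an unsigned type is well-defined and keeps the low 32 bits.
    std::uint32_t low = static_cast<std::uint32_t>(wide_product(size));
    std::int64_t result = low;
    if (result > std::numeric_limits<std::int32_t>::max()) {
        result -= (std::int64_t{1} << 32);  // shift into the negative range
    }
    return result;
}
```

For size = 47000, `wide_product` gives 2208953000 and `wrapped_product` gives -2086014296; halving the latter yields -1043007148, the value that was printed.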
So we can fix it by dividing before multiplying, which keeps the intermediate value small:
IntegerVector test(int size) {
    int veclen = (size / 2) * (size - 1);
    Rcpp::Rcout << veclen << std::endl;
    IntegerVector vec(veclen);
    return vec;
}
which runs without a problem:
vec <- test(47000)
# 1104476500
str(vec)
# int [1:1104476500] 0 0 0 0 0 0 0 0 0 0 ...
Update: The problem with odd numbers
Eli Korvigo brings up an excellent point in the comments about how integer division behaves with odd numbers. To illustrate, consider calling the function with the even number 4 and the odd number 5:
even <- 4
odd <- 5
even * (even - 1) / 2
# [1] 6
odd * (odd - 1) / 2
# [1] 10
So the function should create vectors of length 6 and 10, respectively.
But what actually happens?
test(4)
# 6
# [1] 0 0 0 0 0 0
test(5)
# 8
# [1] 0 0 0 0 0 0 0 0
Oh no!
5 / 2
in integer division is 2, not 2.5, so for an odd size the division truncates before the multiplication: test(5) computes 2 * 4 = 8 instead of 10.
Luckily, we can easily address this with some simple control flow:
IntegerVector test2(int size) {
    int veclen;
    if ( size % 2 == 0 ) {
        // even size: size / 2 is exact
        veclen = (size / 2) * (size - 1);
    } else {
        // odd size: size - 1 is even, so (size - 1) / 2 is exact
        veclen = size * ((size - 1) / 2);
    }
    Rcpp::Rcout << veclen << std::endl;
    IntegerVector vec(veclen);
    return vec;
}
We can see this handles both the even and the odd case just fine:
test2(4)
# 6
# [1] 0 0 0 0 0 0
test2(5)
# 10
# [1] 0 0 0 0 0 0 0 0 0 0
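As a cross-check on the branching logic in test2, the sketch below (plain C++, no Rcpp) puts it next to an alternative that is not part of the answer above: doing the multiplication in a 64-bit integer so no branch is needed. The function names `pair_count_branch` and `pair_count_wide` are hypothetical, and the 64-bit variant is only one possible approach; its result would still have to fit whatever length type the vector constructor accepts.

```cpp
#include <cassert>
#include <cstdint>

// Mirrors the logic of test2: always divide the factor that is even,
// so the division is exact and the intermediate value stays small.
int pair_count_branch(int size) {
    assert(size >= 0);
    if (size % 2 == 0) {
        return (size / 2) * (size - 1);
    }
    return size * ((size - 1) / 2);
}

// Alternative sketch: widen to 64 bits first, then use the original formula.
// size * (size - 1) is always even, so the final division is exact.
std::int64_t pair_count_wide(int size) {
    assert(size >= 0);
    return static_cast<std::int64_t>(size) * (size - 1) / 2;
}
```

Both functions agree on the values used in this answer: 6 for size 4, 10 for size 5, and 1104476500 for size 47000.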