Generate random numbers in R with constraint

Question

How can I generate two sets of random numbers of different sizes in R such that the sums of the two sets are equal? For example, I want to generate two sets of random numbers called X and Y

X <- runif(15, min=0, max=20)
Y <- runif(10, min=0, max=20)

with a constraint that

sum(X) == sum(Y)

The following might be of interest to you : https://www.mathworks.com/matlabcentral/fileexchange/9700-random-vectors-with-fixed-sum — kvantour, May 15 '18 at 07:26
Can someone explain why this is a duplicate of the linked answer? `This question has been asked before and already has an answer. ` - this is not true! @loki — kangaroo_cliff, May 17 '18 at 01:14

Roland · Answer 1 · 2018-05-15T06:40:06.100

You could use a kind of rejection sampling:

a <- 15
b <- 10

set.seed(42) #for reproducibility
n <- 0 #counter
repeat {
  n <- n + 1
  X <- runif(a, min=0, max=20)
  Y <- runif(b - 1, min=0, max=20)
  d <- sum(X) - sum(Y)
  if (d >= 0 && d <= 20) break
}
Y <- c(Y, d)

sum(X) == sum(Y)
#[1] TRUE

n
#[1] 11

More efficient algorithms might exist. I'm also not sure if this has the right kind of randomness for your application (whatever that might be), especially regarding the last value of Y (i.e., d). Maybe ask on stats.stackexchange.com or on math.stackexchange.com.
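
For reuse, the same loop can be wrapped in a function. The following is only a minimal sketch under stated assumptions: the name `sample_equal_sums` and the arguments `lo`, `hi`, and `max_iter` are hypothetical and not part of the code above. It adds a cap on the number of attempts and checks the sums with `all.equal()` instead of `==`, because floating-point sums can differ in the last bits.

```r
# Hypothetical helper wrapping the rejection-sampling loop.
# `lo`, `hi` and `max_iter` are illustrative names, not from the original answer.
sample_equal_sums <- function(a, b, lo = 0, hi = 20, max_iter = 10000) {
  for (n in seq_len(max_iter)) {
    X <- runif(a, min = lo, max = hi)
    Y <- runif(b - 1, min = lo, max = hi)
    d <- sum(X) - sum(Y)        # value the last element of Y must take
    if (d >= lo && d <= hi) {   # accept only if it stays inside [lo, hi]
      return(list(X = X, Y = c(Y, d), tries = n))
    }
  }
  stop("no valid draw within max_iter attempts")
}

set.seed(42)
res <- sample_equal_sums(15, 10)

# Compare with a tolerance rather than exact equality
isTRUE(all.equal(sum(res$X), sum(res$Y)))
```

The cap on attempts matters when the two lengths differ a lot, since the acceptance condition then becomes rare and the loop could otherwise run for a very long time.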

kangaroo_cliff · Answer 2 · 2018-05-15T07:21:54.767

I think the following should work as well. Since X is the longer vector, 10 of its elements must together sum to less than the 10 elements of Y, so there doesn't seem to be any need to reject.

a <- 15
b <- 10

set.seed(42) 
tmp1 <- runif(b, min=0, max=20)
tmp2 <- runif(b, min=0, max=20)

if (sum(tmp1) > sum(tmp2)) {
  Y <- tmp1 
  X <- tmp2
} else {
  Y <- tmp2 
  X <- tmp1
}
X <- c(X, runif(a - b, min=0, max=20))

if (sum(X) >= sum(Y)) {
  yind <- sample.int(b, 1)
  Y[yind] <- sum(X) - sum(Y[-yind])
} else {
  xind <- sample.int(a, 1)
  X[xind] <- sum(Y) - sum(X[-xind])
}

sum(X) == sum(Y)
  # [1] TRUE

Explanation of the algorithm:

1. Generate two vectors of the smaller length.
2. Assign the one that has the larger sum to Y, since Y is the shorter set.
3. Generate the remainder of X.
4. If sum(X) >= sum(Y), select an element of Y at random and replace it so that sum(X) equals sum(Y); otherwise, pick an element of X and adjust it instead.
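
The steps above can be sketched as a single function. This is a hedged illustration, not a definitive implementation: the name `swap_and_adjust`, the arguments `lo` and `hi`, and the `in_range` flag are assumptions added here. One thing worth checking is that the adjusted element always grows by the gap between the two sums, so it is not guaranteed to stay at or below the upper bound. The sketch reports this rather than silently returning such a value.

```r
# Hypothetical wrapper around the four steps; assumes a > b.
swap_and_adjust <- function(a, b, lo = 0, hi = 20) {
  stopifnot(a > b)
  # Step 1: two vectors of the smaller length
  tmp1 <- runif(b, min = lo, max = hi)
  tmp2 <- runif(b, min = lo, max = hi)
  # Step 2: the vector with the larger sum becomes Y
  if (sum(tmp1) > sum(tmp2)) {
    Y <- tmp1
    X <- tmp2
  } else {
    Y <- tmp2
    X <- tmp1
  }
  # Step 3: fill X up to length a
  X <- c(X, runif(a - b, min = lo, max = hi))
  # Step 4: overwrite one randomly chosen element to equalise the sums
  if (sum(X) >= sum(Y)) {
    i <- sample.int(b, 1)
    Y[i] <- sum(X) - sum(Y[-i])
  } else {
    i <- sample.int(a, 1)
    X[i] <- sum(Y) - sum(X[-i])
  }
  # Flag whether every value is still inside [lo, hi]
  list(X = X, Y = Y, in_range = all(c(X, Y) >= lo & c(X, Y) <= hi))
}

set.seed(42)
res2 <- swap_and_adjust(15, 10)
isTRUE(all.equal(sum(res2$X), sum(res2$Y)))
res2$in_range
```

If `in_range` comes back `FALSE`, one option is to redraw, which turns this back into a form of rejection sampling.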
