How can I generate two sets of random numbers of different sizes in R such that the sums of the two sets are equal? For example, I want to generate two sets of random numbers called X and Y:

X <- runif(15, min=0, max=20)
Y <- runif(10, min=0, max=20)

with a constraint that

sum(X) == sum(Y)
Alireza
  • The following might be of interest to you : https://www.mathworks.com/matlabcentral/fileexchange/9700-random-vectors-with-fixed-sum – kvantour May 15 '18 at 07:26
  • Can someone explain why this is a duplicate of the linked answer? `This question has been asked before and already has an answer. ` - this is not true! @loki – kangaroo_cliff May 17 '18 at 01:14

2 Answers

You could use a kind of rejection sampling:

a <- 15
b <- 10

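set.seed(42)  # for reproducibility
n <- 0        # number of attempts
repeat {
  n <- n + 1
  X <- runif(a, min = 0, max = 20)
  Y <- runif(b - 1, min = 0, max = 20)  # draw all but the last element of Y
  d <- sum(X) - sum(Y)                  # value the last element of Y must take
  if (d >= 0 && d <= 20) break          # accept only if d lies in [0, 20]
}
Y <- c(Y, d)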

sum(X) == sum(Y)
#[1] TRUE

n
#[1] 11

More efficient algorithms might exist. I'm also not sure if this has the right kind of randomness for your application (whatever that might be), especially regarding the last value of Y (i.e., d). Maybe ask on stats.stackexchange.com or on math.stackexchange.com.
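The loop above could also be wrapped in a function. The following is only a minimal sketch: the name `draw_equal_sums`, its arguments, and the `max_tries` cap are assumptions added for illustration and are not part of the original code. Because the sums are floating-point values, the check uses `all.equal()` (comparison with a tolerance) rather than `==`.

```r
# Hypothetical helper (not from the original answer): the same rejection
# sampling, wrapped in a function with a cap on the number of attempts.
draw_equal_sums <- function(a, b, lo = 0, hi = 20, max_tries = 1e5) {
  for (n in seq_len(max_tries)) {
    X <- runif(a, min = lo, max = hi)
    Y <- runif(b - 1, min = lo, max = hi)
    d <- sum(X) - sum(Y)          # value the last element of Y must take
    if (d >= lo && d <= hi) {     # accept only if it lies in [lo, hi]
      return(list(X = X, Y = c(Y, d), tries = n))
    }
  }
  stop("no valid draw found within max_tries attempts")
}

set.seed(42)
res <- draw_equal_sums(15, 10)
isTRUE(all.equal(sum(res$X), sum(res$Y)))  # compare sums with a tolerance
range(res$Y)                               # accepted values lie within [0, 20]
```

Returning the number of attempts in `tries` makes it easy to see how costly the rejection step is for a given pair of lengths.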

Roland

I think the following should work as well. Since X is longer than Y, we can make sure that the first 10 elements of X have a smaller sum than Y, so there doesn't seem to be any need for rejection.

a <- 15
b <- 10

set.seed(42) 
tmp1 <- runif(b, min=0, max=20)
tmp2 <- runif(b, min=0, max=20)

if (sum(tmp1) > sum(tmp2)) {
  Y <- tmp1 
  X <- tmp2
} else {
  Y <- tmp2 
  X <- tmp1
}
X <- c(X, runif(a - b, min=0, max=20))

if (sum(X) >= sum(Y)) {
  yind <- sample.int(b, 1)
  Y[yind] <- sum(X) - sum(Y[-yind])
} else {
  xind <- sample.int(a, 1)
  X[xind] <- sum(Y) - sum(X[-xind])
}

sum(X) == sum(Y)
  # [1] TRUE

Explanation of the algorithm.

  1. Generate two vectors of the shorter length (b).
  2. Assign the one with the larger sum to Y, since Y is the shorter vector.
  3. Generate the remaining elements of X.
  4. If sum(X) >= sum(Y), pick a random element of Y and overwrite it so that sum(X) == sum(Y); otherwise pick a random element of X and do the same.
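
One thing that may be worth checking for this approach: in step 4 the overwritten element always grows by the gap between the two sums, so it never drops below 0 but can exceed 20. The sketch below wraps the steps in a function and estimates how often all values stay within [0, 20]. The name `adjust_equal_sums` and the simulation setup are assumptions added for illustration.

```r
# Hypothetical wrapper around the four steps above (names are illustrative).
adjust_equal_sums <- function(a, b, lo = 0, hi = 20) {
  stopifnot(a > b)
  tmp1 <- runif(b, min = lo, max = hi)
  tmp2 <- runif(b, min = lo, max = hi)
  # Steps 1-2: the vector with the larger sum becomes Y.
  if (sum(tmp1) > sum(tmp2)) {
    Y <- tmp1
    X <- tmp2
  } else {
    Y <- tmp2
    X <- tmp1
  }
  # Step 3: fill X up to length a.
  X <- c(X, runif(a - b, min = lo, max = hi))
  # Step 4: overwrite one randomly chosen element so that the sums match.
  if (sum(X) >= sum(Y)) {
    i <- sample.int(b, 1)
    Y[i] <- sum(X) - sum(Y[-i])
  } else {
    i <- sample.int(a, 1)
    X[i] <- sum(Y) - sum(X[-i])
  }
  list(X = X, Y = Y)
}

set.seed(42)
sims <- replicate(1000, adjust_equal_sums(15, 10), simplify = FALSE)
in_range <- vapply(
  sims,
  function(s) all(c(s$X, s$Y) >= 0 & c(s$X, s$Y) <= 20),
  logical(1)
)
mean(in_range)  # share of draws whose values all stay within [0, 20]
```

If values outside [0, 20] are not acceptable for the application, the draws where `in_range` is FALSE would have to be discarded, which brings back a form of rejection.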
kangaroo_cliff