
I am looking to randomly generate a vector of numbers in R, with a specific sum but which also has a restriction for some specific members of the generated vector, e.g. that the 4th number (say, in a vector of 5) cannot exceed 50.

I am doing this within a for loop with millions of iterations in order to simulate election vote changes, where I am adding votes to one party and taking them away from other parties with equal probability. However, my issue is that in many iterations, votes turn out to be negative, which is illogical. I have figured out how to do the "sums up to X" part from other answers here, and I have made a workaround for the second restriction as follows:

    library(data.table)

    parties <- data.table(party = c("red", "green", "blue", "brown", "yellow"), votes = c(657, 359, 250, 80, 7))
    votes_to_reallocate <- 350
    immune_party <- "green"

    parties_simulation <- copy(parties)
    
    parties_simulation[party != immune_party, 
                         votes := votes - as.vector(rmultinom(1, size=votes_to_reallocate, prob=rep(1, nrow(parties)-1)))
                         ]
    # Most likely there are negative votes for the last party, perhaps even the last two.
    # The while loop is supposed to correct this.
    
    while (any(parties_simulation[, votes]<0)) {
        negative_parties <- parties_simulation[votes < 0, party]
        for (i in seq_along(negative_parties)) {
            votes_to_correct <- parties_simulation[party == negative_parties[i], abs(votes)]
            parties_to_change <- parties_simulation[party != immune_party & !party %in% negative_parties, .N]
            parties_simulation[party != immune_party & !party %in% negative_parties, 
                               votes := votes - as.vector(rmultinom(1, size=votes_to_correct, prob=rep(1, parties_to_change)))
            ]
            parties_simulation[party == negative_parties[i], votes := votes + votes_to_correct]
        }
    }

However, this seems to be a huge bottleneck as each simulation has to be corrected by the while loop. I am curious as to whether there is a solution for this that would generate the random numbers with the restriction already imposed (for instance, generate 4 random numbers, adding up to 350, and with the fourth number not exceeding 7). If not, perhaps there is a more efficient way to solve this?
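To see why the correction loop fires on nearly every iteration: with equal probabilities, each non-immune party's loss is marginally Binomial(350, 1/4), with mean 87.5, which is above brown's 80 votes and far above yellow's 7. A minimal sketch of that arithmetic in base R follows; the variable names are illustrative, and the vote counts are copied from the example above.

```r
# Votes of the four non-immune parties (green is immune in the example)
votes <- c(red = 657, blue = 250, brown = 80, yellow = 7)
votes_to_reallocate <- 350
p <- 1 / length(votes)            # equal probability for each party

# Expected number of votes removed from each party
expected_loss <- votes_to_reallocate * p

# Marginally, each party's loss is Binomial(votes_to_reallocate, p),
# so the chance of ending up negative is P(loss > votes)
p_negative <- 1 - pbinom(votes, size = votes_to_reallocate, prob = p)
round(p_negative, 3)
```

Under these assumptions the smallest party ends up negative in practically every draw, so a fix has to change how the losses are generated rather than how they are repaired.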

Lauris
  • Does this answer your question? [Is there an efficient way to generate N random integers in a range that have a given sum or average?](https://stackoverflow.com/questions/61393463/is-there-an-efficient-way-to-generate-n-random-integers-in-a-range-that-have-a-g) – Peter O. Jan 30 '21 at 20:25
  • In particular, see: https://stackoverflow.com/a/61525097/815724 – Peter O. Jan 30 '21 at 20:25

1 Answer


Maybe I'm missing something, but would this work:

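const_rng <- function(n, const, total){
  # draw each constrained value uniformly from 1..its upper bound
  consts <- sapply(const, function(x)sample(1:x, 1))
  # split what is left of the total over the unconstrained positions, equal probability each
  rest <- rmultinom(1, total - sum(consts), prob = rep(1/(n-length(consts)), (n-length(consts))))
  res <- rep(NA, n)
  # the names of const are the positions to be constrained
  res[as.numeric(names(const))] <- consts
  res[-as.numeric(names(const))] <- rest
  return(res)
}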

out <- const_rng(5, const=c("4" = 7), 350)
out
# [1] 90 76 88  5 91
sum(out)
# [1] 350

First, it draws the constrained values from the integers 1:const. Then it draws the remainder (total minus the sum of the constrained draws) from a multinomial distribution that gives each of the other positions equal probability. The const argument is a named vector in which each name is the position to be constrained and each value is the upper bound of the draw. For example, const = c("4" = 7) constrains the fourth value to be between 1 and 7 (as written, sample(1:x, 1) never returns 0).
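If a constrained position should also be allowed to be zero, one possible variant is sketched below. The name capped_rng and its argument caps are hypothetical (they are not part of the code above). It assumes the names of caps are valid positions, that at least one position is unconstrained, and that total is at least the sum of the caps. The two-step idea is the same: uniform draws for the capped positions, then a multinomial split of the remainder.

```r
# Hypothetical variant of const_rng(): capped positions may be 0.
capped_rng <- function(n, caps, total) {
  idx <- as.integer(names(caps))            # positions that carry a cap
  stopifnot(length(idx) < n, total >= sum(caps))
  # sample.int(x + 1, 1) - 1 is uniform on 0..x; it also avoids the
  # surprising behaviour of sample() when given a single number
  capped <- vapply(caps, function(x) sample.int(x + 1L, 1L) - 1L, integer(1))
  # distribute the remainder over the uncapped positions with equal probability
  free <- rmultinom(1, total - sum(capped), prob = rep(1, n - length(idx)))
  res <- integer(n)
  res[idx] <- capped
  res[-idx] <- as.vector(free)
  res
}

out <- capped_rng(5, caps = c("4" = 7), total = 350)
sum(out)   # 350 by construction
out[4]     # an integer between 0 and 7
```

Like the original function, this does not sample uniformly over all valid vectors: the capped entry is uniform on its range, and the remaining entries are multinomial given that draw.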

DaveArmstrong