Generate random numbers with a given distribution

Question

Check out this question:

Swift probability of random number being selected?

The top answer suggests to use a switch statement, which does the job. However, if I have a very large number of cases to consider, the code looks very inelegant; I have a giant switch statement with very similar code in each case repeated over and over again.

Is there a nicer, cleaner way to pick a random number with a certain probability when you have a large number of probabilities to consider? (like ~30)

If you have similar cases you can write all similar cases in a single case by providing comma operator. — Amit89, May 18 '15 at 17:36
@Amit89 They are similar, but not identical. All cases only return a value, but return different values. — asdf, May 18 '15 at 17:43

Martin R · Accepted Answer · 2019-10-24T07:34:44.467

36

This is a Swift implementation strongly influenced by the various answers to Generate random numbers with a given (numerical) distribution.

For Swift 4.2/Xcode 10 and later (explanations inline):

func randomNumber(probabilities: [Double]) -> Int {

    // Sum of all probabilities (so that we don't have to require that the sum is 1.0):
    let sum = probabilities.reduce(0, +)
    // Random number in the range 0.0 <= rnd < sum :
    let rnd = Double.random(in: 0.0 ..< sum)
    // Find the first interval of accumulated probabilities into which `rnd` falls:
    var accum = 0.0
    for (i, p) in probabilities.enumerated() {
        accum += p
        if rnd < accum {
            return i
        }
    }
    // This point might be reached due to floating point inaccuracies:
    return (probabilities.count - 1)
}

Examples:

let x = randomNumber(probabilities: [0.2, 0.3, 0.5])

returns 0 with probability 0.2, 1 with probability 0.3, and 2 with probability 0.5.

let x = randomNumber(probabilities: [1.0, 2.0])

return 0 with probability 1/3 and 1 with probability 2/3.

For Swift 3/Xcode 8:

func randomNumber(probabilities: [Double]) -> Int {

    // Sum of all probabilities (so that we don't have to require that the sum is 1.0):
    let sum = probabilities.reduce(0, +)
    // Random number in the range 0.0 <= rnd < sum :
    let rnd = sum * Double(arc4random_uniform(UInt32.max)) / Double(UInt32.max)
    // Find the first interval of accumulated probabilities into which `rnd` falls:
    var accum = 0.0
    for (i, p) in probabilities.enumerated() {
        accum += p
        if rnd < accum {
            return i
        }
    }
    // This point might be reached due to floating point inaccuracies:
    return (probabilities.count - 1)
}

For Swift 2/Xcode 7:

func randomNumber(probabilities probabilities: [Double]) -> Int {

    // Sum of all probabilities (so that we don't have to require that the sum is 1.0):
    let sum = probabilities.reduce(0, combine: +)
    // Random number in the range 0.0 <= rnd < sum :
    let rnd = sum * Double(arc4random_uniform(UInt32.max)) / Double(UInt32.max)
    // Find the first interval of accumulated probabilities into which `rnd` falls:
    var accum = 0.0
    for (i, p) in probabilities.enumerate() {
        accum += p
        if rnd < accum {
            return i
        }
    }
    // This point might be reached due to floating point inaccuracies:
    return (probabilities.count - 1)
}

edited Oct 24 '19 at 07:34

answered May 18 '15 at 17:58

Martin R

529,903
94
1,240
1,382

Thank you. I'm not sure what you mean by "floating point inaccuracies" - could you please expand on that ? – asdf May 18 '15 at 18:03
@asdf: Since `rnd` is less than `sum`, the condition `if rnd < accum` should be true for some `i`. But binary floating point numbers cannot represent all values exactly, therefore the loop might terminate. In that case the last index is returned. – Martin R May 18 '15 at 18:14
@fqdn: Thanks for the correction, that should be better! – However, your statement *"the original `arc4random_uniform(UInt32.max - 1)` could return -1"* is not correct, it returns a number between 0 and UInt32.max - 2 :) – Martin R Jul 07 '15 at 17:12
@MartinR heh right, good call :) was rushing to get that edit in, and it's a little laborious to go back and proof-read the summary in a single-line text field... I may have caught it, or maybe not, but I'm glad you did, thx for the ping-back! – fqdn Jul 07 '15 at 17:17
@MartinR This seems like a nice implementation although I'm having some trouble. The function returns an Int so how do you read the actual probabilities like you wrote in your example? – matiastofteby Sep 11 '17 at 11:13
1

@matiastofteby: Your question is unclear to me. `randomNumber(probabilities: [0.2, 0.3, 0.5])` returns an integer (0, 1, or 2). The return value is 0 with probability 0.2, 1 with probability 0.3, and 2 with probability 0.5. – Martin R Sep 11 '17 at 11:23
Ah okay I think I get it now. So if you want an array of values, you have to call randomNumber() multiple times. Thanks for clarifying :=) – matiastofteby Sep 11 '17 at 11:29
@MartinR nice clean code and explanation. One potential issue I see with this solution is when you have multiple probabilities of the same value (eg. [0.5, 0.5, 0.5]) it will always return the same item from the list. Not sure if there's an elegant way to solve this apart from maybe shuffle the list before feeding it into this function? – thiezn Oct 23 '19 at 18:58
1

@thiezn: Are you sure? `for _ in 1...10 { print(randomNumber(probabilities: [ 0.5, 0.5, 0.5])) }` does not print the same number for each call. – Martin R Oct 23 '19 at 19:03
@MartinR Thanks for the quick reply! You are right, I misinterpreted the implementation and didn't run a proper test myself. It indeed produces a randomised output when there are multiple equal weighted items. – thiezn Oct 23 '19 at 20:29

Caleb · Answer 2 · 2015-05-18T17:58:38.607

Is there a nicer, cleaner way to pick a random number with a certain probability when you have a large number of probabilities to consider?

Sure. Write a function that generates a number based on a table of probabilities. That's essentially what the switch statement you've pointed to is: a table defined in code. You could do the same thing with data using a table that's defined as a list of probabilities and outcomes:

probability    outcome
-----------    -------
   0.4            1
   0.2            2
   0.1            3
   0.15           4
   0.15           5

Now you can pick a number between 0 and 1 at random. Starting from the top of the list, add up probabilities until you've exceeded the number you picked, and use the corresponding outcome. For example, let's say the number you pick is 0.6527637. Start at the top: 0.4 is smaller, so keep going. 0.6 (0.4 + 0.2) is smaller, so keep going. 0.7 (0.6 + 0.1) is larger, so stop. The outcome is 3.

I've kept the table short here for the sake of clarity, but you can make it as long as you like, and you can define it in a data file so that you don't have to recompile when the list changes.

Note that there's nothing particularly specific to Swift about this method -- you could do the same thing in C or Swift or Lisp.

Thank you. What structure would you recommend to store the table? — asdf, May 18 '15 at 17:57
I did not intend to copy your idea, but I had already prepared the code which I wanted to suggest :) — Martin R, May 18 '15 at 18:04
No worries, @MartinR -- I'm sure the idea isn't original, and I'm glad someone provided an implementation. My prose nicely explains your code, and your code nicely illustrates my prose. — Caleb, May 18 '15 at 18:10

rofls · Answer 3 · 2018-06-21T09:48:44.247

3

This seems like a good opportunity for a shameless plug to my small library, swiftstats: https://github.com/r0fls/swiftstats

For example, this would generate 3 random variables from a normal distribution with mean 0 and variance 1:

import SwiftStats
let n = SwiftStats.Distributions.Normal(0, 1.0)
print(n.random())

Supported distributions include: normal, exponential, binomial, etc...

It also supports fitting sample data to a given distribution, using the Maximum Likelihood Estimator for the distribution.

See the project readme for more info.

edited Jun 21 '18 at 09:48

answered Aug 13 '16 at 21:35

rofls

4,993
3
27
37

Can you point me to instructions on how to build SwiftStats for iOS? – Nathan Bell Dec 10 '16 at 20:14
1

@NathanBell no, sorry. It's on the to-do list here: https://github.com/r0fls/swiftstats#to-do Since you asked I will try to get to it soon. You're welcome to create an issue over there in Github if you don't mind, to give me increased satisfaction once I have completed said task. Or if you manage to do it yourself, I would love a PR on the readme or code. Cheers, Raphael – rofls Dec 11 '16 at 22:17
1

you got it! Thank you! – Nathan Bell Dec 18 '16 at 00:11

oisdk · Answer 4 · 2015-05-19T14:44:20.643

You could do it with exponential or quadratic functions - have x be your random number, take y as the new random number. Then, you just have to jiggle the equation until it fits your use case. Say I had (x^2)/10 + (x/300). Put your random number in, (as some floating-point form), and then get the floor with Int() when it comes out. So, if my random number generator goes from 0 to 9, I have a 40% chance of getting 0, and a 30% chance of getting 1 - 3, a 20% chance of getting 4 - 6, and a 10% chance of an 8. You're basically trying to fake some kind of normal distribution.

Here's an idea of what it would look like in Swift:

func giveY (x: UInt32) -> Int {
  let xD = Double(x)
  return Int(xD * xD / 10 + xD / 300)
}

let ans = giveY (arc4random_uniform(10))

EDIT:

I wasn't very clear above - what I meant was you could replace the switch statement with some function that would return a set of numbers with a probability distribution that you could figure out with regression using wolfram or something. So, for the question you linked to, you could do something like this:

import Foundation

func returnLevelChange() -> Double {

  return 0.06 * exp(0.4 * Double(arc4random_uniform(10))) - 0.1

}

newItemLevel = oldItemLevel * returnLevelChange()

So that function returns a double somewhere between -0.05 and 2.1. That would be your "x% worse/better than current item level" figure. But, since it's an exponential function, it won't return an even spread of numbers. The arc4random_uniform(10) returns an int from 0 - 9, and each of those would result in a double like this:

0: -0.04
1: -0.01
2:  0.03
3:  0.1
4:  0.2
5:  0.34
6:  0.56
7:  0.89
8:  1.37
9:  2.1

Since each of those ints from the arc4random_uniform has an equal chance of showing up, you get probabilities like this:

40% chance of -0.04 to 0.1  (~ -5% - 10%)
30% chance of  0.2  to 0.56 (~ 20% - 55%)
20% chance of  0.89 to 1.37 (~ 90% - 140%)
10% chance of  2.1          (~ 200%)

Which is something similar to the probabilities that other person had. Now, for your function, it's much more difficult, and the other answers are almost definitely more applicable and elegant. BUT you could still do it.

Arrange each of the letters in order of their probability - from largest to smallest. Then, get their cumulative sums, starting with 0, without the last. (so probabilities of 50%, 30%, 20% becomes 0, 0.5, 0.8). Then you multiply them up until they're integers with reasonable accuracy (0, 5, 8). Then, plot them - your cumulative probabilities are your x's, the things you want to select with a given probability (your letters) are your y's. (you obviously can't plot actual letters on the y axis, so you'd just plot their indices in some array). Then, you'd try find some regression there, and have that be your function. For instance, trying those numbers, I got

e^0.14x - 1

and this:

let letters: [Character] = ["a", "b", "c"]

func randLetter() -> Character {

  return letters[Int(exp(0.14 * Double(arc4random_uniform(10))) - 1)]

}

returns "a" 50% of the time, "b" 30% of the time, and "c" 20% of the time. Obviously pretty cumbersome for more letters, and it would take a while to figure out the right regression, and if you wanted to change the weightings you're have to do it manually. BUT if you did find a nice equation that did fit your values, the actual function would only be a couple lines long, and fast.

But a switch statement is essentially a piecewise function, so your probability distribution function would require a switch statement to be described anyway. — asdf, May 18 '15 at 18:34

Generate random numbers with a given distribution

4 Answers4

Linked

Related