6

I'm writing a Monty Hall simulator, and found the need to generate a number within a range, excluding a single number.

This seemed easy, so I naively wrote up:

(The g/... functions are part of my personal library. Their use should be fairly clear):

(defn random-int-excluding
  "Generates a random number between min-n and max-n; excluding excluding-n.
  min-n is inclusive, while max-n is exclusive."
  [min-n max-n excluding-n rand-gen]
  (let [rand-n (g/random-int min-n max-n rand-gen)
        rand-n' (if (= rand-n excluding-n) (inc rand-n) rand-n)]
    (g/wrap rand-n' min-n (inc max-n))))

This generates a random number within the range, and if it equals the excluded number, adds one; wrapping if necessary. Of course this ended up giving the number after the excluded number twice the chance of being picked since it would be picked either if it or the excluded number are chosen. Sample output frequencies for a range of 0 to 10 (max exclusive), excluding 2:

([0 0.099882]
 [1 0.100355]
 [3 0.200025]
 [4 0.099912]
 [5 0.099672]
 [6 0.099976]
 [7 0.099539]
 [8 0.100222]
 [9 0.100417])

Then I read this answer, which seemed much simpler, and based on it, wrote up:

(defn random-int-excluding
  "Generates a random number between min-n and max-n; excluding excluding-n.
  min-n is inclusive, while max-n is exclusive."
  [min-n max-n excluding-n rand-gen]
  (let [r1 (g/random-int min-n excluding-n rand-gen)
        r2 (g/random-int (inc excluding-n) max-n rand-gen)]
    (if (g/random-boolean rand-gen) r1 r2)))

Basically, it splits the range into 2 smaller ranges: from the min to the excluded number, and from excluded number + 1 to the max. It generates random number from these ranges, then randomly chooses one of them. Unfortunately though, as I noted under the answer, this gives skewed results unless both the partitions are of equal size. Sample output frequencies; same conditions as above:

([0 0.2499497]
 [1 0.2500795]
 [3 0.0715849]
 [4 0.071297]
 [5 0.0714366]
 [6 0.0714362]
 [7 0.0712715]
 [8 0.0715285]
 [9 0.0714161])

Note the numbers part of the smaller range before the excluded number are much more likely. To fix this, I'd have to skew it to pick numbers from the larger range more frequently, and really, I'm not proficient enough in maths in general to understand how to do that.

I looked at the accepted answer from the linked question, but to me, it seems like a version of my first attempt that accepts more than 1 number to exclude. I'd expect, against what the answerer claimed, that the numbers at the end of the exclusion range would be favored, since if a number is chosen that's within the excluded range, it just advances the number past the range.

Since this is going to be one of the most called functions in the simulation, I'd really like to avoid the "brute-force" method of looping while the generated number is excluded since the range will only have 3 numbers, so there's a 1/3 chance that it will need to try again each attempt.

Does anyone know of a simple algorithm to chose a random number from a continuous range, but exclude a single number?

Community
  • 1
  • 1
Carcigenicate
  • 43,494
  • 9
  • 68
  • 117

5 Answers5

6

To generate a number in the range [a, b] excluding c, simply generate a number in the range [a, b-1], and if the result is c then output b instead.

amalloy
  • 89,153
  • 8
  • 140
  • 205
  • It's funny, I actually just got it working when you posted this. I realized I *waaaaay* overthought the problem. I'll accept this, but I'll also post my solution for substance. Thanks. – Carcigenicate Nov 08 '16 at 23:40
  • 1
    It may be helpful to you to think of problems like this in two parts: first, how much entropy do you need (how large a range should you randomly select from); and second, how can you map a number from that range onto the result set you want. From those two questions the correct answer falls out pretty simply. – amalloy Nov 08 '16 at 23:45
  • This is dangerous since it brakes down for non-uniform number generators. I'd just do an lazy seq. Sketch: `(keep #(...) (repeatedly #(rand-int ...)))` – ClojureMostly Nov 09 '16 at 09:09
  • @ClojureMostly You don't get rid of bias by moving it around. – Thumbnail Nov 09 '16 at 10:02
2

If the size of the collection of acceptable answers is small, just put all values into a vector and use rand-nth:

http://clojuredocs.org/clojure.core/rand-nth

(def primes [ 2 3 5 7 11 13 17 19] )
(println (rand-nth primes))
(println (rand-nth primes))
(println (rand-nth primes))

~/clj > lein run
19
13
11

Update

If some of the values should include more than the others, just put them in the array of values more than once. The number of occurrances of each value determines its relative weight:

(def samples [ 1 2 2 3 3 3 4 4 4 4 ] )
(def weighted-samples 
  (repeatedly #(rand-nth samples)))

(println (take 22 weighted-samples))

;=> (3 4 2 4 3 2 2 1 4 4 3 3 3 2 3 4 4 4 2 4 4 4)

If we wanted any number from 1 to 5, but never 3, just do this:

(def samples [ 1 2   4 5 ] )
(def weighted-samples 
  (repeatedly #(rand-nth samples)))

(println (take 22 weighted-samples))

(1 5 5 5 5 2 2 4 2 5 4 4 5 2 4 4 4 2 1 2 4 1)
Alan Thompson
  • 29,276
  • 6
  • 41
  • 48
  • I must be missing something. I don't understand how `rand-nth` can be used in this way to get a random number from a range, excluding a certain value. – Carcigenicate Nov 09 '16 at 17:48
2

Just generate a lazy sequence and filter out items you don't want:

(let [ignore #{4 2}]
  (frequencies
    (take 2000
          (remove ignore (repeatedly #(rand-int 5))))))

Advantage to the other approach of mapping to different new values: This function will also work with different discrete random number distributions.

ClojureMostly
  • 4,652
  • 2
  • 22
  • 24
  • 1
    This is actually what I ended up doing. I tried using my above implementation , then realized that I do in fact need to exclude multiple values, so I rewrote it to use filter since that made much more sense. Then I realized the way I was going about it was super roundabout, so I deleted half of what I wrote and went to bed. – Carcigenicate Nov 09 '16 at 17:44
1

Just to show the implementation I wrote, here's what worked for me:

(defn random-int-excluding
  "Generates a random number between min-n and max-n; excluding excluding-n.
  min-n is inclusive, while max-n is exclusive."
  [min-n max-n excluding-n rand-gen]
  (let [rand-n (g/random-int min-n (dec max-n) rand-gen)]
    (if (= rand-n excluding-n)
      (dec max-n)
      rand-n)))

Which gives a nice even distribution:

([0 0.111502]
 [1 0.110738]
 [3 0.111266]
 [4 0.110976]
 [5 0.111162]
 [6 0.111266]
 [7 0.111093]
 [8 0.110815]
 [9 0.111182])
Carcigenicate
  • 43,494
  • 9
  • 68
  • 117
  • This is way to much work, and harder to understand. Just draw a random number in the original interval [a,b]. If you get the value `c` you don't want, just repeat. Converges exponentially and super simple. Or, put all "acceptable" outputs into an array or set, then randomly select an index into that. – Alan Thompson Nov 09 '16 at 00:14
  • @AlanThompson As I noted at the bottom of the question, I don't want a repetitious algorithm. The number ranges are so small, there will be a 1/3 chance *per recurse* that it will need to try again. – Carcigenicate Nov 09 '16 at 01:49
  • Please see solution using `rand-nth`. It is made for this type of problem. :) – Alan Thompson Nov 09 '16 at 01:56
1

Just to make Alan Malloy's answer explicit:

(defn rand-int-range-excluding [from to without]
  (let [n (+ from (rand-int (dec (- to from))))]
    (if (= n without)
      (dec to)
      n)))

(->> #(rand-int-range-excluding 5 10 8)
     repeatedly
     (take 100)
     frequencies)
;{6 28, 9 22, 5 29, 7 21}

No votes required :).

Community
  • 1
  • 1
Thumbnail
  • 13,293
  • 2
  • 29
  • 37
  • To repeat: This will break down as soon as you don't draw from a uniform distribution (which `rand-int` does). – ClojureMostly Nov 09 '16 at 10:47
  • @ClojureMostly What do you mean by *break down*? What statistical criterion of randomness does your filtered pseudo-random sequence pass that Alan Malloy's' part-shifted one fails? Neither are [uniform](https://en.wikipedia.org/wiki/Discrete_uniform_distribution). – Thumbnail Nov 09 '16 at 13:30
  • I mean that you change the distribution of the random number that you generate. Use a Binomial or Geometric distribution and you will end up with bad data. The approach only works with uniformly distributed data. – ClojureMostly Nov 09 '16 at 14:15
  • @ClojureMostly For the Monty Hall problem, the numbers are just tokens that identify the doors. They have no Peano construction. Concepts like *binomial* or *geometric* do not apply. The requirement is simply to avoid opening the already open door. Besides, a binomial distribution with a hole in it is no longer binomial. Its mean might be right, if you were lucky enough to eliminate the central value, but its variance would be wrong, just for starters. I think you have demonstrated that Alan Malloy's solution would be the wrong answer to a different question. – Thumbnail Nov 09 '16 at 14:55
  • I can provide incorrect answers to an unbounded number of questions in a single day, really. – amalloy Nov 09 '16 at 21:11