18

Rationals are enumerable. For example this code finds k-th rational in open interval 0..1, with ordering that {n1, d1} is before {n2, d2} if (d1<d2 || (d1==d2 && n1<n2)) assuming {n,d} is coprime.

RankedRational[i_Integer?Positive] := 
 Module[{sum = 0, eph = 1, den = 1},
  While[sum < i, sum += (eph = EulerPhi[++den])];
  Select[Range[den - 1], CoprimeQ[#, den] &][[i - (sum - eph)]]/den
  ]

In[118]:= Table[RankedRational[i], {i, 1, 11}]

Out[118]= {1/2, 1/3, 2/3, 1/4, 3/4, 1/5, 2/5, 3/5, 4/5, 1/6, 5/6}

Now I would like to generate random rationals, given an upper bound on the denominator sort-of uniformly, so that for large enough denominator rationals will be uniformly distributed over the unit interval.

Intuitively, one could pick among all rationals with small denominators with equal weights:

RandomRational1[maxden_, len_] := 
 RandomChoice[(Table[
     i/j, {j, 2, maxden}, {i, 
      Select[Range[j - 1], CoprimeQ[#, j] &]}] // Flatten), len]

Can one generate random rationals with this distribution more efficiently, without constructing all of them ? It does not take much for this table to become huge.

In[197]:= Table[RankedRational[10^k] // Denominator, {k, 2, 10}]

Out[197]= {18, 58, 181, 573, 1814, 5736, 18138, 57357, 181380}

Or maybe it is possible to efficiently generate rationals with bounded denominator having a different "feels-like-uniform" distribution ?


EDIT This is Mathematica code which runs acceptance-rejection generation suggested by btilly.
Clear[RandomFarey];
RandomFarey[n_, len_] := Module[{pairs, dim = 0, res, gcds},
  Join @@ Reap[While[dim < len,
      gcds = cfGCD[pairs = cfPairs[n, len - dim]];
      pairs = Pick[pairs, gcds, 1];
      If[pairs =!= {}, 
       dim += Length@Sow[res = pairs[[All, 1]]/pairs[[All, 2]]]];
      ]][[2, -1]]
  ]

The following compiled function generations pairs of integers {i,j} such that 1<=i < j<=n:

cfPairs = 
  Compile[{{n, _Integer}, {len, _Integer}}, 
   Table[{i, RandomInteger[{i + 1, n}]}, {i, 
     RandomChoice[2 (n - Range[n - 1])/(n (n - 1.0)) -> Range[n - 1], 
      len]}]];

and the following compiled function computes gcd. It assumes the input is a pair of positive integers.

cfGCD = Compile[{{prs, _Integer, 1}}, Module[{a, b, p, q, mod},
    a = prs[[1]]; b = prs[[2]]; p = Max[a, b]; q = Min[a, b]; 
    While[q > 0, mod = Mod[p, q]; p = q; q = mod]; p], 
   RuntimeAttributes -> Listable];

Then

In[151]:= data = RandomFarey[12, 10^6]; // AbsoluteTiming

Out[151]= {1.5423084, Null}

In[152]:= cdf = CDF[EmpiricalDistribution[data], x];

In[153]:= Plot[{cdf, x}, {x, 0, 1}, ImageSize -> 300]

enter image description here

Sasha
  • 5,935
  • 1
  • 25
  • 33

3 Answers3

8

I strongly suggest looking at The "guess the number" game for arbitrary rational numbers? for some inspiration about your underlying problem.

If your goal is to be approximately uniform ASAP, and you don't mind picking different rationals with different probabilities, the following algorithm should be efficient.

lower = fractions.Fraction(0)
upper = fractions.Fraction(1)

while lower < upper:
    mid = (upper + lower)/2
    if 0 == random_bit():
        upper = largest_rational_under(mid, denominator_bound)
    else:
        lower = smallest_rational_over_or_equal(mid, denominator_bound)

Note that both of those two helper functions can be calculated by walking the Stern-Brocot Tree towards the mid. Also note that, with some minor modification, you can easily transform this into an iterative algorithm that spits out a sequence of rational numbers, and eventually will converge with equal likelihood anywhere in the interval. I consider that property to be kind of nice.


If you want the exact distribution that you originally specified, and rand(n) gives you a random integer from 1 to n, then the following pseudocode will work for denominator bound n:

Try:
    k = rand(n * (n+1) / 2)
    do binary search for largest j with j * (j-1) / 2 < k
    i = k - (j * (j-1) / 2)
    if (i, j) are not relatively prime:
        redo Try
answer = i/j

On average for large n you'll have to Try about 2.55 times. So in practice this should be pretty efficient.

Community
  • 1
  • 1
btilly
  • 43,296
  • 3
  • 59
  • 88
  • Thank you. This is exactly what I was looking for. This latter algorithm is much better suited for vectorization in Mathematica. As I understand your algorithm, you simply select pairs `(i,j)` with `i – Sasha Apr 19 '11 at 21:44
3

With a bound on the denominator, the rationals aren't uniformly distributed (1/2 gets separated from everything else by a nice gap, for example.

That said, would something like

In[300]:= Rationalize[RandomReal[1, 10], 0.001]

Out[300]= {17/59, 45/68, 11/31, 9/16, 1/17, 13/22, 7/10, 1/17, 5/21, 8/39}

work for you?

Brett Champion
  • 8,497
  • 1
  • 27
  • 44
  • Yes, but as the denominator increases rationals get closer and closer. My goal is to generate rationals so that `CDF` of `EmpiricalDistribution` on the sample would tend to straight line as the denominator bound get larger. Consider `cdf = CDF[EmpiricalDistribution[RandomRational1[50, 10^4]], x]; Plot[{x, cdf}, {x, 0, 1}, Exclusions -> None, PlotPoints -> 250, PlotStyle -> {Black, Orange}]`. Also Cramer von-Mises test indicates the distribution is close to uniform `Histogram[ Table[CramerVonMisesTest[RandomRational1[100, 10^3], UniformDistribution[]], {10^3}]]` – Sasha Apr 19 '11 at 19:10
  • It is unfortunate that images can not be used in comments. Evaluate `cdf1 = CDF[EmpiricalDistribution[RandomRational1[25, 10^4]], x];cdf2 = CDF[EmpiricalDistribution[Rationalize[RandomReal[1, 10^4], 1/25]], x]; Row@{Plot[{cdf1, x}, {x, 0, 1}, ImageSize -> 250], Plot[{cdf2, x}, {x, 0, 1}, ImageSize -> 250]}` You see that rationalized random reals tend to random real a lot slower than those generated from RandomRational1. – Sasha Apr 19 '11 at 19:31
  • @Sasha RandomRational1 generates ~5x as many distinct rationals as the Rationalize approach... – Brett Champion Apr 19 '11 at 19:41
1

Here are some random thoughts on the problem you raise. I haven't carefully checked the math so I could be off by 1 here or there. But it represents the sort of reasoning I would follow.

Let's only consider fractions in the interval (0,1). It's much easier that way. We can deal later with 1/1 and improper fractions.

The Stern - Brocot Tree uniquely lists each reduced positive common fraction (and hence each positive rational number less than or equal to one) once, in order, and in reduced form, as a node in the tree. In this binary tree, any node and thus any fraction can be reached by a finite sequence of left-right turns starting from the uppermost level (for convenience let's call it level -1), containing 0/1 and 1/0. [Yes, 1/0. That's not a misprint!]

Given a denominator, k, you would need to take at most k turns to reach any reduced fraction j/k, where j is less than k. For example, if the denominator were 101, all possible fractions with a denominator of 101 or less will be in the tree somewhere between Level 1 (containing 1/1) and Level 101 (containing 1/101 in the leftmost position).

Let's assume we have a number generator that generate 0's and 1's. (Please don't ask me how to do that; I have no idea.) Lef's arbitrarily decide that Left=0 and Right=1.

Assume that we have another number generator that can randomly generate integers between 1 and n. Assume further that the first number generated is 0, ie. turn left: this guarantees that the fraction will fall in the interval (0,1).

Select the maximum denominator, k. Randomly generate a number, m, between 1 and k. Then generate a random list of R's and L's. Traverse (i.e. descend) the Stern-Brocot tree, following the list of turns. Stop when you reach the destination fraction.

If that fraction has a denominator equal to or less than k, stop, that's your number.

If the denominator is greater than k, ascend the tree (along the same path you descended) until you reach a fraction with a denominator no greater than k.

I don't know that the number generation is truly random. I wouldn't even know how to tell. But for what it's worthe, I don't detect any obvious source of bias.

DavidC
  • 3,056
  • 1
  • 20
  • 30
  • This algorithm will work, but won't converge to anything that is even remotely close to a uniform distribution. – btilly Apr 19 '11 at 21:09
  • @btilly Please explain. Will it favor some fractions over others? – DavidC Apr 19 '11 at 21:14
  • @David What is the formula for the [L,L,R,R,L] (or anyone) position in the tree? – Dr. belisarius Apr 19 '11 at 21:42
  • @belisarius You'll find it in the source code of this demonstration. (Download the source code; it's not too long). http://demonstrations.wolfram.com/FractionTreeAndContinuedFractions/ Left and Right are matrices. Knuth's Concrete Mathematics lays out the underlying rationale for the code I developed. I'd explain it but I think you'll quickly figure it out more quickly from the code in the demo. – DavidC Apr 19 '11 at 21:53
  • 1
    @david-carraher: It will strongly favor some regions over others. For instance the region from 0 to 0.05 can only be reached if the first 10 choices are "L", which only happens about 0.1% of the time. – btilly Apr 19 '11 at 22:59