0

I am going through the source code for Math.Random and I have noticed that the source code for nextBoolean()

public boolean nextBoolean() {
   return next(1) != 0;
}

calls for a new draw of pseudorandom bits, via next(int bits) which "iterates" the LC-PRNG to the next state, i.e. "draws" a whole set of new bits, even though only one bit is used in nextBoolean. This effectively means that rest of the bits (47 to be exact) are pretty much wasted in that particular iteration.

I have looked at another PRNG which appears to do essentially the same thing, even though the underlying generator is different. Since multiple bits from the same iteration are used for other method calls (such as nextInt(), nextLong(), ...) and the consecutive bits are assumed to be "independent enough" from one another.

So my question is: Why only one bit is used from a draw of the PRNG using nextBoolean()? It should be possible to cache, say 16-bits (if one wants to use the highest quality bits), for successive calls to nextBoolean(), am I mistaken here?

Edit: What I mean by caching the results is something like this:

private long booleanBits = 0L;
private int c = Long.SIZE;
public boolean nextBoolean(){
    if(c == 0){
        booleanBits = nextLong();
        c = Long.SIZE;
    }

    boolean b = (booleanBits & 1) != 0;
    booleanBits >>>= 1;
    return b;

    //return ( next() & 1 ) != 0;
}

Sure, it's not sure and pretty as the commented out text, but it ends up in 64x less draws. At the cost of 1 int comparison, and 1 right-shift operation per call to nextBoolean(). Am I mistaken?

Edit2: Ok, I had to test the timings, see the code here. The output is as follows:

Uncached time lapse: 13891
Cached time lapse: 8672

Testing if the order matters..:
Cached time lapse: 6751
Uncached time lapse: 8737

Which suggest that caching the bits is not a computational burden but an improvement instead. A couple of things I should note about this test:

  1. I use a custom implementation of xorshift* generators that is heavily inspired from Sebastiano Vigna's work on xorshift* generators.

  2. xorshift* generators are actually much faster than Java's native generator. So if I were to use java.util.Random to draw my bits, caching would make a larger impact. Or that's what I would expect at least.

  3. Single-thread application assumed here, so no sych issues. But that's of course common in both conditions.

posdef
  • 6,498
  • 11
  • 46
  • 94
  • 1
    Quite possibly because the cost of tracking which bits are still available for use is more expensive than just discarding the unused bits and moving on. – Louis Wasserman Jan 31 '14 at 16:20
  • @LouisWasserman Figured that would be the reason but I don't see why that would be the case in the long run. Please see the edit. – posdef Jan 31 '14 at 16:57
  • There's also a possibility that the individual bits in the number are not statistically independent of one another, which might mean that it would not be sufficiently random to do this. That said, I'm no expert at analyzing randomness, so that might not actually be the case. – templatetypedef Jan 31 '14 at 17:05
  • Simplicity: a boolean is a boolean, not a bit, so whoever wrote that probably assumed that if you need N bits you may just generate all of them rather than call nextBoolean() many times. It also ensures quality -- as @templatetypedef guessed, java's default generator is rand48, so its lower bits have a shorter period (they are extra careful to use the high bits for things like "next(1)") – loreb Jan 31 '14 at 21:36

1 Answers1

1

Conditionals of any kind can be quite expensive (see Why is it faster to process a sorted array than an unsorted array?), and next itself doesn't do that many more operations: I count five arithmetic operations plus a compareAndSet, which shouldn't cost much in a single-threaded context.

The compareAndSet points out another issue -- thread-safety -- which is much harder when you have two variables that need to be kept in sync, such as booleanBits and c. The synchronization overhead of keeping those in sync for multithreaded use would almost certainly exceed the cost of a next() call.

Community
  • 1
  • 1
Louis Wasserman
  • 191,574
  • 25
  • 345
  • 413
  • I just ran a test and caching the bits is actually faster. Of course thread-safety is an issue that's not addressed here, but that's not what I was looking into either. Thanks for the tip on branching btw :) – posdef Jan 31 '14 at 17:52