4

I have a file of true random bytes. I want a function that returns a random integer in the range given by taking a byte from the file and scaling it. (is this the right word?)

public int getInt(int l, int h) throws IOException {
    int m = (h - l) + 1;            // number of ranges needed
    int r = 256 / m;                // size of byte range
    int x = (r * m) - 1;            // maximum allowable byte value
    int b;
    do {
        try {                       // get random byte from file
            b = ram.readUnsignedByte();
        } catch (EOFException e) {  // catch EOF, reset pointer
            b = 255; ram.seek(0);   // and set b to maximum value
        }                           // so test will fail.
    } while(b > x);                 // if byte is greater than
                                    // allowable value, loop.
    return (b / r) + l;             // return random integer
}                                   // within requested range

So here is my function. I am worried about destroying the true randomness of the bytes in the file by scaling it. I read that I need to throw out any number that would be above an allowed max value (so for a number 0-9, the max value is 249 because I only have 7 values left to distribute to 10 different groups). Does my implementation look correct?

Also, I am wondering, just by invalidating certain bytes that are too large, am I skewing the distribution in any way?

chrissphinx
  • 364
  • 3
  • 14
  • Sorry, missed that you were provided a range. Have to disappear, so I deleted my answer, as it didn't handle range. – T.J. Crowder Dec 04 '12 at 09:32
  • That's okay, I like your idea of reading in more bytes and shifting them for larger values, though. I will probably use that later but just want to make sure this basic implementation is not destroying any randomness of the file. – chrissphinx Dec 04 '12 at 09:34
  • A 32 bit integer consists of four bytes. So you can safely read 4 bytes and treat them as one signed int. – Omar Al-Ithawi Dec 04 '12 at 09:45
  • I'm not trying to treat a block of 4 bytes as an int, rather use a single byte to generate a random number within a range passed to the function (`int l, int h`) – chrissphinx Dec 04 '12 at 10:07
  • @user1684045 To be clear: So you want to take just *one* byte, and then scale it between those values, creating "gaps" if `h-l>255`? – hyde Dec 05 '12 at 07:46

1 Answers1

1

Yes, to avoid bias, you can't use modulo, you have to throw out results which are not in range.

Key to success in programming is splitting your task up in suitable sub-tasks. Quick spec:

  1. add a function to calculate how many bits are needed to store a given number
  2. add a class which reads and buffers bytes from the randomness file, and has method to give you an integer with some number of bits taken from the file (rest of bits 0).
  3. add the actual method to get your random number:
    • calculate the range of result, and from it calculate the number of bits needed
    • loop getting the bits, adding lower bound, retrying if result more than upper bound

Note about step 2: first implementation can be pretty crude, you can just get 4 bytes as integer and throw out extra bits, for example. Later you can optimize this class to keep unused bits and use them next time, to avoid wasting random bits. Since getting genuinely good random bits is usually somewhat expensive, this optimization is probably worth doing for serious use.

For bit operations, see for example this SO question: Java "Bit Shifting" Tutorial?

Community
  • 1
  • 1
hyde
  • 60,639
  • 21
  • 115
  • 176
  • Good idea, any suggestion on how to keep the unused bits? – chrissphinx Dec 04 '12 at 22:16
  • No time to write full example code, but simplest is to add member variables `int lastReadByte;` and `int usedBitsInLastReadByte;` and in a method you can also have the temp variable `int unusedBits = 8 - this.usedBitsInLastReadByte;` and then rest is just bit manipulation and reading next byte when all 8 from current byte are used, see the link I added to the answer itself. – hyde Dec 05 '12 at 07:33