In my embedded project, I have a biginteger class that handles arbitrary-length integers. I would like to be able to generate a random bigint between 0 and an arbitrary maximum, max. Assume I have a quality source of random bytes.
All the implementations I have seen essentially do the same thing:
- Generate a big number with the same number of bytes as max,
- If it is greater than max, throw it away and generate again (see the sketch after this list).
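Roughly, the loop I keep seeing looks like this. This is only a sketch, working on plain big-endian byte vectors rather than my actual class, and using `std::random_device` as a stand-in for my real entropy source; the names are illustrative, not a real API:

```cpp
#include <cstddef>
#include <cstdint>
#include <random>
#include <vector>

// Compare two equal-length big-endian byte strings: is a <= b ?
static bool le_bytes(const std::vector<std::uint8_t>& a,
                     const std::vector<std::uint8_t>& b)
{
    for (std::size_t i = 0; i < a.size(); ++i) {
        if (a[i] != b[i]) return a[i] < b[i];
    }
    return true;  // equal
}

// Rejection sampling: draw as many random bytes as max has, retry with a
// completely fresh draw whenever the value exceeds max.
std::vector<std::uint8_t> random_below_or_equal(
    const std::vector<std::uint8_t>& max)   // big-endian, e.g. {0x01, 0xFF, ..., 0xFF}
{
    std::random_device rng;                 // stand-in for the real RNG
    std::vector<std::uint8_t> candidate(max.size());
    do {
        for (auto& byte : candidate) {
            byte = static_cast<std::uint8_t>(rng());  // fresh byte every retry
        }
    } while (!le_bytes(candidate, max));    // too large: throw it all away
    return candidate;                       // uniform over [0, max]
}
```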
The problem I see with this implementation is that it could take an awfully long time. Imagine that max = 2^2049 - 1 (`01 FF .. FF`, 257 bytes). This algorithm will generate 257 bytes, then check that the most significant byte is <= `01`. So there is a 254/256 chance it has to throw all 257 bytes away and generate a whole new number, which means about 128 full attempts on average and no fixed upper bound. In the (admittedly unlikely) worst case, this loop could go on for minutes or even years.
My questions are:
- In the case where the generated number is too large, is there a way to keep most of the bytes I have already generated?
- Is it valid to just regenerate the most significant byte (sketched below), or does that introduce bias? What about shifting the result right one digit?
- Is there any way to make the running time deterministic while still avoiding bias?
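To make the first idea concrete, this is what I mean by "just regenerate the most significant byte". It reuses the `le_bytes` helper from the sketch above and is, again, only a sketch rather than my real code:

```cpp
// Variant in question: keep the low bytes, redraw only the top byte until
// the whole value is <= max. Note this assumes max's top byte is non-zero
// (true for the examples in this post); if it were zero, the loop below
// might never terminate.
std::vector<std::uint8_t> random_redraw_top_byte_only(
    const std::vector<std::uint8_t>& max)
{
    std::random_device rng;                 // stand-in for the real RNG
    std::vector<std::uint8_t> candidate(max.size());
    for (auto& byte : candidate) {
        byte = static_cast<std::uint8_t>(rng());
    }
    while (!le_bytes(candidate, max)) {
        candidate[0] = static_cast<std::uint8_t>(rng());  // redraw MSB only
    }
    return candidate;
}
```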
--
Another edge case: max = 2^2048 + 1 (`01 00 .. 01`, again 257 bytes). In this case the most significant byte can only be non-zero if all of the remaining bytes are `00` except the last one, which must be `00` or `01`. So most of the time, if the MSB is non-zero, the number will be invalid, and regenerating just that byte can never produce a valid number that still has a non-zero MSB. But just force-setting it to zero seems wrong as well.