I need a repeatable pseudo-random function from floats in [0,1] to floats in [0,1]. I.e. given a 32-bit IEEE float, return a "different" one (as random as possible, given the 24 bits of mantissa). It has to be repeatable, so keeping tons of internal state is out. And unfortunately it has to work with only 32-bit int and single-float math (no doubles and not even 32x32=64bit multiply, though I could emulate that if needed -- basically it needs to work on older CUDA hardware). The better the randomness the better, of course, within these rather severe limitations. Anyone have any ideas?
(I've been through Park-Miller, which requires 64-bit int math, and the CUDA version of Park-Miller which requires doubles, Mersenne Twisters which have lots of internal state, and a few other things which didn't work.)