4

I would need to add some randomization to my Cortex M0 CPU firmware. The randomness is not important but the speed is.

I tested two functions I found online. With random() I managed to generate 1 number per 31 clock cycles, while random_uint() generated 1 number in 20 cycles. My target is less than 10. What are other functions can I use?

unsigned random() {
    unsigned b;

    b = t1 ^ (t1 >> 2) ^ (t1 >> 6) ^ (t1 >> 7);
    t1 = (t1 >> 1) | (~b << 31);

    b = (t2 << 1) ^ (t2 << 2) ^ (t1 << 3) ^ (t2 << 4);
    t2 = (t2 << 1) | (~b >> 31);

    return t1 ^ t2;
}

unsigned random_uint() {
    m_z = 36969 * (m_z & 65535) + (m_z >> 16);
    m_w = 18000 * (m_w & 65535) + (m_w >> 16);
    return (m_z << 16) + m_w;
}
chqrlie
  • 131,814
  • 10
  • 121
  • 189
kelvin
  • 41
  • 1
  • 6
  • 1
    If "randomness is not important" there is [this possibility](https://www.xkcd.com/221/). If that isn't adequate, perhaps randomness is important after all. Please clarify. – John Coleman Sep 26 '18 at 09:35
  • A more serious suggestion, you could look into the [middle-square Weyle sequence](https://en.wikipedia.org/wiki/Middle-square_method). The code at the bottom of that article seems likely to be fast. – John Coleman Sep 26 '18 at 09:49

3 Answers3

3

Well, I believe it is 32bit architecture, and fastest probably would be LCG RNG. Using values from Numerical Recipes, with period of 232

inline uint32_t random_u32(uint32_t prev) {
    return prev*1664525U + 1013904223U; // assuming complement-2 integers and non-signaling overflow
}

int main() {
    auto seed = 1U;
    uint32_t = result;

    result = seed = random_u32(seed);
    result = seed = random_u32(seed);
    result = seed = random_u32(seed);
    ...
    return 0;
}

Actually, even simpler version but with smaller period, what is used in C++-11 minstd, 231-1 period.

inline uint32_t random_u32(uint32_t prev) {
    return prev*48271U;
}

Seed shall not be 0, start with some odd number like 134775813U

Severin Pappadeux
  • 18,636
  • 3
  • 38
  • 64
  • 1
    @kelvin Reading https://developer.arm.com/products/processors/cortex-m/cortex-m0, Enhanced instructions: single cycle 32x32 multiplication, my latter RNG might really be fastest option, fitting into under 10 cycles constraint – Severin Pappadeux Sep 27 '18 at 14:45
  • Yes Severin. You are right and Thank you. Your latter RNG saves 1 addition from the code I finally used. I choose that addition to make it less guestimable. – kelvin Sep 28 '18 at 01:25
  • @kelvin. Good. That addition is one additional bit of randomness. Without it you would be getting all odd numbers and no even ones. Obviously, twice as much period as well. Good luck! – Severin Pappadeux Sep 28 '18 at 02:19
1
static unsigned int g_seed;

// Used to seed the generator.           
inline void fast_srand(int seed) {
    g_seed = seed;
}

// Compute a pseudorandom integer.
// Output value in range [0, 32767]
inline int fast_rand(void) {
    g_seed = (214013*g_seed+2531011);
    return (g_seed>>16)&0x7FFF;
}

From here: Faster than rand()?

Nic Wilson
  • 150
  • 3
  • 10
  • That's a good solution. I settled with this code, 12 cycles used for that computation. unsigned random_uint() { m_z = 36969 * (m_z & 65535) + (m_z >> 8); return m_z; } – kelvin Sep 27 '18 at 02:50
1

Some of the fastest generators are from the Xorshift family discovered by George Marsaglia, a brilliant contributor to both PRNG design and PRNG testing. The Xorshift link has sample code. The implementations given use three shifts and three XORs, which should take one cycle apiece and would be pretty hard to beat.

pjs
  • 18,696
  • 4
  • 27
  • 56