Mercenne Twister engine performance with varying upperbound

Question

I have a question similar to this.

To summarize , I want to a random number generated in a function call, in the range [0 maxVal]. The upper bound (maxVal) is different everytime function is called. I tried to test the performance of the code in the answer and my own way to generate a random number (with a wrapper around std::mt19937)

According to my performance tests, using a uniform_int_distribution object in the function gives me inferior performance.

Here is my code

#include <iostream>
#include <random>
#include <cassert>
#include <chrono>
#include <thread>

class CustomMtRand
{

private:

    std::mt19937 mtRandEngine;

public:

    CustomMtRand(int seed) :mtRandEngine(seed) {}
    //range [0, maxVal - 1]
    unsigned int randomInt(unsigned int maxVal)
    {
        assert(maxVal != 0); //do not pass in 0 as maxVal.
        auto randVal = mtRandEngine(); // generates in range [0 4294967295]
        double scaling = (randVal / 4294967296.0);
        unsigned int retVal = std::floor(maxVal * scaling);
        return retVal;
    }
};

int main()
{

    using std::chrono::high_resolution_clock;
    using std::chrono::duration_cast;
    using std::chrono::duration;
    using std::chrono::milliseconds;


    CustomMtRand customMtRand(0);


    constexpr int numTests = 5;
    duration<double, std::milli> sumCustom(0);
    duration<double, std::milli> sumMtRand(0);

    for (int j = 0; j < numTests; j++)
    {
        constexpr long numTestIters = 0xffffff;

        auto t1 = high_resolution_clock::now();
        for (auto i = 0; i < numTestIters; i++)
        {
            auto rand = 1 + std::rand() % 10;
            customMtRand.randomInt(rand);

        }
        auto t2 = high_resolution_clock::now();
        duration<double, std::milli> execTime = t2 - t1;
        std::cout << "Custom Rand          :" << execTime.count() << " ms\n";
        sumCustom += execTime;

        std::mt19937 mtRandEngine;
        t1 = high_resolution_clock::now();
        for (auto i = 0; i < numTestIters; i++)
        {
            unsigned int rand = 1 + std::rand() % 10;
            std::uniform_int_distribution<unsigned int>{0, rand}(mtRandEngine);
        }
        t2 = high_resolution_clock::now();
        execTime = t2 - t1;
        std::cout << "MT Rand              :" << execTime.count() << " ms\n";
        sumMtRand += execTime;
    }

    std::cout << "--------------------------------\n";
    std::cout << "Custom Rand Average  :" << (sumCustom.count() / numTests) << " ms\n";
    std::cout << "MT Rand Average      :" << (sumMtRand.count() / numTests) << " ms\n";

    return 0;
}

Here is an example result

Custom Rand          :216.012 ms
MT Rand              :297.231 ms
Custom Rand          :217.576 ms
MT Rand              :315.173 ms
Custom Rand          :216.519 ms
MT Rand              :324.193 ms
Custom Rand          :214.586 ms
MT Rand              :302.604 ms
Custom Rand          :213.5 ms
MT Rand              :296.29 ms
--------------------------------
Custom Rand Average  :215.639 ms
MT Rand Average      :307.098 ms

I have two questions.

Is the way I am testing this correct?
If this is correct, is there an alternative to writing a wrapper?

EDIT: I ran the code on quick-bench.com. Here is what I tried

static void cRandCall(benchmark::State& state) {
  // Code before the loop is not measured
  CustomMtRand customMtRand(0);
   for (auto _ : state) {
     unsigned int rand = 1 + std::rand() % 10;
     int result = customMtRand.randomInt(rand);
     benchmark::DoNotOptimize(result);
  }
}
BENCHMARK(cRandCall);


static void mtRandCall(benchmark::State& state) {
  // Code before the loop is not measured
   std::mt19937 mtRandEngine;
  for (auto _ : state) {
    unsigned int rand = 1 + std::rand() % 10;
    int result = std::uniform_int_distribution<unsigned int>{0, rand}(mtRandEngine);
    benchmark::DoNotOptimize(result);
  }
}
BENCHMARK(mtRandCall);

My code still appears to be little bit faster.

you arent using the results so you cannot be sure what parts are optimized away once you do turn on optimizations. Benchmarking isnt that simple. I suggest to use this: https://quick-bench.com/ — 463035818_is_not_an_ai, Oct 17 '22 at 20:55
You also have an issue where `uniform_int_distribution` gives you the range `[min, max]`, where yours does `[0, max)` — NathanOliver, Oct 17 '22 at 20:57
Don't time with `high_resolution_clock`. It promises high resolution, not accuracy. For example, it might not be monotonic and could jitter around. [Here's the advice of one of the people most responsible for modern C++ time-keeping](https://stackoverflow.com/a/37440647/4581301) — user4581301, Oct 17 '22 at 21:09
@NathanOliver I believe so. I am running on visual studio, I see the flags /O2 - maximum optimization(Favor speed) . Yes you are right. My range is incorrect, but I believe this shouldn't change the result of any performance test. — torres, Oct 17 '22 at 21:43
@463035818_is_not_a_number @463035818_is_not_a_number Thanks for pointing this out!. This is exacly the reason. I made it `volatile int result = std::uniform_int_distribution{0, rand}(mtRandEngine); `. I believe assigning to a volatile variable will make sure the compiler doesnt optimize it. Now mine is slower :) — torres, Oct 17 '22 at 22:11
@463035818_is_not_a_number On quick-bench.com , though, mine appears to be faster. — torres, Oct 17 '22 at 22:26
Before you worry about which is faster, you should perform a statistical analysis on yours to prove it’s actually generating the numbers uniformly. Fast is useless if it’s not correct. — Taekahn, Oct 18 '22 at 03:48
@Taekahn I think it can be shown mathematically than mine folllows the same distibution as the mercene twister engine. — torres, Oct 18 '22 at 04:09
`std::rand() % 10;` is certainly not uniform. Only if `RAND_MAX % 10 == 0` it is uniform. Consider eg `RAND_MAX % 10 == 2` then `1` and `2` appear once more often than the other results — 463035818_is_not_an_ai, Oct 18 '22 at 11:03
@Taekahn Now I see what you are saying, yes I although mine appears to be a sampled version of Mercenne Twister, I am not sure if it is really uniform. The documentation says it is k-distributed. I am not a statistics expert, so I really dont know what this means. Thanks for pointing that out. — torres, Oct 18 '22 at 15:13
@463035818_is_not_a_number Yes, But I added that just to 'simulate' varying the upper bound. I encountered this issue when writing an MCTS AI for a game I am developing. I think just using mtRandEngine and uniform_int_distribution is enough because its certainly not worse enough to warrant writing a new class instead of a heavily tested method. So I will not be using my method. Thanks! — torres, Oct 18 '22 at 15:16

Mercenne Twister engine performance with varying upperbound

0 Answers0