Why do STD implementations of mt19937 have double sizeof as boost version?

Question

I have this simple C++ program with unexpected output:

#include<random>
#include<iostream>
#include "boost/random/mersenne_twister.hpp"
#include "boost/random/uniform_int_distribution.hpp"

int main(){
    std::cout << sizeof(std::mt19937) << std::endl;
    std::cout << sizeof(std::mt19937_64) << std::endl;
    std::cout << sizeof(boost::random::mt19937) << std::endl;
    std::cout << sizeof(boost::random::mt19937_64) << std::endl;
}

both clang and gcc output

5000

2504

2504

2504

What I find interesting is that sizeof standard implementation of mt19937(32bit one) is around 2x the the boost version, while 64bit ones match perfectly.

Since MT uses a lot of space it is not really a small difference.

Also it is weird that implementation of a strictly specified algorithm would have such different sizeof, we are not talking about std::string where implementers might pick different SSO buffer size...

My best guess would be that boost either has a bug or that it implements some slightly different version of mt19937, but wikipedia says this, suggesting boost might be right:

Relatively large state buffer, of 2.5 KiB,

edit: both boost and std version seem to satisfy requirement that 1000th generated value is 4123659995, so there seems to be no bug in boost.

@eerorika ugly :) it uses some kind of a buffer that is templated on many things... https://github.com/llvm/llvm-project/blob/master/libcxx/include/random search for 624 (that is magic number in mt19973) obviously not certain I am linking the same code as the one in godbolt. — NoSenseEtAl, Aug 26 '20 at 16:01
@NoSenseEtAl `std::mt19937` and `std::mt19937_64` are both required to be `typedef`s for specific specializations of [`std::mersenne_twister_engine`](https://en.cppreference.com/w/cpp/numeric/random/mersenne_twister_engine) which is the template you are seeing. You'll want to look at `std::mersenne_twister_engine` instead. — François Andrieux, Aug 26 '20 at 16:06

eerorika · Accepted Answer · 2020-08-26T17:41:44.493

4

This is the standard definition:

mersenne_twister_engine<
    uint_fast32_t, // element of the buffer
    32,
    624,           // size of the buffer
    397, 31,
    0x9908b0df, 11,
    0xffffffff, 7,
    0x9d2c5680, 15,
    0xefc60000, 18, 1812433253>

The issue is that GNU chose that std::uint_fast32_t is a 64 bit type on 64 bit systems (whether that is a good or bad choice is a separate discussion). Thus, the buffer is twice as big as one would have expected if the buffer contained 32 bit integers.

This is the Boost definition:

mersenne_twister_engine<
    uint32_t,
    32,
    624,
    397, 31,
    0x9908b0df, 11,
    0xffffffff, 7,
    0x9d2c5680, 15,
    0xefc60000, 18, 1812433253>

Which is identical, except for using a fixed width element which is alway same on all systems.

You can use std::mersenne_twister_engine directly with std::uint_least32_t element to get around this issue. Using this alias is preferable to the fixed alias because it is required to be supported on all systems.

edited Aug 26 '20 at 17:41

answered Aug 26 '20 at 16:25

eerorika

232,697
12
197
326

I saw this typedef , but I expected uint_fast32_t to be of size 4... stupid me :) – NoSenseEtAl Aug 26 '20 at 16:29
I honestly thing this is a bug, but if opened I think it will just be ignored... – NoSenseEtAl Aug 26 '20 at 16:29
@NoSenseEtAl I agree, this could be considered a defect in the standard. On the other hand, committees perspective might simply be that GNU / Linux made a stupid choice. And they might agree, but cannot change due to backward compatibility. – eerorika Aug 26 '20 at 16:32
There is a slight chance you could get the C++ standard to relax the requirement so an implementation of mt19937 can use any unsigned type with at least 32 bits, not necessarily uint_fast32_t, but then implementations would be unlikely to change anything for ABI reasons. – Marc Glisse Aug 26 '20 at 16:38
@MarcGlisse That would be a reasonable change. It wouldn't fix anything though, except for theoretical future language implementation. – eerorika Aug 26 '20 at 16:39
@eerorika interesting that `uint_fast32_t` would be 64bits on intel platforms... when intel's own optimization manual (and AMDs) specify that 32bit operations should be preferred. This seems more like a GCC implementation bug IMO. Not sure if it's fixable however due to ABI. MSVC uses `unsigned int` for example. – Mgetz Aug 26 '20 at 16:41
@eerorika ah it is a bug in the standard, even better :D Then again as you say it is not their fault that linux made the wrong decision wrt uint_fast32_t – NoSenseEtAl Aug 26 '20 at 16:51
@Mgetz in VS 2019 I also get size of 5000 :( – NoSenseEtAl Aug 26 '20 at 16:53
@NoSenseEtAl When targeting windows? I find this surprising. – eerorika Aug 26 '20 at 16:55
@eerorika you can try it on godbolt(it is old, but I think they did not break abi since then). Also I tried it on my VS2019 now. https://godbolt.org/z/9a3ds9 – NoSenseEtAl Aug 26 '20 at 16:58
@Mgetz The choice appears to have been made in 1999, when `sysdeps/generic/stdint.h` was written for `stdint.h` of C99. Prior to that, `sysdeps/wordsize-64/inttypes.h` defined `int_fast32_t` as `int`. Afterwards, all `int_fastX_t` were defined as `long int` for 64 bit systems by default (those that have single word width unlike x86-64). That may make sense for some 64 bit systems. No x86-64 processors would be released until 2003. – eerorika Aug 26 '20 at 17:09
@eerorika fair, and in all honesty looking at agner's tables it really only matters for mul and div instructions by and large anyway. Although the throughput is definitely better for 32bit operands from a port perspective even on add and sub instructions. – Mgetz Aug 26 '20 at 17:10
@eerorika I think fast types are stupid anyway since sometimes you are limited by compute, sometimes by memory bandwidth/cache, so it is impossible to determine the "fast" type in general case. – NoSenseEtAl Aug 26 '20 at 17:17
@NoSenseEtAl They can be reasonable choice when you haven't yet measured whether a specific integer type would be significantly faster for a particular use case. Publishing an API before measuring whether specific integer size is appropriate would not be ideal. If one cannot change the type in a public API, then "fast" alias isn't appropriate for the API. – eerorika Aug 26 '20 at 17:20
1

@NoSenseEtAl depends on platform. They were designed for platforms like Alpha AXP where the 'fast' type is actually native word and everything else is stupid slow because it takes multiple instructions. [Raymond Chen's blog about it](https://devblogs.microsoft.com/oldnewthing/20170808-00/?p=96775) – Mgetz Aug 26 '20 at 17:21
Re: "a fixed width element which is alway same on all systems" -- except for the ones where it doesn't exist. The fixed-width type aliases are all **optional**. The `...fast...` and `...least...` variations are all required. – Pete Becker Aug 26 '20 at 17:25
@eerorika why do you suggest uint_least32_t instead of uint32_t? – NoSenseEtAl Aug 26 '20 at 17:27
1

@NoSenseEtAl See the sentence following that suggestion. Also (essentially) repeated with more detail by Pete Becker in the comment above. – eerorika Aug 26 '20 at 17:28
@eerorika ah in theory there are systems where uint32_t does not exist... wonderful :) I will just make my code platform specific and use uint32_t – NoSenseEtAl Aug 26 '20 at 17:30

Why do STD implementations of mt19937 have double sizeof as boost version?

1 Answers1