It is well known that the built-in C++ types uint32_t, int32_t, uint64_t, and int64_t, and even the GCC/Clang built-in types __int128 and unsigned __int128, all have a sizeof exactly equal to their bit-width divided by 8.
But if you print the sizeof of Boost's boost::multiprecision::uint256_t or uint512_t, you get 48 bytes instead of 32 for uint256_t, and 80 bytes instead of 64 for uint512_t. Both types have a sizeof exactly 16 bytes larger than expected. See demo here.
However, sizeof of boost::multiprecision::uint128_t gives exactly 16 bytes, as expected.
It turns out that cpp_int_base, the base class of all Boost integers, has several fields:
data_type m_data;
unsigned m_limbs;
bool m_sign, m_internal, m_alias;
Only the m_data field contains the bits of the integer; the other fields, together with alignment padding, account for the extra 16 unnecessary bytes of sizeof.
My question: is it possible to tweak a Boost multiprecision integer somehow so that it contains only the data bits and nothing else?
In other words, it would keep the sign (for a signed integer) the same way an Intel CPU keeps it inside int64_t: the highest bit is the sign bit and the remaining bits are in two's complement form. That way the Boost integer would be encoded exactly like the native Intel uint64_t and int64_t.
If you look at the signature of the cpp_int_base template, you see:
template <unsigned MinBits, unsigned MaxBits, cpp_integer_type SignType,
cpp_int_check_type Checked, class Allocator, bool trivial = false>
struct cpp_int_base;
Apparently the trivial parameter seems to almost do what is needed: the specialization of cpp_int_base with trivial = true contains just two fields:
local_limb_type m_data;
bool m_sign;
So it is just 1 byte bigger than the minimal possible sizeof.
Moreover, the 128-bit integer has trivial = true by default, while bigger integers have trivial = false.
But there is no way to control the trivial template parameter, because if you look at the uint256_t definition you see:
using uint256_t = number<cpp_int_backend<256, 256, unsigned_magnitude, unchecked, void>>;
and cpp_int_backend has no trivial parameter among its template parameters; only cpp_int_base has it. But cpp_int_base is not accessible to the user, as it is an internal detail of the library.
Also, I don't understand how the 128-bit integer manages to have exactly 16 bytes of sizeof, because as I showed above even the trivial specialization has the extra bool m_sign; field, which should add at least 1 byte (i.e. a sizeof of 17). But somehow the 128-bit integer is 16 bytes, not 17 as expected.
Why do I need a Boost integer with exactly the minimal number of bits? Because in my program I have millions of integers in an array. Besides regular math I do my own special math operations on these integers. My operations work on the usual Intel representation of integers, the same one int64_t and uint64_t use. But sometimes I also need regular operations like + - * / % ^ | ~, and to avoid implementing them myself I decided to use the Boost multiprecision library.
If Boost had exactly the same representation as Intel, I could just do reinterpret_cast<boost::multiprecision::uint256_t &>(array[i]) *= 12345; without any intermediate conversion or memcpy. But since Boost uses a different format, I have to write a custom conversion back and forth.
A single ^ operation, for example, takes 1-4 CPU cycles for a 256-bit integer. A conversion to/from the Boost format would add 5-10 more cycles, which is a really big overhead.
Thus I need this trivial Intel format for Boost integers as an optimization, so that I don't have to do a conversion on every single operation.
One more, less important, reason is that somewhere in my templated code I need to figure out the bit-width of a number of templated type T. If the format is always the trivial Intel one, then sizeof(T) * 8 gives the exact bit width of the number, while for the Boost format I would need to specialize some helper template like BitWidthOf<T>::value.