TL;DR:
If you just want the position of the highest 1 bit, then log2() is the wrong tool: it's the slowest and most error-prone way to do it. See the solution at the end.
log2() takes a double, but long and long long on your platform have 64 bits of precision, which is far more than a double can store: it's most likely IEEE-754 binary64, which has only 53 significant bits. The closest double to ULLONG_MAX is ULLONG_MAX + 1.0 = 2^64, so obviously log2(ULLONG_MAX) = log2(2^64) = 64. You can never get 63 by going through double like that.
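To see where double starts losing integers, here is a minimal sketch. The helper name survives_double is made up for illustration, and the code assumes a 64-bit unsigned long long and an IEEE-754 binary64 double, as on typical x86-64 platforms.

```c
#include <assert.h>
#include <limits.h>

/* Assumption: unsigned long long is 64 bits wide and double is binary64. */
static_assert(sizeof(unsigned long long) * CHAR_BIT == 64,
              "sketch assumes 64-bit unsigned long long");

/* Hypothetical helper: returns 1 if x converts to double and back unchanged. */
int survives_double(unsigned long long x)
{
    double d = (double)x;             /* rounds to the nearest double */
    if (d >= 18446744073709551616.0)  /* rounded up to 2^64, converting back would overflow */
        return 0;
    return (unsigned long long)d == x;
}
```

Under those assumptions survives_double(1ULL << 53) returns 1, while survives_double((1ULL << 53) + 1) and survives_double(ULLONG_MAX) both return 0, because those values need more than 53 significant bits.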
If you want the base-2 logarithm of such numbers, then you'll need a type with more precision, such as long double on some platforms, and also a good log2 library (see below for why that matters). On x86, long double is usually the 80-bit extended-precision format with 64 significant bits, so it can store ULLONG_MAX exactly.
#include <stdio.h>
#include <math.h>
#include <quadmath.h>   /* GCC's libquadmath; link with -lquadmath */
#include <limits.h>
#include <float.h>

int main(void)
{
    /* Sizes and precision of the types involved */
    printf("sizeof(long) = %zu\n", sizeof(long));
    printf("sizeof(long long) = %zu\n", sizeof(long long));
    printf("sizeof(double) = %zu\n", sizeof(double));
    printf("sizeof(long double) = %zu\n", sizeof(long double));
    printf("double has %d significant bits\n", DBL_MANT_DIG);
    printf("long double has %d significant bits\n", LDBL_MANT_DIG);
    printf("-----------------------------------------------------\n");

    /* What the maximum values look like after conversion */
    printf("ULONG_MAX = %lu\n", ULONG_MAX);
    printf("ULLONG_MAX = %llu\n", ULLONG_MAX);
    printf("(double)ULONG_MAX = %f\n", (double)ULONG_MAX);
    printf("(double)ULLONG_MAX = %f\n", (double)ULLONG_MAX);
    printf("(long double)ULONG_MAX = %Lf\n", (long double)ULONG_MAX);
    printf("(long double)ULLONG_MAX = %Lf\n", (long double)ULLONG_MAX);
    printf("-----------------------------------------------------\n");

    /* Truncated base-2 logarithm at each precision */
    printf("ul_int (double):\t\t\t%d\n", (int)log2(ULONG_MAX));
    printf("ull_int (double):\t\t\t%d\n", (int)log2(ULLONG_MAX));
    printf("ul_int (long double):\t\t\t%d\n", (int)log2l((long double)ULONG_MAX));
    printf("ull_int (long double):\t\t\t%d\n", (int)log2l((long double)ULLONG_MAX));
    printf("ul_int (18446744073709551615.0L):\t%d\n",
           (int)log2l(18446744073709551615.0L));
    /* __float128 and the q literal suffix are GNU extensions */
    printf("ul_int (__float128):\t\t\t%d\n", (int)log2q((__float128)ULONG_MAX));
    printf("ull_int (__float128):\t\t\t%d\n", (int)log2q((__float128)ULLONG_MAX));
    printf("ull_int (18446744073709551615.0q):\t%d\n",
           (int)log2q(18446744073709551615.0q));
}
Demo on Godbolt. Sample output:
sizeof(long) = 8
sizeof(long long) = 8
sizeof(double) = 8
sizeof(long double) = 16
double has 53 significant bits
long double has 64 significant bits
-----------------------------------------------------
ULONG_MAX = 18446744073709551615
ULLONG_MAX = 18446744073709551615
(double)ULONG_MAX = 18446744073709551616.000000
(double)ULLONG_MAX = 18446744073709551616.000000
(long double)ULONG_MAX = 18446744073709551615.000000
(long double)ULLONG_MAX = 18446744073709551615.000000
-----------------------------------------------------
ul_int (double): 64
ull_int (double): 64
ul_int (long double): 64
ull_int (long double): 64
ul_int (18446744073709551615.0L): 64
ul_int (__float128): 63
ull_int (__float128): 63
ull_int (18446744073709551615.0q): 63
Notice that ULONG_MAX can't be represented in double precision, as I mentioned previously. But also notice that even with long double we get log2l(18446744073709551615.0L) = 64! Only __float128, which is libquadmath's IEEE-754 quadruple-precision type, works. Why? Because log and other transcendental functions are very complex and IEEE-754 doesn't require them to be faithfully rounded, so implementations are allowed to use much faster algorithms that may return some results with a 1 ULP error. The result on Godbolt above is from glibc, so you may need to find a better log2 library, as I said above.
Update:
As chux commented below, the result might be faithfully rounded in this case, but unfortunately the closest long double value to log2(18446744073709551615) = 63.999999999999999999921791345121706111... is 64.0L. That means you still need higher precision to get the expected output.
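A rough back-of-the-envelope check of why 64.0L is the nearest long double, assuming the x86 extended format with a 64-bit significand and IEEE-754 binary128 with a 113-bit significand:

```latex
\begin{align*}
\log_2\!\left(2^{64}-1\right)
  &= 64 + \log_2\!\left(1 - 2^{-64}\right)
   \approx 64 - \frac{2^{-64}}{\ln 2}
   \approx 64 - 7.82 \times 10^{-20} \\
\text{spacing of long double in } [32, 64)
  &= 2^{5-63} = 2^{-58} \approx 3.47 \times 10^{-18} \\
\text{spacing of binary128 in } [32, 64)
  &= 2^{5-112} = 2^{-107} \approx 6.16 \times 10^{-33}
\end{align*}
```

The gap to 64 (about 7.82e-20) is far less than half the long double spacing, so the exact result rounds to 64.0L. In quadruple precision the spacing is much smaller than the gap, so a value just below 64 is representable and truncates to 63.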
But you're probably going about this the wrong way. If you just want the position of the highest 1 bit, then never use log2()! It's very slow and prone to floating-point errors like the ones above. Most architectures have an instruction that gives the result in one or a few cycles. In C++20 just use std::bit_width(x), which returns floor(log2(x)) + 1 for nonzero x, or the equivalent
return std::numeric_limits<T>::digits - std::countl_zero(x);
In older C++ versions you can use boost::multiprecision::msb(x) or, for compile-time constants, boost::static_log2<x>::value. In C you'll need implementation-specific solutions.
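As one possible sketch in C: GCC and Clang provide the __builtin_clzll builtin (count leading zeros), which gives the highest set bit directly. The function name highest_bit_index is hypothetical, and the shift loop is only a slow portable fallback for other compilers, not a tuned implementation.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical helper: zero-based index of the highest set bit,
   i.e. floor(log2(x)). x must be nonzero. */
int highest_bit_index(unsigned long long x)
{
    assert(x != 0);
#if defined(__GNUC__) || defined(__clang__)
    /* GCC/Clang builtin: number of leading zero bits, undefined for x == 0. */
    return (int)(sizeof x * CHAR_BIT) - 1 - __builtin_clzll(x);
#else
    /* Portable fallback: shift right until nothing is left. */
    int index = -1;
    while (x != 0) {
        x >>= 1;
        ++index;
    }
    return index;
#endif
}
```

With a 64-bit unsigned long long, highest_bit_index(ULLONG_MAX) is 63 and highest_bit_index(1) is 0, with no floating point involved.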
There are also other fast bitwise solutions.