This code computes the ULP of a number, assuming an IEEE-754 binary format:
typedef double FloatType;
/* Return the ULP of q.
This was inspired by Algorithm 3.5 in Siegfried M. Rump, Takeshi Ogita, and
Shin'ichi Oishi, "Accurate Floating-Point Summation", _Technical Report
05.12_, Faculty for Information and Communication Sciences, Hamburg
University of Technology, November 13, 2005.
Since this uses subnormals, it may have poor performance.
*/
FloatType ULP(FloatType q)
{
static const FloatType Epsilon = std::numeric_limits<FloatType>::epsilon();
static const FloatType Minimum = std::numeric_limits<FloatType>::denorm_min();
/* Scale is .75 ULP, so multiplying it by any significand in [1, 2) yields
something in [.75 ULP, 1.5 ULP) (even with rounding).
*/
static const FloatType Scale = 0.75 * Epsilon;
q = std::fabs(q);
/* In fmaf(q, -Scale, q), we subtract q*Scale from q, and q*Scale is
something more than .5 ULP but less than 1.5 ULP. That must produce q
- 1 ULP. Then we subtract that from q, so we get 1 ULP.
The significand 1 is of particular interest. We subtract .75 ULP from
q, which is midway between the greatest two floating-point numbers less
than q. Since we round to even, the lesser one is selected, which is
less than q by 1 ULP of q, although 2 ULP of itself.
*/
return std::fmax(Minimum, q - std::fma(q, -Scale, q));
}