__float128 rounding

Question

How can I round a __float128 in C++ to get an __int128? I found some rounding functions in quadmath.h, but their result is either long long (or something even shorter) or an integer stored in a __float128. This question isn't a duplicate of "Why do round() and ceil() not return an integer?" because I use 128-bit numbers and casting doesn't work for them.

The maximum of `__int128` is `2^127 - 1`, which is much smaller than some large `__float128` values, hence `round(__float128)` needs to return a `__float128` — Danh, Sep 11 '16 at 15:45
Possible duplicate of [Why do round() and ceil() not return an integer?](http://stackoverflow.com/questions/1253670/why-do-round-and-ceil-not-return-an-integer) — Danh, Sep 11 '16 at 15:46
But I need the result to be an __int128, and I will only use it for __float128 values that can also be represented as __int128. In that linked question the solution is a cast, but that doesn't work for 128-bit numbers: it always returns 0. — Martin Schmied, Sep 11 '16 at 16:42
I can't try it, but just assigning one to the other doesn't work? — MikeMB, Sep 11 '16 at 16:47
Yes, it actually works. I had another error in my program, which made it seem like the assignment or cast always produced 0. Thanks for finding this error, even though it is a different problem than I thought. — Martin Schmied, Sep 11 '16 at 17:10
@MartinSchmied: Then I'd suggest, you either modify the question accordingly or delete it altogether — MikeMB, Sep 11 '16 at 21:27
`__float128` can represent values up to ≈ 1.1897 × 10^4932, which `__int128` cannot represent: its range is `-2^127` to `2^127 - 1`. This question is an exact duplicate of my suggestion. You need to use `roundf128`, then check that the value is at least `-2^127` and less than `2^127`; if so, make a cast — Danh, Sep 12 '16 at 01:30

Danh · Answer 1 · 2016-09-12T02:05:37.130

__int128 can only represent integers in the range -2¹²⁷ to 2¹²⁷ - 1.

__float128 can represent values up to 2¹⁶³⁸⁴ - 2¹⁶²⁷¹ ≈ 1.1897 × 10⁴⁹³², which is much bigger than the largest __int128.

You need to:

1. Use roundq to get the rounded value, still as a __float128.
2. Check that the value lies in the range [-2¹²⁷, 2¹²⁷). The lower bound is the minimum of __int128 and the upper bound is 1 past its maximum; both are represented exactly in a __float128 because they are powers of 2.
3. If it is in that range, cast it to __int128.
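The range check and cast can be sketched as follows, assuming GCC or Clang on a 64-bit target where `__int128` and `__float128` are available. `round_to_int128` is a hypothetical helper, not a libquadmath function. To avoid a link dependency on libquadmath, this sketch does not call roundq: it emulates the same round-half-away-from-zero behaviour with a truncating cast and a check of the fractional part. It uses the `__int128` range [-2^127, 2^127).

```cpp
// Hypothetical helper, not part of libquadmath: rounds x to the nearest
// integer (halfway cases away from zero, like roundq) and stores it in out.
// Returns false if the result does not fit in __int128 or x is NaN.
bool round_to_int128(__float128 x, __int128& out) {
    // 2^127 is a power of two, so it is exact in __float128.
    const __float128 two127 =
        static_cast<__float128>(static_cast<unsigned __int128>(1) << 127);

    // Range check. Every __float128 with magnitude >= 2^112 is already an
    // integer, so rounding cannot move x across either bound and x can be
    // tested directly. Comparisons with NaN are false, so NaN is rejected.
    if (!(x >= -two127 && x < two127)) {
        return false;
    }

    // In range, so this cast is well defined; it truncates toward zero.
    __int128 t = static_cast<__int128>(x);

    // Fractional part left over by truncation; it has the sign of x.
    const __float128 frac = x - static_cast<__float128>(t);
    const __float128 half = 0.5;

    // Halfway cases go away from zero. A nonzero fraction implies
    // |x| < 2^112, so these adjustments cannot overflow.
    if (frac >= half) {
        t += 1;
    } else if (frac <= -half) {
        t -= 1;
    }
    out = t;
    return true;
}
```

With this sketch, an input of 2.5 stores 3 and an input of -2.5 stores -3, while an input of 2^127 makes the function return false. If libquadmath is linked anyway, calling roundq first and casting its in-range result should give the same values.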

Alternatively, the GCC documentation lists llroundq ("round to nearest integer value away from zero"), but it returns long long, not __int128. For input that is too large, the libquadmath source code says:

  else
    {
      /* The number is too large.  It is left implementation defined
         what happens.  */
      return (long long int) x;
    }
