how to implement exponentiation for fixed-point numbers in c

Question

I need to raise a fixed-point number to the third and fifth power, but the standard pow method doesn't work. what to do in this situation

As @largest_prime_is_463035818 says; plus if `std::pow` is not working for you then you probably have precision issues see https://stackoverflow.com/questions/588004/is-floating-point-math-broken — Richard Critten, Apr 28 '21 at 11:13
Implement multiplication properly first, then use repeated multiplication. — molbdnilo, Apr 28 '21 at 11:25
Does this answer your question? [Fast fixed point pow, log, exp and sqrt](https://stackoverflow.com/questions/4657468/fast-fixed-point-pow-log-exp-and-sqrt) — L. Scott Johnson, Apr 28 '21 at 11:31
@Ciganeshima C++ does not have built-in support for fixed-point (fixed-precision) numbers; we either have integer or floating point types. For true fixed-point (precision) you either need to roll your own; scale the values and use integer arithmetic; or use an external fixed-precision maths library. — Richard Critten, Apr 28 '21 at 11:39

John McFarlane · Answer 1 · 2021-04-30T09:14:18.057

(Note that when I answered this question, it was tagged with C++. The basic techniques still work for C but the library solution does not.)

Given a wide-enough integer type, raise the underlying integer by the necessary power, and return it as a fixed-point type with number of fractional digits multiplied by the same power.

E.g. for x^3, if the input, x, is a std::int32_t integer representing a 15.16 fixed-point number (i.e. with 16 fractional digits), then the result will be a 94-bit-wide integer with value x*x*x, representing a 45.48 fixed-point number. And for x^5, the result will be a 156-bit-wide integer with value x*x*x*x*x, representing a 75.80 fixed-point number. These results will be lossless.

Unfortunately, even with extended fundamental integer type, __int128_t, you cannot represent a 156-bit value. You could, instead, follow up each multiplication with a scaling conversion back to the input type. This may lose precision or exceed the range of the type. But with fixed-point arithmetic, it's incumbent on the user to ensure their type is wide enough. (Fixed-point is, after all, just a general case of integer arithmetic.)

Here's an example of a pow function using the multiply/scale alternation using raw integers:

// integer representing a signed 15.16 fixed-point number
using fp15_16 = std::int32_t;

constexpr auto to_fp{65536};
constexpr auto from_fp{1./65536};

fp15_16 pow_fixed(fp15_16 x, int y)
{
    assert(y>=0);
    return (y == 0) ? to_fp : ((std::int64_t{x} * pow_fixed(x, y-1)) * from_fp);
}

int main()
{
    // value 3.5 represented manually as fixed-point
    auto x{fp15_16(3.5 * to_fp)};
    std::cout << (pow_fixed(x, 3) * from_fp) << '\n'; 
    std::cout << (pow_fixed(x, 5) * from_fp) << '\n'; 
}

A fixed-point library can manage the widening and conversion (example) and keep an eye out for overflow (example).

@Spektre it's an overload I define in the global namespace. So long as I don't include , it should be the only such name. However, probably best to avoid confusion/collision here so renamed to `pow_fixed`, thanks. — John McFarlane, Apr 29 '21 at 12:53

score 1 · Answer 2 · answered Apr 28 '21 at 11:32

a^b = a * a ... * a

b times. As a result, doing something like

double prod = 1;
for (int power = 0; power < b; power++) prod *= a;

Now, you can reduce the number of steps. If b happens to be the integer result of a logarithm on base 2, then you can:

double prod = a;
for (int power = 1; power < log2b; power++) prod *= prod;

However, that's an unreliable assumption, so let' assume that you have lo2b as the largest integer power of 2 that's smaller than b. In that case you can do:

double prod = a;
for (int power = 1; power < log2b; power++) prod *= prod;
for (int furtherPower = power; furtherPower < b; furtherPower++) prod *= a;

and finally, you could first calculate the power up to the largest integer logarithm of b, then decrease b accordingly and repeat the process for the new b, until you finish the operation.

One might say that computing a logarithm is very costly. True. Yet, one can precompute an array of integer logarithms and reuse it again and again for performance optimization.

Finally, there is a problem with the boundaries data types have and having values to compute that are higher than the given boundaries. In that case one will need to use some data structures that support larger values. As this is not part of this discussion, I will not get into details.

EDIT: The problem of trailing zeroes

If you have issues with the limitations of the decimals, then you can multiply your numbers, resulting in integers and then, at the end divide by the number you need in order to achieve the correct numbers.

i think the crux of the question is the fixed point. Multiplying the number and then dividing the result is unfortuantely not that easy. Consider `0.11`^5. Calculating 11^5 is fine, but then you need to divide the result by `10000000000` which exceeds `MAX_INT` — 463035818_is_not_an_ai, Apr 28 '21 at 12:01
Yeah, the delicate part would be the overflow detection though. That's not easy because it involves checking the nth root of `std::numeric_limits::max()` — MatG, Apr 28 '21 at 12:11
If the result is smaller than the base, then we do not need to check nth root. We could check that as the first validator. However, this depends on the boundaries of the task. If we have a bignumber issue, then we need to implement a more complicated issue, of course. — Lajos Arpad, Apr 28 '21 at 12:42
its not a "big number issue". It becomes a big number issue when the naive way of transforming fixed point to integer and the result back to fixed point is applied. — 463035818_is_not_an_ai, Apr 28 '21 at 14:26
@largest_prime_is_463035818 (nice name!) My point was that we might end up having a big number issue. Your scenario was also addressed by my previous comment. — Lajos Arpad, Apr 28 '21 at 15:18

score 1 · Answer 3 · edited Aug 14 '22 at 20:50

For integer exponents you can do power by squaring. However in case your exponent is also fixed point you have to use different approach like log,exp.

For more info on both see (in there is all you need):

Power by squaring for negative exponents

Now with fixed point you have to take in mind you need to correct your numbers back to their scale after each operation:

n = bits_after_decimal_point
m = 1<<n
Integer(a) = Real(x*m)
Real(x) = Integer(a)/m
x*m + y*m = (x+y)*m         -> (a+b)
x*m - y*m = (x-y)*m         -> (a-b)
x*m * y*m = (x*y)*m*m       -> (a*b)>>n 
x*m / y*m = (x/y) + (x%y)/y -> (a/b)<<n + ((a%b)<<n)/b

so +,- are the same however *,/ needs to shift the result by n bits and also require more bits to store the subresult so you need

16bit*16bit = 32bit
32bit/16bit = 16bit
32bit%16bit = 16bit

The easiest way around this is temporarily use 32bit variables during *,/ operations or use CPU instructions which usually covers this natively.

If that is not an option as you can see above the division can be partially done on 16bit directly using modulo (and or iterate or rewrite to your own divider) however for multiplication you can use naive or Karatsuba multiplications. In the log,exp sublink in the linked answer is C++ implementation with debug info.

Also take a look at this:

Can't make value propagate through carry

It's my C++ 32bit ALU implementation I use as a building block for bignum arithmetics. If you pay attention to the mul,div operations they are in 2 variants one using x86 CPU instruction set and the other native C++ to do exactly what you need ...

If you want to use the floating point pow from math (it will be slow) to compute z=x^y with Q1.15 x,z and int y then try this:

z = float(pow(float(x)/32768.0,y)*32768.0);

However that negates all advantages of using fixed point math !!!

how to implement exponentiation for fixed-point numbers in c

3 Answers3