flt32 flt32_abs (flt32 x) {
    int mask=x>>31;

    printMask(mask,32);
    puts("Original");
    printMask(x,32);

    x=x^mask;

    puts("after XOR"); 
    printMask(x,32);

    x=x-mask;

    puts("after x-mask");
    printMask(x,32);
    return x;
}

Here's my code. Calling the function on the value -32 returns 0.125. I'm confused because it's a pretty straightforward formula for abs on bits, but I seem to be missing something. Any ideas?

Eric J.
  • Is `flt32` some sort of nonstandard floating point format on which bit shifting is valid? See also http://stackoverflow.com/q/1723575/2564301 – Jongware Feb 06 '16 at 00:00
  • Jongware, coincidentally, I've linked to that question in the answer below as well. I second your question about the `flt32` type. – iksemyonov Feb 06 '16 at 00:02

2 Answers


Is flt32 a type for floating point or fixed point numbers?

I suspect it's a type for fixed point arithmetic and you are not using it correctly. Let me explain it.

A fixed-point number uses, as the name says, a fixed position for the radix point; this means it uses a fixed number of bits for the fractional part. It is, in fact, a scaled integer.

I guess the flt32 type you are using uses the most significant 24 bits for the integer part and the least significant 8 bits for the fractional part; the real value of a 32-bit representation is its value as an integer divided by 256 (i.e. 2^8).

For example, the 32-bit number 0x00000020 is 32 when interpreted as an integer. As a fixed-point number with 8 bits for the fractional part, its value is 0.125 (= 32/256).
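To make the scaling concrete, here is a minimal sketch that assumes the 24.8 layout guessed above. The names `FX_FRAC_BITS` and `fx_to_double` are made up for illustration and do not come from the original code.

```c
#include <stdint.h>

/* Assumption: 8 fractional bits, as guessed in this answer. */
#define FX_FRAC_BITS 8

/* Hypothetical helper: interpret a raw 32-bit word as a fixed-point value
   by dividing it by 2^FX_FRAC_BITS (256). */
static double fx_to_double(int32_t raw) {
    return (double)raw / (1 << FX_FRAC_BITS);
}
```

With these definitions, `fx_to_double(0x00000020)` evaluates to 0.125 and `fx_to_double(0x00002000)` evaluates to 32.0.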

The code you posted is correct but you are not using it correctly.

The number -32 encoded as a fixed-point number with 8 fractional bits is 0xFFFFE000, which is the integer representation of -8192 (= -32*256). The algorithm correctly produces 8192, which is 0x00002000 (= 32*256); interpreted as fixed-point, this is 32.

If you pass -32 to the function without encoding it as fixed-point first, the function correctly converts it to 32 and returns that value. But 32 (0x00000020) is 0.125 (= 32/256 = 1/8) when interpreted as fixed-point, which is what I assume printMask() does.
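The two cases can be compared side by side in a short sketch. It assumes that `flt32` is a plain 32-bit signed integer with 8 fractional bits and that `>>` on a negative value is an arithmetic shift (implementation-defined in C, but what mainstream compilers do). `fx_abs` is an illustrative name for the question's function with the printing removed.

```c
#include <stdint.h>

/* Assumption: flt32 is a 32-bit signed integer holding a 24.8 fixed-point value. */
typedef int32_t flt32;

/* Same XOR/subtract trick as in the question.
   Relies on >> being an arithmetic shift for negative values. */
static flt32 fx_abs(flt32 x) {
    flt32 mask = x >> 31;       /* 0 for x >= 0, all ones (-1) for x < 0 */
    return (x ^ mask) - mask;   /* x unchanged, or (~x) + 1 == -x */
}
```

Encoded correctly, -32 is the raw value -8192: `fx_abs(-8192)` returns 8192, and 8192/256 is 32. Passing the unencoded integer gives `fx_abs(-32) == 32`, and 32/256 is 0.125, which matches the result reported in the question.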

How can you test the code correctly?

You probably have a function that creates fixed-point numbers from integers. Use it to get the correct representation of -32 and pass that value to the flt32_abs() function.

If you don't have such a function, it is easy to write one. Just multiply the integer by 256. For non-negative values a left shift by 8 bits is equivalent, but left-shifting a negative signed integer is undefined behavior in C, so multiplication is the safer choice (compilers typically turn it into a shift anyway):

flt32 int_to_fx32(int x)
{
    return x * 256;
}

Fixed-point libraries often use macros for such conversions to avoid function-call overhead. Expressed as a macro, it looks like this:

#define int_to_fx32(x) ((x) * 256)

Now you do the test:

flt32 negative = int_to_fx32(-32);
flt32 positive = flt32_abs(negative);
// This should print 32
printMask(positive, 32);

// This should print 8192
printf("%d", positive);
// This should print -8192
printf("%d", negative);

// This should print 0.125
printMask(32, 32);
axiac
int flt32_abs (int x) {
^^^            ^^^
    int mask=x>>31;
    x=x^mask;
    x=x-mask;
    return x;
}

I was able to fix this and obtain the result of 32 by changing float to int; otherwise the code fails to build with the error:

error: invalid operands of types 'float' and 'int' to binary 'operator>>'

For an explanation of why bitwise operations on floats are not allowed in C++, see the question "How to perform a bitwise operation on floating point numbers".
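If `flt32` really were an IEEE 754 single-precision `float`, the integer trick would not apply directly, but the sign could still be cleared by copying the bits into an integer first. The sketch below assumes a 32-bit IEEE 754 `float`; `float_abs_bits` is an illustrative name, and in real code `fabsf` from `<math.h>` does the same job.

```c
#include <stdint.h>
#include <string.h>

/* Assumption: float is a 32-bit IEEE 754 single-precision value. */
static float float_abs_bits(float x) {
    uint32_t bits;
    memcpy(&bits, &x, sizeof bits);   /* well-defined way to read the bit pattern */
    bits &= 0x7FFFFFFFu;              /* clear the sign bit (bit 31) */
    memcpy(&x, &bits, sizeof x);      /* copy the modified pattern back */
    return x;
}
```

Under that assumption, `float_abs_bits(-32.0f)` returns `32.0f`, and positive inputs come back unchanged.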

I would like to ask more experienced developers, why did the code even build for OP? Relaxed compiler settings, I guess?

Community
iksemyonov