Getting absolute value from a binary int using bit arithmetics

Question

flt32 flt32_abs (flt32 x) {
    int mask=x>>31;

    printMask(mask,32);
    puts("Original");
    printMask(x,32);

    x=x^mask;

    puts("after XOR"); 
    printMask(x,32);

    x=x-mask;

    puts("after x-mask");
    printMask(x,32);
    return x;
}

Here's my code. Calling the function on the value -32 returns .125. I'm confused because it's a pretty straightforward formula for abs on bits, but I seem to be missing something. Any ideas?

Is `flt32` some sort of nonstandard floating point format on which bit shifting is valid? See also http://stackoverflow.com/q/1723575/2564301 — Jongware, Feb 06 '16 at 00:00
Jongware, by coincidence, I've linked to that question in the answer below as well. I second your question about the `flt32` type. — iksemyonov, Feb 06 '16 at 00:02

score 2 · Answer 1 · answered Feb 06 '16 at 01:07

Is flt32 a type for floating point or fixed point numbers?

I suspect it's a type for fixed point arithmetic and you are not using it correctly. Let me explain it.

A fixed-point number uses, as the name says, a fixed position for the radix point; this means it uses a fixed number of bits for the fractional part. It is, in fact, a scaled integer.

I guess the flt32 type you are using uses the most significant 24 bits for the whole part and the least significant 8 bits for the fractional part; the real value of a 32-bit representation is the value of the same 32 bits read as an integer, divided by 256 (i.e. 2⁸).

For example, the 32-bit number 0x00000020 is 32 when interpreted as an integer. As a fixed-point number using 8 bits for the fractional part, its value is 0.125 (=32/256).
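The scaling described above can be sketched in a few lines of C. This is only an illustration under the answer's guess that `flt32` is a 24.8 fixed-point type; the names `fx24_8`, `fx_from_int` and `fx_to_double` are hypothetical and do not come from the question's code base.

```c
#include <stdint.h>

/* Hypothetical 24.8 fixed-point type: 24 integer bits, 8 fractional bits.
   The layout is an assumption taken from the answer's guess about flt32. */
typedef int32_t fx24_8;

#define FX_FRAC_BITS 8
#define FX_ONE (1 << FX_FRAC_BITS)   /* 256, the scale factor 2^8 */

/* Encode an integer as fixed-point by scaling it with 2^8.
   Multiplication is used so that negative inputs stay well defined. */
static fx24_8 fx_from_int(int32_t n)
{
    return n * FX_ONE;
}

/* Decode a raw bit pattern into the real value it represents. */
static double fx_to_double(fx24_8 raw)
{
    return (double)raw / FX_ONE;
}
```

With these helpers, `fx_to_double(0x20)` gives 0.125, while `fx_from_int(32)` gives the raw value 0x2000, which decodes back to 32.0.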

The code you posted is correct but you are not using it correctly.

The number -32 encoded as a fixed-point number with 8 fractional bits is 0xFFFFE000, which is the integer representation of -8192 (=-32*256). The algorithm correctly produces 8192, which is 0x00002000 (=32*256); this is also 32 when it is interpreted as fixed-point.

If you pass -32 to the function without taking care to encode it as fixed-point, it correctly converts it to 32 and returns this value. But 32 (0x00000020) is 0.125 (=1/8=32/256) when it is interpreted as fixed-point (which is what I assume the function printMask() does).
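The two cases can be compared side by side with a small sketch. It reuses the question's shift/XOR/subtract formula on a plain `int32_t`; the helper names `raw_abs` and `as_fixed` are made up for this illustration, and the 24.8 interpretation is still only the assumption made in this answer.

```c
#include <stdint.h>

/* Branch-free absolute value on the raw 32-bit pattern.
   Assumes two's complement and an arithmetic right shift for negative
   values (implementation-defined in C, but common in practice).
   INT32_MIN is not handled: its absolute value does not fit in int32_t. */
static int32_t raw_abs(int32_t x)
{
    int32_t mask = x >> 31;    /* 0 for x >= 0, -1 (all ones) for x < 0 */
    return (x ^ mask) - mask;  /* negative x: ~x + 1; otherwise x unchanged */
}

/* Interpret a raw pattern as 24.8 fixed-point (assumed layout). */
static double as_fixed(int32_t raw)
{
    return raw / 256.0;
}
```

Passing the properly encoded value, `as_fixed(raw_abs(-8192))` yields 32.0. Passing the unencoded integer, `as_fixed(raw_abs(-32))` yields 0.125, which matches the result reported in the question.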

How can you test the code correctly?

You probably have a function that creates fixed-point numbers from integers. Use it to get the correct representation of -32 and pass that value to the flt32_abs() function.

In case you don't have such a function, it is easy to write one. Just multiply the integer by 256. (A left shift by 8 bits gives the same result for non-negative values, but left-shifting a negative value is undefined behavior in C, so multiplication is the safer choice here.)

flt32 int_to_fx32(int x)
{
    return x * 256;
}

Fixed-point libraries often use macros for such conversions to avoid function-call overhead. Expressed as a macro, it looks like this:

#define int_to_fx32(x) ((x) * 256)

Now you do the test:

flt32 negative = int_to_fx32(-32);
flt32 positive = flt32_abs(negative);
// This should print 32
printMask(positive, 32);

// This should print 8192
printf("%d\n", positive);
// This should print -8192
printf("%d\n", negative);

// This should print 0.125
printMask(32, 32);

score 0 · Answer 2 · edited May 23 '17 at 11:45

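int flt32_abs (int x) {   /* return and parameter types changed to int */
    int mask=x>>31;       /* all ones if x is negative, zero otherwise */
    x=x^mask;             /* flips every bit when x is negative */
    x=x-mask;             /* adds 1 when x is negative */
    return x;
}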

I was able to fix this and obtain the result of 32 by changing the type from float to int. Without that change, the code fails to build with the error:

error: invalid operands of types 'float' and 'int' to binary 'operator>>'

For an explanation of why bitwise operations on floats are not allowed in C++, see

How to perform a bitwise operation on floating point numbers
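For the case where the value really is a `float`, one common workaround is to copy its bytes into an unsigned integer, manipulate the bits there, and copy them back. The following is a minimal sketch that assumes `float` is a 32-bit IEEE 754 binary32 value (typical, but not guaranteed by the language); the function name `float_abs_bits` is invented for this example.

```c
#include <stdint.h>
#include <string.h>

/* Absolute value of a float by clearing its sign bit.
   Assumes float is 32-bit IEEE 754 binary32, where bit 31 is the sign. */
static float float_abs_bits(float f)
{
    uint32_t bits;
    memcpy(&bits, &f, sizeof bits);  /* well-defined way to read the bit pattern */
    bits &= 0x7FFFFFFFu;             /* clear the sign bit */
    memcpy(&f, &bits, sizeof f);     /* write the modified pattern back */
    return f;
}
```

Under that assumption, `float_abs_bits(-32.0f)` returns `32.0f`. In ordinary code, the standard `fabsf` from `<math.h>` does the same job without relying on the representation.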

I would like to ask more experienced developers, why did the code even build for OP? Relaxed compiler settings, I guess?

answered Feb 06 '16 at 00:00

iksemyonov

This code assumes `int` is 32-bit 2's complement. Common, but not defined by C. – chux - Reinstate Monica Feb 06 '16 at 19:20
Note: `int mask=x>>31;` --> implementation defined behavior. "An example of implementation-defined behavior is the propagation of the high-order bit when a signed integer is shifted right" C11 §3.4.1 – chux - Reinstate Monica Feb 06 '16 at 19:22
Is the downvote by you? Shall I remove the answer? Until having read the answer above, I had had no idea about "fixed point", since it's sadly not taught at school. – iksemyonov Feb 06 '16 at 20:03
when did you receive the DV? – chux - Reinstate Monica Feb 06 '16 at 20:10
Not sure about that, it was already there a few hours ago, when I first logged in since yesterday. – iksemyonov Feb 06 '16 at 20:12
And, I'm going to brush up on bit representation and arithmetics soon, now seeing my own deficiencies. – iksemyonov Feb 06 '16 at 20:15
Let us [continue this discussion in chat](http://chat.stackoverflow.com/rooms/102805/discussion-between-chux-and-iksemyonov). – chux - Reinstate Monica Feb 06 '16 at 20:15
Learnt to find the timestamp of votes, it was not by chux, but by someone else. My apologies. – iksemyonov Feb 06 '16 at 20:21
