0

How can I compare float values using a series of bitwise operations?

bitmask
  • 32,434
  • 14
  • 99
  • 159
ashna
  • 1
  • 1
  • 1
  • 21
    Why would you want to do that? – Georg Fritzsche Jun 23 '10 at 16:23
  • 3
    Analogically, I might ask: how do I change the spark plugs on my bicycle? – Tim Schaeffer Jun 23 '10 at 17:20
  • 4
    @Tim I disagree with your analogy. There are no spark plugs on a bicycle. There are bits in floats. Georg's comment is much more appropriate. – corsiKa Jun 23 '10 at 18:11
  • 2
    the guy is probably not even listening to what all these people say... – Jens Gustedt Jun 23 '10 at 21:29
  • IEEE floats can be compared as sign/magnitude integers; the magnitude increases monotonically when incrementing the bit-pattern, except for NaN. (The sign bit makes it tricky.) [Compare floating point numbers as integers](https://stackoverflow.com/q/33678827) – Peter Cordes Feb 02 '23 at 18:06

9 Answers

5

Don't. Just... don't. Use ==, or its wild and wacky neighbors > and <. There are even the crazy hybrids, <= and >=! These should cover all of your float-comparison needs.

Update: Never mind, don't use ==. The others should be fine.

Update update: Not using == means you probably shouldn't use <= or >=, either. The moral of the story is that floats are tricksy, which is why you absolutely, definitely shouldn't be trying bitwise operations on them.

JSBձոգչ
  • 40,684
  • 18
  • 101
  • 169
  • 7
    +1, funny, but I wouldn't recommend `==`. Stick to `fabs(x - y) <= epsilon`, where epsilon represents your precision level. – Craig Trader Jun 23 '10 at 16:31
  • Example: 1.0/3.0 - 0.1/0.3 should be 0.0, but instead is -5.5511151231257827e-17 on 64-bit Linux. – Craig Trader Jun 23 '10 at 16:43
  • 3
    If you're using an epsilon value for == (as you should) you should also use it for < and >. Otherwise you put yourself in a position where a < b and withinEpsilon(a,b) are both true. – bshields Jun 23 '10 at 18:21
2

It actually makes sense in some situations to manipulate float numbers bitwise, for example if you are trying to model hardware with a lower or higher bit precision. In those cases you will need to access the bits of the float number. As bta said, the first thing you should know is the IEEE 754 standard, so that you know which bits you are manipulating.

Then you can use a solution such as ShinTakezou's, but I would suggest something a little more sophisticated.

Let's assume we want single precision.

First we declare a structure with the different floating-point fields:

typedef struct s_float {
    /* bit-field layout is implementation-defined; this order matches
       common little-endian ABIs (lowest bits allocated first) */
    unsigned int mantissa : 23;
    unsigned int exponent : 8;
    unsigned int sign : 1;
} my_float_struct;

then we declare a union as so:

union u_float{
    float the_float;
    my_float_struct the_structure;
}my_float;

then we can access a float in the code by doing:

my_float.the_float = <float number>;

The bit fields can then be used, for example, as follows:

printf("%f is %d.%d.%d\n", my_float.the_float,
       my_float.the_structure.sign,
       my_float.the_structure.exponent,
       my_float.the_structure.mantissa);

You can then assign to or modify the fields, or do whatever else you need.

Samica
  • 21
  • 1
1

I don't think using a bitwise operator on a float will do what you think it will do. Before doing so, make sure you're familiar with the IEEE 754 standard, which governs how floating point numbers are represented internally. While reinterpreting the bits this way is a valid operation, it is more than likely not very useful.

What exactly are you trying to accomplish? There is likely a better way to do it.

bta
  • 43,959
  • 6
  • 69
  • 99
0

The domain represented by float numbers doesn't fit well with their bit-level representation and manipulation.

You can easily apply any bitwise operator to the bits of your floats, but you won't obtain anything useful: these operators modify the number in a way that simply doesn't make sense if you want to treat it as a float.

What sense would ANDing two exponents or XORing two mantissas have in practical float operations?

Jack
  • 131,802
  • 30
  • 241
  • 343
0

Floating point layout is strongly platform-dependent. I remember seeing an ugly hack to generate random float numbers using only integer operations, but it is definitely not portable.

Alexandre C.
  • 55,948
  • 11
  • 128
  • 197
  • 1
    That's not correct. Every reasonably modern CPU on this planet uses IEEE754 FP. The only problem is big/little endian. – Axel Gneiting Jun 23 '10 at 16:33
  • 1
    @Axel, you're assuming that your CPU has floating point arithmetic. Most microcontrollers do not. – Craig Trader Jun 23 '10 at 16:50
  • 1
    @Axel: Endianness is part of floating point layout, so floating point layout *is* platform dependent. I've had to deal with this myself (in a messaging protocol). – Tim Schaeffer Jun 23 '10 at 17:17
  • @Craig Even if microcontrollers do not have FP, the runtime emulation will comply with IEEE 754. – Axel Gneiting Jun 23 '10 at 17:26
  • @Tim: I don't understand what you mean. I've already said that endianess could be a problem - but only if you access the bytes directly. – Axel Gneiting Jun 23 '10 at 17:27
  • Unless the IEEE specs say how floats must be stored in memory (and afaik they don't), endianness must be taken into account each time you want to store the number and retrieve it in a different way (or on a different machine after transmission); CPUs that have FP and can store/load floats to/from memory likely use the same endianness they use for any other datum. Moreover, it seems necessary if one wants to rewrite FP ops with bit-ops starting from FP numbers stored in FP registers, since I think few (if any) CPUs allow bitwise ops on FP registers – ShinTakezou Jun 23 '10 at 17:37
  • IEEE754 of course specifies how floats are stored in memory. And if you cast a float to a 32-bit integer, the bits will be in the same order on any platform, regardless of endianness. It is false to assume that this is "platform" dependent. E.g. this will work on ANY CPU: http://www.codemaestro.com/reviews/9 – Axel Gneiting Jun 24 '10 at 00:15
  • @Axel, casting a float to a 32-bit integer (by which you mean via a type-punned pointer, naturally) is *implementation defined behavior*. That is better than *undefined behavior* in that it is likely to have some meaning, but you must verify what meaning it will have with your choice of compiler and execution platform. – RBerteig Jun 24 '10 at 00:56
  • Yes it is implementation defined behavior, but *all implementations actually are the same*. The example I linked will work on *any* modern machine. Try it. – Axel Gneiting Jun 24 '10 at 01:19
  • @Axel Gneiting, if IEEE754 specified how floats are stored in memory, then either the FPU of modern Intel x86 processors would not be fully IEEE754 compliant, or the one on BE CPUs would not be (depending on whether the standard said numbers must be stored the BE or the LE "way"). [The code](http://www.codemaestro.com/reviews/9) works on any CPU it is compiled for, __of course__! But try to access the very same data in memory from a machine with a different endianness: it won't work! It works because all accesses are done on the same machine, which of course is coherent when writing and reading data! – ShinTakezou Jul 02 '10 at 10:26
  • I don't understand what you're getting at. Of course the byte order can be different, but that doesn't affect code compatibility as long as you access the `float` value through a cast to `long`. "Comparing floats using bitwise operators" is portable if you respect that. Endianness is an entirely different story and is a problem with integers too. – Axel Gneiting Jul 02 '10 at 12:05
0

Under the really rare circumstances where you need to do this (e.g., dealing with a foreign floating point format, such as VAX G-format data being manipulated on a PC), you normally do it by putting the floating point data into an integer of the same size, or else into an array of char (again, of the right size).

Don't expect such code to be anywhere close to portable or clean. In a typical case, you want to do as little of this as possible: just read in the raw data, create the closest native floating point value possible, and work with that. Even when you have to deal with such foreign data, doing things like comparisons in the foreign format should generally be avoided.

Jerry Coffin
  • 476,176
  • 80
  • 629
  • 1,111
0

There's a good description at the link below, which uses this trick to build a highly optimized sort function for floating point numbers.

http://www.stereopsis.com/radix.html

Obviously, using this kind of hack is not portable and inadvisable under most circumstances.

Daniel Stutzbach
  • 74,198
  • 17
  • 88
  • 77
0

If you read the IEEE specs about fp numbers carefully, you are only at the start. The next step is to implement in software what the hardware already does when the CPU has FP support; otherwise you have to implement FP operations (according to IEEE 754) from scratch in software. Both are perfectly doable (it is what one had to do before IEEE support was built into CPUs and FP coprocessors, whenever IEEE 754 behaviour was needed).

Let us suppose your target has no FP support at all, so you need to implement it from scratch. Then it is up to you to decide how to store the number in memory (you can agree with the endianness of your system or not); e.g. the float 1.23 is stored in memory as 0xA4709D3F on my machine (little-endian), while the "natural" reading order is 0x3F9D70A4 (our way of writing numbers resembles big-endian more than little-endian, though there is no single "right" way; this order does let you check the datum directly against the spec: if I write -1.23, I obtain 0xBF9D70A4, where it is clear that the sign bit is set to 1).

But since we're going to implement it from scratch we can write the number into memory this way:

unsigned char p[4];   /* 1.23f, stored big-endian by our own convention */
p[0] = 0x3f; p[1] = 0x9d; p[2] = 0x70; p[3] = 0xa4;

and then comes the hard part. For example:

bool is_positive(const unsigned char *p)
{
  return !(p[0] & 0x80); /* the sign bit is the top bit of the first byte */
}

We work in memory, supposing our processor cannot handle 32-bit (or wider) integers. Of course I've picked the simplest operation! The others are harder, but starting from the IEEE 754 description and doing some reasoning, you can implement what you want. As you see, it is not so easy. Somewhere you may find libraries that implement operations on FP numbers when there's no floating point unit, though I was not able to find any just now (one example could be the Amiga mathieeedoubbas.library, but I think you can't find the sources for it, and it is probably written directly in m68k asm; this is just to say that software implementations do exist).

ShinTakezou
  • 9,432
  • 1
  • 29
  • 39
0

Convert to an integral type; then you can use all the bitwise operations on it. Beware that the cast converts the numeric value (truncating toward zero), not the bit pattern, and that the conversion is temporary (a type conversion is like a function returning its value interpreted as a different type).

float real_value;
(int)real_value | 3;  /* this will work: the cast yields an int */
real_value | 3;       /* this will not compile: | is not defined for floats */
user712092
  • 1,988
  • 1
  • 14
  • 16