How is a floating-point value converted into an integer under the hood?

Question

First of all, I am quite unaware of the low-level machine representation of data (so please be kind if I misinterpret or misunderstand some things; advice and corrections are always welcome).

All the data is, obviously, presented as sequences of 0's and 1's.

Integer numbers are just plain bits of information that can be converted to any numeral system (from the binary one).

Floating-point numbers are, however, represented as sign + exponent + fraction (let's speak in terms of the IEEE 754 floating-point standard), which are still just the same old bits (0's and 1's) and can likewise be converted (differently, of course) to any numeral system.

How does it "magically" happen that when you do a simple cast (see the example below), you actually get the correct result?

double a = 5.12354e3; // 5123.54
int b = int(a); // 5123

What is the logic inside the computing machine that converts sign + exponent + fraction to sign + value? It does not seem to be just a "plain" cast (you had 4/8 bytes before - you get 4 bytes after), right?

P.S.: If I am just not getting a very basic, obvious thing, sorry. Anyway, please, explain.

There's no magic, the compiler inserts instructions that take the binary sequence A and turn it to the sequence B. The cast is an instruction to the virtual machine. — StoryTeller - Unslander Monica, May 05 '17 at 21:09
"I am quite unaware of a low-level machine representation of data" - then, maybe... just maybe... try to learn more about it anywhere online? Or in some book? I mean, this has been discussed thoroughly lots of times already. — ForceBru, May 05 '17 at 21:10
The CPU does some very clever stuff indeed. There are probably in-depth analyses of the exact conversion for typical `double`s and `int`s out there on the internet. This question might be too broad for this site. — Bathsheba, May 05 '17 at 21:10
@ForceBru "try to learn more about it anywhere online" - isn't that a such place? — andrgolubev, May 05 '17 at 21:13
Compile a test program and ask the compiler to output the assembly listing, or use one of the online code/assembly viewer web sites. — Richard Critten, May 05 '17 at 21:15
no it's not like casting a pointer, there are CPU level instructions to convert basic types. You can check this by casting a float pointer to an int pointer, it won't get converted properly if you do that. — Jason Lang, May 05 '17 at 21:15
@andrgolubev, well, this isn't a place where people write articles about how binary works or how compilers work with it. — ForceBru, May 05 '17 at 21:16
Can't answer, but I think there are some dedicated CPU instructions. If I google "x86 double to int instruction", for example, this link comes up: http://x86.renejeschke.de/html/file_module_x86_id_56.html. Looks like the CPU knows how to do that. — Jack, May 05 '17 at 21:19
@ForceBru well, one can provide a link to the paper if the question is sort of an off-topic or the answer is too big — andrgolubev, May 05 '17 at 21:19
@chux agree. It did compile with GCC in C++ for me (unsure about C, will stick to your expertise) — andrgolubev, May 05 '17 at 21:22
@andrgolubev, please check [this](http://stackoverflow.com/questions/11418952/conversion-between-float-and-int-byte-representation) and [this](http://stackoverflow.com/questions/12342926/casting-float-to-int-bitwise-in-c) SO questions. Google might also provide you with some interesting read on the topic. — ForceBru, May 05 '17 at 21:25
@andrgolubev - First you must understand IEEE 754. Suggest also reading IEEE 854 (Decimal Floating Point Standard). — Rick James, May 07 '17 at 04:20
When the compiler sees `5.1e3`, it will turn it into `5100`, which is exactly representable in int/float/double/etc. So, I don't know where `5123.54` came from?? — Rick James, May 07 '17 at 04:22
@RickJames funny thing that no one mentioned it before (probably not even noticed) — andrgolubev, May 07 '17 at 08:57
Neither `5.12354e3` nor `5123.54` can be represented _exactly_ in IEEE 754, whether single or double. And the single representation, when converted to double, will not equal the double representation. Etc. — Rick James, May 07 '17 at 15:55

score 3 · Answer 1 · answered May 05 '17 at 21:21

It's a simple matter of inserting a convert-double-to-integer instruction, which virtually every processor with hardware floating point provides.

double a = 5.12354e3; // 5123.54
MOVFD #5123.54, A(SP)
int b = int(a); // 5123
CVTDL  A(SP), B(SP)
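
For comparison, here is a minimal sketch of the same cast on a current machine. The function name `to_int` is made up for illustration. The instruction named in the comment is what GCC and Clang typically emit on x86-64; other compilers and targets will differ.

```cpp
// Hypothetical wrapper around the cast from the question.
int to_int(double a) {
    // On x86-64, GCC and Clang typically compile this to a single
    //     cvttsd2si eax, xmm0
    // (convert with truncation, scalar double to signed integer).
    // In C++ the behaviour is undefined if the truncated value
    // does not fit in int.
    return static_cast<int>(a);
}
```

Compiling this with assembly output enabled (for example `-S` with GCC or Clang) should show the conversion instruction your own target uses.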

As for the mechanics, all you have to do is insert a 1 bit in front of the mantissa (normalized floating-point values are stored with an implicit leading one), shift by the exponent so that the fraction bits fall away, and then correct for the sign.
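
The steps above can be sketched in software roughly as follows. This is only an illustration, not what the hardware literally runs: it assumes `double` uses the IEEE 754 binary64 layout (1 sign bit, 11 exponent bits, 52 fraction bits), and the name `truncate_by_hand` is hypothetical.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical helper: truncate a double toward zero by hand.
// Assumes IEEE 754 binary64 and a 32-bit int.
int truncate_by_hand(double d) {
    std::uint64_t bits;
    std::memcpy(&bits, &d, sizeof bits);            // view the 8 bytes as an integer

    const bool negative = (bits >> 63) != 0;                              // sign bit
    const int biased = static_cast<int>((bits >> 52) & 0x7FF);            // exponent field
    const std::uint64_t fraction = bits & ((std::uint64_t{1} << 52) - 1); // fraction field

    const int exponent = biased - 1023;             // remove the exponent bias
    if (exponent < 0) return 0;                     // |d| < 1 (covers zero and subnormals)
    if (exponent > 30) return 0;                    // does not fit in int (also NaN/infinity);
                                                    // a real cast is undefined here, this
                                                    // sketch just returns 0

    // Put the implicit leading 1 in front of the fraction: 1.fraction
    const std::uint64_t significand = (std::uint64_t{1} << 52) | fraction;

    // The significand has 52 bits after the binary point; keep `exponent` of them.
    const std::uint64_t magnitude = significand >> (52 - exponent);

    const int result = static_cast<int>(magnitude); // fits: magnitude < 2^31
    return negative ? -result : result;             // correct for the sign
}
```

For the value in the question, `truncate_by_hand(5.12354e3)` yields 5123: the unbiased exponent is 12, so 40 of the 52 fraction bits are shifted out.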

score 1 · Answer 2 · answered May 05 '17 at 23:07

You already did it for us. 5123.54 can be represented as 5.12354 * 10^3, right? We learned that in grade school. So if I want to convert that to fixed point, I take 5.12354, shift it left three places to get 5123.54, then chop off everything to the right of the decimal point and get 5123. No magic; that is how it happens in hardware. Given 1.11011 * 2^3, I shift it left three places to get 1110.11 and chop off the fraction, giving 1110.

The floating-point formats vary, but they are all still a sign and a mantissa times 2 to some power. The term is "floating point", but in the binary representation the point is at a known place, just as in scientific notation but more strictly: you have 1.something for (normalized) non-zero values, times 2 to some power. So if the power is negative, the value is below 1 in magnitude and the answer is zero. If the power is zero or positive, you figure out how many mantissa bits to chop off...
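
A small sketch of the "negative power means zero" rule, using the standard library's `std::ilogb` to read the power of two. The helper name `truncates_to_zero` is made up for illustration, and the value is assumed to be finite.

```cpp
#include <cmath>

// Hypothetical helper: true when truncating d toward zero gives 0.
// Assumes d is finite.
bool truncates_to_zero(double d) {
    if (d == 0.0) return true;   // zero has no meaningful exponent
    // std::ilogb returns the unbiased power of two, e.g. 12 for 5123.54
    // (2^12 = 4096 <= 5123.54 < 8192) and -1 for 0.75.
    return std::ilogb(d) < 0;    // negative power means |d| < 1
}
```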

So for example, take the 1.11011 * 2^3 above and say the mantissa of my floating-point format is the 6 bits 111011. I know where the binary point is and I know the power of two, so I am not really shifting left three; in this case I am shifting right two, giving 1110. It just depends on how you want to look at it.
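
The toy example above, written out as a sketch. The 6-bit format and the name `truncate_toy` are purely illustrative; a real format only differs in the field widths. Shift amounts are assumed to stay well below 32.

```cpp
#include <cstdint>

// Hypothetical toy format: `significand` holds `significand_bits` bits with
// the binary point after the first bit, e.g. 0b111011 means 1.11011.
unsigned truncate_toy(std::uint32_t significand, int significand_bits, int exponent) {
    if (exponent < 0) return 0;                      // value is below 1
    const int fraction_bits = significand_bits - 1;  // bits right of the binary point
    const int shift = fraction_bits - exponent;      // fraction bits left over after scaling
    return shift >= 0 ? significand >> shift         // chop off the remaining fraction
                      : significand << -shift;       // value is larger than the stored bits
}
```

With the numbers from the text, `truncate_toy(0b111011, 6, 3)` shifts right by 2 and gives binary 1110, which is 14.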

Then there is the sign. For a negative number stored as a two's-complement integer, you negate the truncated magnitude, which fills the upper bits of the result with ones (the sign extension).
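
A sketch of that last step, assuming a 32-bit two's-complement result. The name `apply_sign` is hypothetical, and the magnitude is assumed to be at most 2^31 - 1.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical helper: turn an unsigned magnitude plus a sign flag into
// a 32-bit two's-complement integer.
std::int32_t apply_sign(std::uint32_t magnitude, bool negative) {
    // Two's-complement negation: invert every bit, then add one.
    // For a non-zero magnitude this sets the upper bits to 1.
    const std::uint32_t bits = negative ? ~magnitude + 1u : magnitude;

    std::int32_t result;
    std::memcpy(&result, &bits, sizeof result);  // reinterpret the same 32 bits
    return result;
}
```

For example, `apply_sign(5123, true)` produces the bit pattern 0xFFFFEBFD, which reads as -5123.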

Again, floating-point formats vary in subtle ways, but at the end of the day all floating-point math (as well as fixed-point) is no different from what you learned in grade school; it is just simpler, since it is base two, not base ten...
