I'm trying to solve this task, but I don't really know how. We are supposed to find the smallest IEEE floating-point number "b" such that 1-b != 1.

I know that the number is supposed to look like this: 0|BIAS-1|111...1. But I don't know how to get there.

Edit: Thank you so much for your answers. I was finally able to solve it with the help of your explanations.

Goel77
  • Have you tried anything? – 500 - Internal Server Error Nov 02 '22 at 18:22
  • Are you supposed to find it by writing a program to find it or by figuring it out from the properties of the floating-point format that have been taught in class? – Eric Postpischil Nov 02 '22 at 23:02
  • @Goel77, in C: 1) find the difference between 1 and the next smaller value: `1.0 - nextafter(1,0)`. 2) Then take half this difference and find the next larger value: `nextafter((1.0 - nextafter(1,0))/2, 1)`. This might be the needed `b`. Or do this on paper with IEEE bits. – chux - Reinstate Monica Nov 02 '22 at 23:13
  • There's *lots* of ways of doing this. I can think of five right off the bat: (1) Read about [IEEE-754](https://en.wikipedia.org/wiki/IEEE_754) and construct the answer by hand. (2) Write a program using the `nextafter` function. (3) Fetch the bit pattern for 1.0, subtract 1, convert (or reinterpret) that bit pattern as floating-point again. (4) Write a program to find the answer empirically, by repeatedly dividing 1.0 by 2 (0.5, 0.25, 0.125, ...). (5) Read [this answer](https://stackoverflow.com/questions/71118404#74289666) I just posted this morning; you can find the answer in there! – Steve Summit Nov 02 '22 at 23:55
  • The way you wrote the question, the answer would be `-infinity`. But you are likely looking for the smallest *positive* number with that property. – chtz Nov 03 '22 at 07:34
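
The `nextafter` recipe from the comments can be sketched in Python as well. This is only an illustration under two assumptions: `math.nextafter` is available (Python 3.9 or later), and Python floats are IEEE 754 binary64, so the value found here is the double-precision `b`, not the single-precision one. The names `gap_below_one` and `half_gap` are made up for this sketch.

```python
import math

# Python floats are binary64 (double precision), so this finds the double b.
gap_below_one = 1.0 - math.nextafter(1.0, 0.0)  # spacing between 1.0 and the value just below it
half_gap = gap_below_one / 2                    # exact tie: 1.0 - half_gap rounds back to 1.0
b = math.nextafter(half_gap, 1.0)               # next representable value above the tie

print(b.hex())
print(1.0 - b != 1.0)          # True: b is large enough
print(1.0 - half_gap != 1.0)   # False: the tie rounds to even, i.e. back to 1.0
```

The same two steps apply to single precision; only the spacing below 1 changes.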

1 Answer

You'll need to understand rounding (round-to-nearest, ties-to-even is the default), the least significant bit (L), the guard bit (G), the round bit (R), and the sticky bit (S), as well as the internal format of a float.

                          LGRS 
  1.23456789012345678901234 123456789012345678901234
  1.00000000000000000000000000                       * 2^0  // 1
 -0.000000000000000000000000100000000000000000000001 * 2^0  // b
-------------------------------------------------------

==>

                          LGRS 
  1.23456789012345678901234 123456789012345678901234
  0.11111111111111111111111112                      * 2^0  // borrow
 -0.00000000000000000000000011                      * 2^0  // b truncated, with the sticky (S) bit set because the 1 at the far right was shifted out while aligning the binary point
-------------------------------------------------------
  0.11111111111111111111111101 ==>
  0.111111111111111111111111010                
                           LGRS
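                           ^ L = 1 here means the result is odd.

If GRS were 100 (an exact tie, which is what b = 2^-25 produces), round-to-nearest-even would round the result back up to 1, giving 1-b == 1, so that b is too small. The result must therefore be 1 less (which means 1 more in b), giving GRS = 011. Since we cannot directly control the sticky bit, we settle for 010, which is what we have: it is below the tie, so the result rounds down and 1-b != 1.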
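
The rounding decision can also be checked with exact rational arithmetic instead of tracking GRS bits by hand. The following is a minimal sketch, assuming binary32 and round-to-nearest, ties-to-even; the helper `rounds_to` and the constant names are invented for illustration, and the helper only handles values of b small enough that 1 - b lies between 1 and its lower neighbour.

```python
from fractions import Fraction

ONE = Fraction(1)
BELOW_ONE = ONE - Fraction(1, 2**24)   # largest binary32 value below 1.0
TIE = (ONE + BELOW_ONE) / 2            # midpoint between the two neighbours

def rounds_to(b):
    """Return the neighbour that the exact value 1 - b rounds to.

    Assumes 0 <= b <= 2^-24, so 1 - b lies between BELOW_ONE and ONE.
    """
    exact = ONE - b
    if exact > TIE:
        return ONE
    if exact < TIE:
        return BELOW_ONE
    # Exact tie: ties-to-even picks 1.0, whose last significand bit is 0.
    return ONE

b_tie = Fraction(1, 2**25)                        # 2^-25
b_next = Fraction(1, 2**25) + Fraction(1, 2**48)  # 2^-25 + 2^-48, the next float up

print(rounds_to(b_tie) == ONE)         # True: 1 - b rounds back to 1
print(rounds_to(b_next) == BELOW_ONE)  # True: 1 - b rounds down, so 1 - b != 1
```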


b = 2^-25 + 2^-48 =
1.00000000000000000000001 x 2^-25 (23 fraction bits; the smallest b)

x - 127 = -25
x = 127 - 25 =
102 =
01100110 (in binary) =


s exp      significand
0 01100110 00000000000000000000001 =  
0011 0011 0000 0000 0000 0000 0000 0001 =
0x33000001 (formatted as a float)
[0x33000000, which is exactly 2^-25, is too small]
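
As a cross-check of the encoding, the three fields can be packed into a 32-bit word and reinterpreted as a float. This is a small sketch using Python's standard `struct` module; the helper name `f32_from_fields` is hypothetical.

```python
import struct

def f32_from_fields(sign, biased_exp, fraction):
    """Assemble binary32 fields into a bit pattern and reinterpret it as a float."""
    bits = (sign << 31) | (biased_exp << 23) | fraction
    value = struct.unpack(">f", struct.pack(">I", bits))[0]
    return bits, value

# sign = 0, biased exponent = 127 - 25 = 102, fraction = 1 (lowest bit set)
bits, b = f32_from_fields(0, 102, 1)

print(hex(bits))                       # 0x33000001
print(b == 2.0**-25 * (1 + 2.0**-23))  # True
```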

In decimal (from a Python shell):
>>> print("%100.100f" % ((1.0 + 2**-23) * 2**-25))
0.0000000298023259404089913005009293556213378906250000000000000000000000000000000000000000000000000000
>>> print("%100.100f" % ((1.0) * 2**-25))
0.0000000298023223876953125000000000000000000000000000000000000000000000000000000000000000000000000000

Test:
Mac_3.2.57$cat smallFloat.c
#include <stdio.h>

int main(void){
    float a = 1.0;
    float B = 0.0000000298023259404089913005009293556213378906250000000000000000000000000000000000000000000000000000; // smallest
    float b = 0.0000000298023223876953125000000000000000000000000000000000000000000000000000000000000000000000000000; // too small
    printf("a - B = %100.100f\n", a - B);
    printf("a - b = %100.100f\n", a - b);

    return(0);
}
Mac_3.2.57$cc smallFloat.c
Mac_3.2.57$./a.out 
a - B = 0.9999999403953552246093750000000000000000000000000000000000000000000000000000000000000000000000000000
a - b = 1.0000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000
Mac_3.2.57$
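
The result can also be confirmed empirically without a C compiler. The sketch below emulates binary32 subtraction in Python by computing in double precision and rounding the result to single precision with `struct`, then bisects over positive float bit patterns. It relies on two assumptions: `struct` rounds to nearest (ties-to-even) when packing a float, and rounding a double-precision difference of two floats to single precision gives the same result as a native single-precision subtraction. All helper names are invented for this example.

```python
import struct

def to_f32(x):
    """Round a Python float (binary64) to the nearest binary32 value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]

def bits_to_f32(bits):
    """Reinterpret a 32-bit pattern as a binary32 value."""
    return struct.unpack("<f", struct.pack("<I", bits))[0]

def sub_changes_one(bits):
    """True if 1 - b != 1 in emulated binary32 arithmetic."""
    b = bits_to_f32(bits)
    return to_f32(1.0 - b) != 1.0

# Positive float bit patterns are ordered like the values they encode,
# and the predicate is monotonic in b, so a bisection finds the threshold.
lo, hi = 0x00000001, 0x3F800000   # smallest subnormal .. 1.0
while lo < hi:
    mid = (lo + hi) // 2
    if sub_changes_one(mid):
        hi = mid
    else:
        lo = mid + 1

print(hex(lo))   # 0x33000001
```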
Andrew