
I have learnt how to convert numbers between binary, octal and hexadecimal, and I know how to convert numbers to floating point.

However, while looking through a worksheet I have been given, I have encountered the following question:

Using 32-bit IEEE 754 single precision floating point show the representation of -12.13 in Hexadecimal.

I have tried looking at the resources I have and still can't figure out how to answer the above. The answer given is 0xc142147b.

Edit: Sorry for not clarifying but I wanted to know how to get this done by hand instead of coding it.

  • Are you supposed to do the calculations by hand, or using a particular programming language? – Mark Dickinson Mar 01 '19 at 17:24
  • @MarkDickinson By hand, sorry for not clarifying in the question. –  Mar 01 '19 at 17:32
  • Perhaps my answer about 5.2 can help: https://stackoverflow.com/questions/6910115/how-to-represent-float-number-in-memory-in-c/6911412#6911412 Not exactly -12.13, but it can be done in a similar way. Just don't forget the sign bit. – Rudy Velthuis Mar 01 '19 at 18:16
  • @RudyVelthuis Why would 1/13 be relevant here? Did you mean `0.13`? – Mark Dickinson Mar 01 '19 at 18:33
  • @Mark: Arrgh! yes, wrong, totally irrelevant. I'll remove the comment, as it is terribly misleading. Thanks. Don't know what I was thinking. – Rudy Velthuis Mar 01 '19 at 18:34

2 Answers


-12.13 must be converted to binary and then hex. Let's do that more or less like the glibc library does it, using just pen and paper and the Windows calculator.

Remove the sign, but remember we had one: 12.13

Significand (or mantissa)

The integer part, 12 is easy: C (hex)

The fractional part, 0.13 is a little trickier. 0.13 is 13/100. I use the Windows calculator (Programmer mode, hex) and shift 13 (hex D) by 32(*) bits to the left: D00000000. Divide that by 100 (hex 64) to get: 2147AE14 hex.

Since we need a value below 1, we shift right by 32 bits again, and get: 0.2147AE14

Now add the integer part on the left: C.2147AE14

We only need 24 bits for the mantissa, so we round: C.2147B --> C2147B

Now this must be normalized, so the binary point is moved 3 bits to the left (but the bits remain the same, of course). The exponent (originally 0) is raised accordingly, by 3, so now it is 3.

The hidden bit can now be removed: 42147B (now the 23 low bits)

This can be turned into a 32 bit value for now: 0x0042147B

Exponent and sign

Now let's take on the exponent: 3 + bias of hex 7F = hex 82, or 1000 0010 binary.

Add the sign bit on the left: 1 1000 0010. Regrouped into nibbles: 1100 0001 0, i.e. hex C1 followed by a single 0 bit (the top bit of the next nibble).

Of course these are top bits, so we turn that into 0xC1000000 for the full 32 bits

"Bitwise-Or" both parts

0xC1000000 | 0x0042147B = 0xC142147B

And that is the value you want.


(*)32 bits so I have more than enough bits to be able to round properly, later on.

Rudy Velthuis
  • Thank you so much, I understand how to do it now. However, may I know if there is any way to derive the significand without using any sort of calculator? We've been told that calculators are not permitted during assessments. –  Mar 01 '19 at 23:10
  • Without a calculator, it is a little more work, but doable: D00000000 / 64 = dec. 13 * 65536 * 65536 / 100. Then turn that into hex for the rest of the calculations. Perhaps you can do with 24 bits instead of 32: 13 * 256 * 65536 / 100, convert to hex (repeatedly divide by 16 and write down the remainders from right to left) and then only shift right by 24 bits. Then you only get `0.2147AE --> C.2147AE --> C2147B --> 42147B` (see answer). – Rudy Velthuis Mar 01 '19 at 23:48

To code a floating point number, we must rewrite it as (-1)^s × 2^e × 1.m and code the different parts in 32 bits as follows:

[Figure: bit layout of the 32-bit single-precision format, from https://en.wikipedia.org/wiki/Single-precision_floating-point_format]

  • First bit is the sign s: 0 for + and 1 for -

  • 8 following bits are the shifted exponent e+127

  • 23 last bits are the fractional part of the mantissa (m)

The hard part is to convert the mantissa to binary. For some numbers, it is easy. For instance, 5.75 = 4 + 1 + 1/2 + 1/4 = 2^2 + 2^0 + 2^-1 + 2^-2 = 101.11 (binary) = 1.0111 (binary) × 2^2

For other numbers (such as yours), it is harder. The solution is to multiply the number by two repeatedly until we find an integer or we exceed the total number of bits in the code (23+1).

We can do that for your number:

 12.13 =       12.13 2^-0
       =       24.26 2^-1
       =       48.52 2^-2
       =       97.04 2^-3
       =      194.08 2^-4
       =      388.16 2^-5
       =      776.32 2^-6
       =     1552.64 2^-7
       =     3105.28 2^-8
       =     6210.56 2^-9
       =    12421.12 2^-10
       =    24842.24 2^-11
       =    49684.48 2^-12
       =    99368.96 2^-13
       =   198737.92 2^-14
       =   397475.84 2^-15
       =   794951.69 2^-16
       =  1589903.38 2^-17
       =  3179806.75 2^-18
       =  6359613.50 2^-19
       = 12719227.00 2^-20

The next iteration would give a number larger than 2^24 (≈16M), so we can stop.

The mantissa is easy (but a bit long) to convert to binary by hand using the usual methods, and its code is 0xc2147b. If we extract the leading 1 bit, at position 2^23, and put it to the left of the "dot", we have mantissa = 1.42147b × 2^23 (where the fractional part is limited to 23 bits). As we had to multiply the initial number by 2^20 to get this value, we finally have

mant = 1.42147b × 2^3

So exponent is 3 and its code is 3+127=130

exp=130d=0x82

and as number is negative

sign=1

We just have to suppress the integer part of the mantissa (the hidden bit) and concatenate these numbers to get the final value of 0xc142147b

(Of course, I used a program to generate these numbers. If interested, here is the C code)

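#include <stdio.h>
int main(void) {
  float f = -12.13f;
  int sign = (f < 0.0f);
  float fmantissa = sign ? -f : f;  // absolute value of f
  int e = 0;                        // number of doublings done so far
  printf("%2.2f = %11.2f 2^-%d\n", f, fmantissa, e);
  // double until the value needs all 24 significand bits (>= 2^23);
  // using < rather than <= keeps exact powers of two within 24 bits
  while (fmantissa < (1 << 23)) {
    e++; fmantissa *= 2.0f;
    printf("       = %11.2f 2^-%d\n", fmantissa, e);
  }

  // convert to integer (assumes a 32-bit unsigned int)
  unsigned mantissa = (unsigned)fmantissa;
  // and suppress hidden bit in mantissa
  mantissa &= ~(1u << 23);

  // coded exponent
  unsigned exp = 127 - e + 23;

  printf("sign: %d exponent: %u mantissa: 1.%x\n", sign, exp, mantissa);
  // final code, built with unsigned shifts so that bit 31 is well defined
  unsigned fltcode = ((unsigned)sign << 31) | (exp << 23) | mantissa;

  printf("0x%x\n", fltcode);
  return 0;
}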
Alain Merigot
  • I used nothing but Windows calculator and pen and paper. Works too. – Rudy Velthuis Mar 01 '19 at 21:05
  • Of course it works. Actually, I wanted to show OP what could be an algorithm to compute it entirely by hand (even if it is easier with a computer). By the time I had written the program to generate the numbers, I saw your answer, but both are complementary, I think. – Alain Merigot Mar 01 '19 at 21:16
  • FWIW, the 1.42147B part is wrong. That would be 1.0100 0010 0001 0100 0111 1011, but the previous bit pattern was 1100 0010 0001 etc... The bit pattern doesn't change, just the placement of the binary point. So the temporary intermediate should be 1.1000 0100 0010 etc. = 1.842 etc... I left out that bit, because it is confusing. The 24 bits remain the same, after all. – Rudy Velthuis Mar 01 '19 at 23:20
  • You are right, and I used this as a shortcut. It seemed clear to me, but it may be confusing. I have added a clarification to the answer. Thanks for spotting it. – Alain Merigot Mar 01 '19 at 23:55