I'm trying to understand the algorithm for floating-point addition. In the past I've only had to do this on paper, and I did it by converting the operands to decimal and back again. I'm now writing a floating-point ALU in an HDL, so that approach won't work. I've read many of the questions on this topic (the most useful of which I've borrowed for the example below) and many articles, but some concepts still elude me. I've written the questions in context below, but here they are in summary:
- When is the implicit bit in the mantissa 0, and when is it 1?
- After the addition, how do we algorithmically check for normalization, and then determine which way to shift?
- If one of the numbers is negative, is the subtraction of the mantissas performed in two's complement or not?
Borrowing from this example:
00001000111100110110010010011100 (1.46487e-33)
00000000000011000111111010000100 (1.14741e-39)
First, split them into their components (sign, exponent, mantissa):
0 00010001 11100110110010010011100
0 00000000 00011000111111010000100
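The split above can be sketched in Python (not HDL, purely as an illustration). The function name `split_fields` is made up for this sketch; the field widths of 1, 8 and 23 bits are the ones used in the example.

```python
def split_fields(bits):
    """Split a 32-bit pattern, given as a string of 0s and 1s, into its fields."""
    word = int(bits, 2)
    sign = (word >> 31) & 0x1        # top bit
    exponent = (word >> 23) & 0xFF   # next 8 bits (raw, still biased)
    fraction = word & 0x7FFFFF       # low 23 bits (stored mantissa, no integer bit)
    return sign, exponent, fraction

a = split_fields("00001000111100110110010010011100")
b = split_fields("00000000000011000111111010000100")
print(a[0], format(a[1], "08b"), format(a[2], "023b"))
print(b[0], format(b[1], "08b"), format(b[2], "023b"))
```

In an HDL this would just be three bit slices of the input word; the shifts and masks here play the same role.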
Next, tack on their implicit integer bit:
0 00010001 1.11100110110010010011100
0 00000000 0.00011000111111010000100
Question 1: Is the integer bit in front of the second value zero because its exponent field is zero?
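For reference, here is a minimal sketch of this step, assuming IEEE 754 single precision. Under that assumption an exponent field of zero marks a zero or subnormal value, whose integer bit is 0 and whose exponent is treated as if the field were 1; any other finite exponent field gets an integer bit of 1. The name `significand` is hypothetical, and an exponent field of 255 (infinity/NaN) is deliberately not handled.

```python
HIDDEN_BIT = 1 << 23  # position of the integer bit in a 24-bit significand

def significand(exponent_field, fraction):
    """Return (effective exponent, 24-bit significand) for a finite value."""
    if exponent_field == 0:
        # Zero or subnormal: integer bit is 0, exponent behaves like field value 1.
        return 1, fraction
    # Normal number: integer bit is 1.
    return exponent_field, HIDDEN_BIT | fraction

exp_a, sig_a = significand(0b00010001, int("11100110110010010011100", 2))
exp_b, sig_b = significand(0b00000000, int("00011000111111010000100", 2))
print(exp_a, format(sig_a, "024b"))
print(exp_b, format(sig_b, "024b"))
```

Note the consequence for the exponent difference: with this convention the second operand's effective exponent is 1 rather than 0.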
Next, subtract the lesser exponent from the greater and shift the mantissa of the number with the lesser exponent right by that amount:
00010001
- 00000000
___________
00010001 = 17
0.00000000000000000000110
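The alignment step can be sketched as follows. The helper name `align` is made up for illustration, and the bits shifted out are simply discarded here, whereas a real adder would normally keep extra guard/round/sticky bits for rounding. The first call uses the raw exponent fields exactly as in the working above; the second assumes the IEEE 754 convention that a zero exponent field behaves like an exponent of 1, which would change the shift amount.

```python
def align(exp_a, sig_a, exp_b, sig_b):
    """Shift the significand with the smaller exponent right by the difference.

    Returns (common exponent, aligned sig_a, aligned sig_b, shift amount).
    """
    if exp_a >= exp_b:
        shift = exp_a - exp_b
        return exp_a, sig_a, sig_b >> shift, shift
    shift = exp_b - exp_a
    return exp_b, sig_a >> shift, sig_b, shift

sig_a = int("111100110110010010011100", 2)  # 1.11100110110010010011100
sig_b = int("000011000111111010000100", 2)  # 0.00011000111111010000100

# Raw exponent fields, as in the example: shift by 17.
print(align(17, sig_a, 0, sig_b)[2:])   # (6, 17), i.e. 0b110
# Assumed IEEE 754 handling of a zero exponent field: shift by 16.
print(align(17, sig_a, 1, sig_b)[2:])   # (12, 16), i.e. 0b1100
```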
Add the mantissas:
0.00000000000000000000110
+ 1.11100110110010010011100
______________________________
1.11100110110010010100010
Question 2: In this case the MSB is 1, so the value is normalized and we can drop it. Suppose it weren't: if the MSB were 0, would that still be considered a normalized value, or would we shift left to get a 1 in that place?
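One common way to frame the normalization check, sketched under the assumption of a 24-bit significand with the integer bit at position 23: a carry out of the addition (value of 2^24 or more) calls for a right shift and an exponent increment, while a leading zero (possible after a subtraction) calls for left shifts and exponent decrements. The function name `normalize` is hypothetical, rounding of bits lost in the right shift is omitted, and stopping at exponent 1 (leaving a subnormal) is an assumption based on IEEE 754.

```python
def normalize(exponent, sig):
    """Bring a significand into [2**23, 2**24) where the exponent allows it."""
    if sig == 0:
        return 0, 0                      # exact zero
    while sig >= (1 << 24):              # carry out: shift right, bump exponent
        sig >>= 1
        exponent += 1
    while sig < (1 << 23) and exponent > 1:
        sig <<= 1                        # leading zero: shift left
        exponent -= 1
    # If sig is still below 2**23 here, the result is subnormal.
    return exponent, sig

total = int("111100110110010010011100", 2) + 0b110
print(normalize(17, total) == (17, total))  # MSB already 1, nothing to do
print(normalize(17, 2 * total))             # carry case: exponent becomes 18
print(normalize(30, 1 << 21))               # leading zeros: two left shifts
```

In hardware the left-shift loop would typically be a leading-zero counter feeding a barrel shifter rather than an iterative loop.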
Question 3: Suppose one of the numbers were negative: is the subtraction performed in two's complement, or is it enough to simply subtract the mantissas as they are?
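To make the question concrete, here is a sketch of one possible approach that stays in sign-magnitude form: compare the aligned magnitudes, subtract the smaller from the larger, and take the sign of the larger. This is only one option (converting to two's complement, adding, and converting back should give the same magnitude), and the name `add_signed` is made up for the sketch. Returning +0 for an exact cancellation assumes round-to-nearest behaviour.

```python
def add_signed(sign_a, sig_a, sign_b, sig_b):
    """Add two aligned sign-magnitude significands; return (sign, magnitude)."""
    if sign_a == sign_b:
        return sign_a, sig_a + sig_b     # same sign: plain addition
    if sig_a == sig_b:
        return 0, 0                      # exact cancellation: +0 (assumed)
    if sig_a > sig_b:
        return sign_a, sig_a - sig_b     # larger magnitude decides the sign
    return sign_b, sig_b - sig_a

print(add_signed(0, 10, 1, 3))   # (0, 7)
print(add_signed(0, 3, 1, 10))   # (1, 7)
print(add_signed(1, 5, 1, 5))    # (1, 10)
```

The magnitude that comes out of a subtraction may have leading zeros, so it would still need the normalization step afterwards.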