1

I'm trying to find a way to catch overflow/underflow accurately when multiplying two signed binary numbers (a and b) of n arbitrary bits in an efficient manner. I'm using 2's complement.

Is there a way to do so during execution of the booth's algorithm? If so, how?

Things that I considered but can't use:

  1. GCC's built-in overflow detection: as my use case deals with arbitrary bit lengths which may or may not be castable to base signed integer types.

  2. Post-multiplication check using division: I don't want to be using division afterwards to check result r with r / b == a. That would be too computationally expensive.

  3. Traditional multiplication by adding: Too computationally inefficient.


To make it a little clearer on how I've approached Booth's algo here the step-by-step on a couple of examples using n=8bits big-endian to keep things readable.

The 'booth' bit is added to the register on the right and an extra bit to handle the negative integer limit case is added on the left.

so the register structure is:

 x                     q
|.|.... ....|.... ....|.|

where x = extra bit and q the "booth" bit. So the total size ends up as 8+8+2 bits.

e.g.1:

a: 0000 0101 (5)
b: 1111 1101 (-3)

       m: 0|0000 0101|0000 0000|0 (a)
      -m: 0|1111 1011|0000 0000|0 (negated a)
register: 0|0000 0000|1111 1101|0 (initialized register with b)

step 1 (p+=-m): 0|1111 1011|1111 1101|0
step 1 (shift): 0|0111 1101|1111 1110|1
step 2 (p+=+m): 0|1000 0010|1111 1110|1
step 2 (shift): 0|0100 0001|0111 1111|0
step 3 (p+=-m): 1|0011 1100|0111 1111|0
step 3 (shift): 0|1001 1110|0011 1111|1
step 4 (shift): 0|0100 1111|0001 1111|1
step 5 (shift): 0|0010 0111|1000 1111|1
step 6 (shift): 0|0001 0011|1100 0111|1
step 7 (shift): 0|0000 1001|1110 0011|1
step 8 (shift): 0|0000 0100|1111 0001|1

result:         .|.... ....|1111 0001|. = -15 //Cool

e.g.2:

a: 0011 1111 (63)
b: 1111 1101 (-3)

       m: 0|0011 1111|0000 0000|0 (a)
      -m: 0|1100 0001|0000 0000|0 (negated a)
register: 0|0000 0000|1111|1101|0 (initialized register with b)

step 1 (p+=-m):  0|1100 0001|1111 1101|0
step 1 (shift):  0|0110 0000|1111 1110|1
step 2 (p+=+m):  0|1001 1111|1111 1110|1
step 2 (shift):  0|0100 1111|1111 1111|0
step 3 (p+=-m):  1|0001 0000|1111 1111|0
step 3 (shift):  0|1000 1000|0111 1111|1
step 4 (shift):  0|0100 0100|0011 1111|1
step 5 (shift):  0|0010 0010|0001 1111|1
step 6 (shift):  0|0001 0001|0000 1111|1
step 7 (shift):  0|0000 1000|1000 0111|1
step 8 (shift):  0|0000 0100|0100 0011|1

result:          .|.... ....|0100 0011|. = 35 //Wrong! underflow.

Thanks.

  • take a look at related QA: [How to deal with overflow and underflow?](https://stackoverflow.com/a/33006665/2521214). btw corectly coded multiplication (binary, naive `O(n^2)`, Karatsuba, ... ) never overflow ... so if yours does it mean you got a bug somewhere so show us your code... see: [Fast bignum square computation](https://stackoverflow.com/q/18465326/2521214). Or did you mean by overflow that the result does not fit to a fixed bit width limit ? Simple carry flag from the multiplication will tell you that ... – Spektre Apr 01 '19 at 07:48
  • @Spektre Technically as the register is at least `2`x`n` it will not overflow during the multiplication. My issue is that when the `n`-bit bound result is then extracted from the register it may not always be correct when it doesn't fit into `n` bits. If I understand you correctly; in regards to the carry flag, it wont do as I'm not using base type or casting to them. All binary representations are kept in their own datastructure and not in `int`, `long`, etc types as my use cases will exceed those type's sizes (i.e. 256+ bits) so that's not going to work (unless you meant something else). – Keyboard embossed forhead Apr 01 '19 at 09:12
  • You can mimic ALU Carry flag in C++ too (yes it is slightly slower but the benefits are much bigger in weight) see [Cant make value propagate through carry](https://stackoverflow.com/a/26603589/2521214) and then just stack up any bitwidth from your base ALU datatype ... – Spektre Apr 01 '19 at 13:49
  • If you have any kind of multiplication primitive available to you at all, then you shouldn't switch to such a painfully slow method just to detect overflow. I don't know what to suggest, though, because your other criteria are unclear. What kind of multiplication would you use if you didn't have to worry about overflow? – Matt Timmermans Apr 01 '19 at 17:15
  • @Keyboardembossedforhead just now I see you are using 2'os complement so if `x=1` and `7th` bit of result is `0` like: `1 ???????? 0??????? ?` you "underflowed" because sub result is bigger then 7bit and negative howecer your sample result is weird ... overflow is when any bit above `7th` is set but `x=0` ... **btw.** to comment user `nick` you need to add `@nick` to your comment that will notify the user (works only one user per cmment and (s)he must be present in the thread as author or commenter ...) – Spektre Apr 02 '19 at 07:49
  • I'm essentially treating a list of `n` bits as a custom signed integer type (think bitset). I need to make multiplication as efficient and portable (i.e. not CPU dependent) as I can whilst being able to throw std::overflow_error and std::underflow_error for when the multiplication leads to going over the min and max respectively and thus results in a wrong answer... This overflow detection is a hard constraint. – Keyboard embossed forhead Apr 02 '19 at 09:19
  • @Spektre Thanks for the ALU link but I'm a little lost in the code as the shorthand var names in the multiplication method mean little to me. On top of only been introduced to that low level stuff recently. – Keyboard embossed forhead Apr 02 '19 at 09:30
  • @Keyboardembossedforhead if you mean `void ALU32::mul(DWORD &ch,DWORD &cl,DWORD a,DWORD b)` in the `#ifdef _ALU32_no_asm` configuration it uses naive `O(n^2)` multiplication. first it decompose 32bit `a,b` operands into pairs of `16 bit` (so the multiplication sub result fits 32bit) and then it simply compute the equation: `a*b = (al+65536*ah)*(bl+65536*bh)` while make sure it still fits into 4 16+ bit output values `c0 .. c3` once you realized `65536^i` constants are just addressing `i=0->c0 i=1->c1 ...` and in the end the `c0...c3` are composed back into 2 32 bit output values `cl,ch` – Spektre Apr 02 '19 at 09:48
  • @Keyboardembossedforhead and if you mean `l,h` it means `low, high` ... part of a number. You can look at the multiplication in the same way as you multiply on the paper but instead `0..9` digits you got `0..65535` digit/values .... so inputs are both 2 digit and output is 4 digit ... Anyway the multiplication above is unsigned !!! so in order to use it first multiply `|a|*|b|` and set the sigh of result from the signs of the original inputs ... – Spektre Apr 02 '19 at 09:50

1 Answers1

1

-3 by 63 Multiplication using Booth's Algorithm

-3 by 63 Multiplication using Booth's Algorithm

63 by -3 Multiplication using Booth's Algorithm.
In this case, sign bit will be ignored.

63 by -3 Multiplication using Booth's Algorithm

Shariful Islam Mubin
  • 2,146
  • 14
  • 23