1

I was learning about the Karatsuba algorithm for fast integer multiplication and wondered: since computers already have dedicated multiplication hardware built into their CPUs, why would this algorithm be necessary?

Is it that large numbers are hard to multiply, but the algorithm breaks them down into simpler steps that are easier for the hardware to handle, because the hardware is good at multiplying smaller numbers?

Celeritas
  • 14,489
  • 36
  • 113
  • 194
  • The Karatsuba algorithm is still not fast enough for large inputs. You may rather learn the multiplication algorithm based on the Fast Fourier transform: https://en.wikipedia.org/wiki/Multiplication_algorithm#Fourier_transform_methods – Serge Rogatch Feb 03 '16 at 09:16

3 Answers

2
  1. All CPUs have a fixed bit width for their ALU/FPU.

    For example, on the i80x86 (PC) the ALU is limited to:

    i8086+   16 bit
    i80386+  32 bit
    x64 arch 64 bit
    

    allowing computation only on operands of up to 16/32/64 bits. The i80x87 FPU uses an 80-bit number representation which is converted from/to IEEE float (32-bit)/double (64-bit), limiting the precision.

  2. If you need to compute numbers with a bigger bit width than the HW limit,

    then you need to break them down into chunks computable on the ALU/FPU (handling the chunks as digits of a number) and combine the results into the final value. The ALU is designed with this in mind, which is why CPUs have a Carry flag and the ALU supports adding and subtracting with carry. Now if you are doing simple +/- then you just add/subtract all the digits from lowest (LSW) to highest (MSW), propagating the carry.

    Multiplication and division are more complex, and you need to use long algorithms (like the ones you use when computing on paper), which are usually O(n^2), where n is the number of "digits". One digit is usually an 8/16/32/64-bit number, or the biggest base-10^m number that fits in it. While you are computing small numbers (up to a few hundred bits) there is no gain from more advanced algorithms, because their overhead is too big. For bigger numbers the situation turns in their favor.

    Computing with big floating point numbers is tricky, and is often done faster with integer arithmetic on the ALU than in the FPU. But in some cases you can still benefit from the FPU if you break values into more variables, for example to enhance accuracy while summing/integrating.
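The carry-propagating addition and schoolbook O(n^2) multiplication described in this answer can be sketched in Python, simulating 32-bit "digits" (limbs). This is illustrative only — real bignum libraries do the same thing in tuned assembly using the CPU's carry flag:

```python
BASE = 1 << 32  # one limb holds a 32-bit unsigned value

def to_limbs(x):
    """Split a non-negative integer into 32-bit limbs, LSW first."""
    limbs = []
    while True:
        limbs.append(x % BASE)
        x //= BASE
        if x == 0:
            return limbs

def to_int(limbs):
    """Recombine limbs into a single integer (for checking)."""
    return sum(l << (32 * i) for i, l in enumerate(limbs))

def add_limbs(a, b):
    """Add two limb arrays, propagating the carry from LSW to MSW."""
    n = max(len(a), len(b))
    result, carry = [], 0
    for i in range(n):
        s = (a[i] if i < len(a) else 0) + (b[i] if i < len(b) else 0) + carry
        result.append(s % BASE)   # low 32 bits stay in this limb
        carry = s // BASE         # overflow becomes the next carry-in
    if carry:
        result.append(carry)
    return result

def mul_limbs(a, b):
    """Schoolbook O(n^2) multiply: every limb of a times every limb of b."""
    result = [0] * (len(a) + len(b))
    for i, ai in enumerate(a):
        carry = 0
        for j, bj in enumerate(b):
            s = result[i + j] + ai * bj + carry
            result[i + j] = s % BASE
            carry = s // BASE
        result[i + len(b)] += carry
    while len(result) > 1 and result[-1] == 0:  # trim leading zero limbs
        result.pop()
    return result
```

The inner products `ai * bj` fit in a double-width (64-bit) result, which is exactly what hardware `MUL` instructions give you, so each step of this sketch maps onto one HW operation.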

Community
  • 1
  • 1
Spektre
  • 49,595
  • 11
  • 110
  • 380
1

These sorts of algorithms tend to pay off only for multiple-precision numbers - which are actually of use for things like RSA. Regardless of whether they pay off or not, there is theoretical interest in working out the best algorithms for multiple precision arithmetic.

Of course, hardware needs to be designed as well, and people have sometimes used division algorithms which are not those taught to children beginning arithmetic. Wikipedia suggests https://en.wikipedia.org/wiki/Division_algorithm#SRT_division, which is not exactly high tech. There are a few examples of proposed or even actual use of Newton-Raphson iterations for division.
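To illustrate the Newton-Raphson approach mentioned above: the quotient a/b can be computed using only multiplication and subtraction by iterating toward the reciprocal of b, with each step roughly doubling the number of correct digits. A floating-point sketch (the 48/17 − 32/17·m initial estimate is the standard linear fit for 1/m on [0.5, 1)):

```python
import math

def nr_divide(a, b, iterations=6):
    """Compute a/b via Newton-Raphson on the reciprocal: x -> x*(2 - b*x)
    converges quadratically to 1/b. No division instruction is used."""
    assert b > 0
    # Scale b = m * 2**e with m in [0.5, 1), then seed with a linear fit.
    e = math.frexp(b)[1]
    m = b / 2.0**e
    x = (48.0 / 17.0 - (32.0 / 17.0) * m) / 2.0**e
    for _ in range(iterations):
        x = x * (2.0 - b * x)   # each step squares the relative error
    return a * x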

mcdowella
  • 19,301
  • 2
  • 19
  • 25
  • 1
    There has even been support for multiple precision arithmetic in recent instruction sets: [addition using two carry bits](https://en.wikipedia.org/wiki/Intel_ADX) and a multiplication to go with it, for one. For an entirely frustrating experience, try to have Java BigInteger square significantly faster than multiplying two numbers of the same value in a JVM using such support for multiplication, but not for squaring (which should take about two thirds of the time of a multiplication). – greybeard Feb 03 '16 at 06:46
  • What are "multiple precision numbers"? Are you referring to arbitrary-precision arithmetic? https://en.wikipedia.org/wiki/Arbitrary-precision_arithmetic – Celeritas Feb 03 '16 at 07:50
  • 1
    @Celeritas "multiple precision numbers" usually refers to numbers composed of base blocks, allowing segmented processing like `BYTE/WORD/DWORD/QWORD...`, and usually separable into low and high halves – Spektre Feb 03 '16 at 08:58
  • @Celeritas, yes that's what I'm referring to – mcdowella Feb 03 '16 at 19:05
1

Why are multiplication algorithms needed if hardware already does it?

Because hardware doesn't already do it. Hardware does at best 64- or 128-bit multiplication. The Karatsuba algorithm you mention doesn't begin to be useful until you have numbers many orders of magnitude larger.
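For scale, a minimal recursive Karatsuba sketch: each factor is split in half, and one of the four half-size multiplications is traded for a few additions, giving O(n^1.585) instead of the schoolbook O(n^2). The 2^64 cutoff below is just a stand-in for "small enough to hand to the hardware multiplier":

```python
def karatsuba(x, y):
    """Karatsuba multiplication: 3 half-size multiplies instead of 4."""
    if x < 2**64 or y < 2**64:      # small enough for a direct multiply
        return x * y
    n = max(x.bit_length(), y.bit_length()) // 2
    x_hi, x_lo = x >> n, x & ((1 << n) - 1)   # x = x_hi*2^n + x_lo
    y_hi, y_lo = y >> n, y & ((1 << n) - 1)
    a = karatsuba(x_hi, y_hi)                 # high * high
    b = karatsuba(x_lo, y_lo)                 # low * low
    c = karatsuba(x_hi + x_lo, y_hi + y_lo)   # (sum) * (sum)
    # middle term recovered without a fourth multiply: c - a - b
    return (a << (2 * n)) + ((c - a - b) << n) + b
```

In practice real libraries switch between schoolbook, Karatsuba, and FFT-based methods at tuned size thresholds, since each only wins above a certain operand size.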

user207421
  • 305,947
  • 44
  • 307
  • 483