Suppose I write,
int a = 111;
int b = 509;
int c = a * b;
So what is the time complexity of computing 'a * b'? How is the multiplication operation executed?
Compiling this function:
int f(int a, int b) {
return a * b;
}
with gcc -O3 -march=native -m64 -fomit-frame-pointer -S
gives me the following assembly:
f:
movl %ecx, %eax
imull %edx, %eax
ret
The first instruction (movl) loads the first argument, the second instruction (imull) loads the second argument and multiplies it with the first, and then the result gets returned.
The actual multiplication is done with imull, which, depending on your CPU type, will take a certain number of CPU cycles.
If you look at Agner Fog's instruction timing tables, you can see how much time each instruction will take. On most x86 processors it seems to be a small constant; however, the imul instruction on the AMD K8 with a 64-bit argument and result shows as 4-5 CPU cycles. I don't know whether that's a measurement issue or genuinely variable time, though.
Also note that there are other factors involved than just the execution time. The integer has to be moved through the processor and get into the right place to get multiplied. All of this and other factors contribute to latency, which is also noted in Agner Fog's tables. There are other issues, such as cache effects, which also make life more difficult; it's not that easy to simply say how fast something will run without running it.
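If you want to put a very rough number on that yourself, the only reliable way is to measure. The sketch below is just an illustration under some assumptions (an x86 machine and a GCC/Clang-style compiler that exposes __rdtsc() through <x86intrin.h>); the figure it prints includes loop overhead and is affected by frequency scaling and out-of-order execution, so treat it as a ballpark, not a specification.

#include <stdio.h>
#include <stdint.h>
#include <x86intrin.h>   /* __rdtsc(); assumes GCC/Clang on x86 */

/* Time a chain of dependent multiplies with the timestamp counter. */
int main(void)
{
    volatile unsigned int a = 111, b = 509;
    unsigned int c = 1;                /* unsigned, so overflow just wraps */
    enum { N = 100000000 };

    uint64_t start = __rdtsc();
    for (int i = 0; i < N; i++)
        c = c * a + b;                 /* dependent chain: multiplies serialize */
    uint64_t end = __rdtsc();

    printf("c = %u, ~%.2f cycles per iteration\n",
           c, (double)(end - start) / N);
    return 0;
}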
x86 isn't the only architecture, and it's actually not inconceivable that there are CPUs and architectures out there that have non-constant-time multiplication. This is especially important for cryptography, where algorithms using multiplication might be susceptible to timing attacks on those platforms.
Multiplication itself on most common architectures will be constant time. The time to load registers may vary depending on the location of the variables (L1, L2, RAM, etc.), but the number of cycles the operation takes will be constant. This is in contrast to operations like sqrt that may require additional cycles to achieve a certain precision.
You can get instruction costs for AMD, Intel, and VIA here: http://www.agner.org/optimize/instruction_tables.pdf
By time complexity, I presume you mean whether it depends on the number of digits in a and b, i.e. whether the number of CPU clock cycles would vary depending on whether you multiplied, say, 2*3 or 111*509. I think yes, they would vary, and it would depend on how that architecture implements the multiplication operation and how the intermediate results are stored. Although there can be many ways to do this, one simple/primitive way is to implement multiplication using the binary adder/subtractor circuit: multiplying a*b is adding a to itself b times using n-digit binary adders. Similarly, dividing a/b is subtracting b from a repeatedly until what is left is smaller than b, although this will take more space to store the quotient and remainder.
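Purely as a toy illustration of that primitive scheme (not how a hardware multiplier in a modern CPU actually works), here is a small C sketch; the function names are invented for this example, and the running time visibly grows with the value of b, which is exactly the non-constant behaviour described above.

#include <stdint.h>

/* Toy model of "multiplication is adding a to itself b times":
   the loop runs b times, so the work depends on the value of b. */
static uint32_t mul_by_repeated_addition(uint32_t a, uint32_t b)
{
    uint32_t result = 0;
    while (b-- > 0)
        result += a;            /* one n-bit addition per step */
    return result;
}

/* Toy model of "division is repeated subtraction": subtract b from a
   until what is left is smaller than b, counting the steps. */
static uint32_t div_by_repeated_subtraction(uint32_t a, uint32_t b,
                                            uint32_t *remainder)
{
    uint32_t quotient = 0;
    while (a >= b) {            /* caller must ensure b != 0 */
        a -= b;
        quotient++;
    }
    *remainder = a;
    return quotient;
}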
void myfun()
{
int a = 111;
int b = 509;
int c = a * b;
}
Disassembled part:
movl $111, -4(%ebp)
movl $509, -8(%ebp)
movl -4(%ebp), %eax
imull -8(%ebp), %eax
So as you can see, it all comes down to the imull instruction, specifically the fetch, decode, and execute cycle of the CPU.
In your example, the compiler would do the multiplication at compile time, and your code would effectively become
int c = 56499;
If you changed your example to look like
int c = a * 509;
then the compiler MIGHT decide to rewrite your code like
int c = a * ( 512 - 2 - 1 );
int c = (a << 9) - (a << 1) - a;
I said might because the compiler will compare the cost of the shift sequence to the cost of a multiply instruction and pick the better option. Given a fast multiply instruction, that usually means only rewrites needing one or maybe two shifts will come out faster.
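As a quick sanity check of the arithmetic behind that rewrite (my own test harness, not anything the compiler emits), note that 509 = 512 - 2 - 1, so a * 509 can be computed with shifts and subtractions for non-negative a:

#include <assert.h>
#include <stdint.h>

/* a * 509 rewritten as (a << 9) - (a << 1) - a, since 509 = 512 - 2 - 1.
   Restricted to non-negative a because left-shifting negative values
   is undefined behaviour in C. */
static int32_t mul509_shifts(int32_t a)
{
    return (a << 9) - (a << 1) - a;
}

int main(void)
{
    for (int32_t a = 0; a <= 100000; a++)
        assert(mul509_shifts(a) == a * 509);
    return 0;
}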
If your numbers are too large to fit in an integer (32 bits), then the arbitrary-precision math routines use somewhere between O(n^2) and O(n log n) time, where n is the number of 32-bit parts needed to hold the numbers.
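For the O(n^2) end of that range, a schoolbook routine looks roughly like the sketch below. This is a simplified illustration of what an arbitrary-precision library does internally, not any particular library's code; the function name and limb layout are made up for the example. Real libraries such as GMP switch to Karatsuba, Toom-Cook, and FFT-based methods for larger operands, which is how they approach the faster end of that range.

#include <stddef.h>
#include <stdint.h>

/* Schoolbook multiplication of little-endian arrays of 32-bit "limbs":
   every limb of a is multiplied by every limb of b, which is where the
   O(n^2) behaviour comes from. result must have room for na + nb limbs. */
static void bigmul_schoolbook(const uint32_t *a, size_t na,
                              const uint32_t *b, size_t nb,
                              uint32_t *result)
{
    for (size_t i = 0; i < na + nb; i++)
        result[i] = 0;

    for (size_t i = 0; i < na; i++) {
        uint64_t carry = 0;
        for (size_t j = 0; j < nb; j++) {
            /* 32x32 -> 64-bit partial product, plus what is already
               stored at this position, plus the running carry */
            uint64_t t = (uint64_t)a[i] * b[j] + result[i + j] + carry;
            result[i + j] = (uint32_t)t;
            carry = t >> 32;
        }
        result[i + nb] = (uint32_t)carry;   /* top limb of this row */
    }
}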