
I was wondering whether the length of a double variable has an impact on multiplication time. For testing purposes, I wrote the following small piece of code:

#include <iostream>
#include <time.h>

int main() {
    double x = 123.456;
    double a = 1.23456;
    double b = 1.234;
    double d = 1.0;

    // experiment 1
    clock_t start = clock();
    for( unsigned long long int i = 0; i < 10000000; ++i ) {
        x *= a;
    }
    clock_t end = clock();
    std::cout << "123.456*1.23456 takes " << (double)(end-start)/CLOCKS_PER_SEC << " secs" << std::endl;

    // experiment 2
    start = clock();
    for( unsigned long long int i = 0; i < 10000000; ++i ) {
        x *= b;
    }
    end = clock();
    std::cout << "123.456*1.234 takes " << (double)(end-start)/CLOCKS_PER_SEC << " secs" << std::endl;

    // experiment 3
    start = clock();
    for( unsigned long long int i = 0; i < 10000000; ++i ) {
        x *= d;
    }
    end = clock();
    std::cout << "123.456*1.0 takes " << (double)(end-start)/CLOCKS_PER_SEC << " secs" << std::endl;

    return 0;
}

I compiled it using VS2008, 64-bit, in release mode, without optimization and without debug information. The result was not surprising: all three kinds of multiplication take exactly the same time, with a difference of just a few milliseconds. My question is: why is that so? If I make a mistake and multiply a number by 1.0 instead of 1, and I do not use any compiler optimization, then my multiplication will take much longer than multiplying a number by 1! When humans multiply, the shorter the number, the faster we arrive at the result. How does a computer multiply so that it does not matter how long the two numbers are?

Apart from that, I decided to check whether debugging influences runtime speed. In this case, it does not: whether I compile with the /DEBUG option or without it, the multiplication always takes exactly the same amount of time.

With optimization enabled (/O2), the same multiplication takes only a thousandth of a second. What does the optimization do in this case? How can such compact code for multiplying two doubles in C++ be optimized?

I would be grateful for any explanation of what happens during double multiplication in C++.

Pavlo Dyban
  • What do you mean by "length"? The size of a double variable is always the same. – PlasmaHH Mar 01 '12 at 16:08
  • C++? I think you should take a look at what happens in the CPU. – Karoly Horvath Mar 01 '12 at 16:08
  • possible duplicate of [How computer does floating point arithmetic?](http://stackoverflow.com/questions/6033184/how-computer-does-floating-point-arithmetic) Have a look at the answers with more upvotes than the accepted answer and at the recommended material. – pmr Mar 01 '12 at 16:09
  • A double doesn't even have a "length" (unless you count size as length) – harold Mar 01 '12 at 17:08
  • [What Every Computer Scientist Should Know About Floating-Point Arithmetic](http://docs.oracle.com/cd/E19957-01/806-3568/ncg_goldberg.html) – AShelly Mar 01 '12 at 20:51
  • @PlasmaHH By length, I mean the number of digits after the decimal point. – Pavlo Dyban Mar 02 '12 at 07:42
  • @PavloDyban: Are you aware that the number of decimal digits does not really correlate with the number of bits set in a double's representation? – PlasmaHH Mar 02 '12 at 09:02
  • @PlasmaHH Yes, I knew that. I thought that there might exist an algorithm at the CPU level which would omit trailing zeros in the number's binary representation. – Pavlo Dyban Mar 02 '12 at 14:25

4 Answers


The variables always have the same length; only the values differ. In other words, the operations to perform at the hardware level are exactly the same, and so is the running time. For example, with an integer, multiplying by 0 (zero out the result, i.e. transfer 0s into the destination register) takes the same time as multiplying by 1 (copy the operand into the destination register).
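A minimal sketch (my addition, not from the answer) that makes this visible: every double, whether it looks "short" or "long" in decimal, occupies exactly the same 64-bit IEEE 754 pattern.

#include <cstdint>
#include <cstring>
#include <iomanip>
#include <iostream>

// Print the raw 64-bit IEEE 754 pattern of a double. Every value,
// "short" or "long" in decimal, fills exactly the same 64 bits.
void print_bits(double value) {
    std::uint64_t bits;
    std::memcpy(&bits, &value, sizeof bits);
    std::cout << value << " -> 0x"
              << std::hex << std::setfill('0') << std::setw(16) << bits
              << std::dec << std::setfill(' ') << std::endl;
}

int main() {
    print_bits(1.0);       // 0x3ff0000000000000
    print_bits(1.234);     // uses all 64 bits just the same
    print_bits(1.23456);
    return 0;
}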

Matthias

Floats (single precision) are 32 bits and doubles are 64 bits.

http://en.wikipedia.org/wiki/IEEE_754-2008

On an Intel/AMD processor, the FPU (x87) or the SIMD unit (SSEx) will calculate the multiplication in a constant number of cycles. The speed depends on throughput, the underlying operations, and latency; a rough way to check this is sketched after the instruction list below.

http://www.agner.org/optimize/instruction_tables.pdf

  • FMUL - Multiply, FPU x87
  • MULSS - Multiply Scalar Single-precision, SSEx
  • MULSD - Multiply Scalar Double-precision, SSEx
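A rough sketch of such a check (my addition, in the spirit of the question's own timing code; `time_multiply` is a hypothetical helper). For normal, non-denormal operands the cycle count should not depend on the value being multiplied:

#include <iostream>
#include <time.h>

// Hypothetical helper: time 10M multiplications by a given factor.
// volatile keeps the compiler from folding or removing the loop.
double time_multiply(double factor) {
    volatile double x = 123.456;
    clock_t start = clock();
    for (unsigned long long int i = 0; i < 10000000; ++i) {
        x = x * factor;
    }
    clock_t end = clock();
    return (double)(end - start) / CLOCKS_PER_SEC;
}

int main() {
    std::cout << "*= 1.23456 : " << time_multiply(1.23456) << " secs" << std::endl;
    std::cout << "*= 1.234   : " << time_multiply(1.234)   << " secs" << std::endl;
    std::cout << "*= 1.0     : " << time_multiply(1.0)     << " secs" << std::endl;
    return 0;
}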
Louis Ricci

With optimization enabled (/O2), the same multiplication takes only a thousandth of a second. What does the optimization do in this case?

Since you never use your result (`x`), it's completely valid to eliminate all the multiplications. Try displaying the results.
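For example (a sketch of what that suggestion amounts to, not code from the answer): printing `x` after the loop gives the optimizer an observable use of the result, so it can no longer delete the multiplications wholesale.

#include <iostream>
#include <time.h>

int main() {
    double x = 123.456;
    double a = 1.23456;

    clock_t start = clock();
    for (unsigned long long int i = 0; i < 10000000; ++i) {
        x *= a;
    }
    clock_t end = clock();

    // Using x below makes the loop observable; without this line,
    // /O2 is free to remove the whole loop as dead code.
    std::cout << "result: " << x << std::endl;
    std::cout << "took " << (double)(end - start) / CLOCKS_PER_SEC << " secs" << std::endl;
    return 0;
}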

Also note that you're doing 10M multiplications; a modern processor gives you at least 1G clock cycles per second, and in this case it is executing a very tight loop.

Karoly Horvath
  • I added output of `x` after each for loop, and the results changed: both double multiplications take around 1.06 secs, whereas the multiplication by `1.0` is optimized and fully taken away. What do you mean by a tight loop? – Pavlo Dyban Mar 02 '12 at 07:51

When humans multiply, the shorter the number, the faster we arrive at the result. How does a computer multiply so that it does not matter how long the two numbers are?

Computers used to work the same way humans do: calculate one digit at a time and add up the partial results to get the answer. However, the amount of hardware that can be packed into a single chip has advanced to the point where it's possible to have a circuit dedicated to each digit, so all the digits are calculated at the same time, in parallel. Of course it's all in binary, but the principle is the same.
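A toy sketch (my addition) of the sequential version of this idea, using integers for simplicity: each set bit of one operand contributes one shifted copy of the other. A hardware multiplier computes all of these partial products at once and sums them with a tree of adders instead of looping over them.

#include <cstdint>
#include <iostream>

// Software "long multiplication" in binary: one partial product per set
// bit, added sequentially. Hardware generates all partial products in
// parallel and reduces them with an adder tree, so the operand values
// do not change how long the multiplication takes.
std::uint64_t shift_and_add(std::uint32_t a, std::uint32_t b) {
    std::uint64_t result = 0;
    for (int bit = 0; bit < 32; ++bit) {
        if ((b >> bit) & 1u) {
            result += static_cast<std::uint64_t>(a) << bit;
        }
    }
    return result;
}

int main() {
    std::cout << shift_and_add(123456, 1000) << std::endl;  // 123456000
    std::cout << 123456ull * 1000ull << std::endl;          // same result
    return 0;
}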

jcoder