c++: simple multi-threading example not faster than single thread

Question

I wrote a very simple multithreading example in C++. Why do the multithreaded and single-threaded versions have approximately the same execution time?

CODE:

#include <iostream>
#include <thread>
#include <ctime>

using namespace std;

// adds up all integers from 0 up to (but not including) the given number
void task(int number)
{
    int s = 0;
    for(int i=0; i<number; i++){
        s = s + i;
    }
}

int main()
{

    int n = 100000000;

    ////////////////////////////
    // single processing      //
    ////////////////////////////

    clock_t begin = clock();

    task(n);
    task(n);
    task(n);
    task(n);

    clock_t end = clock();
    double elapsed_secs = double(end - begin) / CLOCKS_PER_SEC;
    cout  << "time single-threading: "<< elapsed_secs << " sec" << endl;    

    ////////////////////////////
    // multiprocessing        //
    ////////////////////////////

    begin = clock();

    thread t1 = thread(task, n);
    thread t2 = thread(task, n);
    thread t3 = thread(task, n);
    thread t4 = thread(task, n);

    t1.join();
    t2.join();
    t3.join();
    t4.join();

    end = clock();
    elapsed_secs = double(end - begin) / CLOCKS_PER_SEC;
    cout << "time multi-threading:  " << elapsed_secs << " sec" << endl;

}

For me, the output of the program is

time single-threading: 0.755919 sec 
time multi-threading:  0.746857 sec

I compile my code with

g++ cpp_tasksize.cpp -std=c++0x -pthread

I run it on a 24-core Linux machine.

Isn't it simply because the compiler trivially optimizes `task` function? As in `task` is equivalent to `void task(int number) {}`. — freakish, Jun 11 '17 at 11:09
No, I don't think so. I can add `cout << s << endl;` at the end of the `task` function. Then the sum is printed on each execution, but the time is still the same for multi/single processing. — Oliver Wilken, Jun 11 '17 at 11:13
@OliBlum There's a different issue with `cout`: AFAIK it is (somewhat) thread-safe, meaning that only one thread at a time can flush the buffer. Depending on the implementation, this might mean that other threads are blocked. And since the loop will be optimized away anyway and the thread spends most of its time on `cout`ing, you get similar times. Try using `std::this_thread::sleep_for` instead and see what happens. — freakish, Jun 11 '17 at 11:24
There is no guarantee that multiple threads will be faster than a single thread. If you are not computationally bound, it is, in fact, likely that the overhead of multiple threads will make your application slower. And if you have to synchronize the operation of multiple threads (e.g., because you're doing something like writing to the console), it is virtually *guaranteed* that multiple threads will show no improvement over a single thread. — Cody Gray - on strike, Jun 11 '17 at 11:36
Simple: creating the threads takes more time than actually running the code. When you want to make your program faster by using threads, create a thread pool when the program starts and reuse that pool. Thread creation is not cheap. — David Haim, Jun 11 '17 at 11:49
@DavidHaim Thread creation has some overhead but certainly not "_more time than to actually run the code_" in this case. [This answer](https://stackoverflow.com/a/26515727/7571258) suggests it's in the microsecond range. — zett42, Jun 11 '17 at 13:56

nglee · Accepted Answer · 2017-06-11T15:19:47.520

5

`clock()` measures processor time: the time your process spent executing on the CPU. In a multithreaded program it adds up the time that each thread spent on a CPU. Your single-threaded and multithreaded versions are reported to take about the same time because they perform the same number of calculations overall.

What you need to measure is wall-clock time. Use the `<chrono>` library for that.

#include <chrono>
#include <iostream>

int main ()
{
    auto start = std::chrono::high_resolution_clock::now();

    // code section

    auto end = std::chrono::high_resolution_clock::now();
    std::cout << std::chrono::duration<double, std::milli>(end - start).count() << " ms\n";
}

edited Jun 11 '17 at 15:19

answered Jun 11 '17 at 11:16


I did not quite understand why, but with chrono it works out. Is the `clock()` function that imprecise? A little explanation would be nice. – Oliver Wilken Jun 11 '17 at 11:30
@OliBlum Edited answer. Hope it helps. – nglee Jun 11 '17 at 11:47
@OliBlum `clock()` is supposed to accumulate the total CPU 'ticks' consumed by the process and will (say) advance 4 times faster than a wall clock with 4 threads running. It isn't 'elapsed' time. It's consumed CPU time. Refer to: http://en.cppreference.com/w/cpp/chrono/c/clock NB: There's a long-standing bug on MSVC (that they decline to fix) where it returns wall-clock ticks. But you already mentioned you're using Linux. – Persixty Jun 11 '17 at 12:14
