I'm running a couple of threads in parallel, and I want to measure both the time it takes to execute one thread and the time it takes to execute the whole program. I'm using VC++ on Windows 7.

I tried to measure it while debugging, but then I saw this question: https://stackoverflow.com/questions/38971267/improving-performance-using-parallelism-in-c?noredirect=1#comment65299718_38971267. The answer given by Schnien says:

Debugging of multiple threads is somehow "special" - when your Debugger halts at a breakpoint, the other threads will not be stopped - they will go on 

Is this true? And if so, how else can I measure the time?

Thanks

mata

  • Use built-in Profiler or Concurrency Visualizer: https://msdn.microsoft.com/en-us/library/dd537632.aspx – Mars Aug 16 '16 at 14:59

2 Answers

That statement is indeed true: only the thread that hits a breakpoint will be paused.

However, to measure execution times you do not need the debugger at all. More information on measuring execution time can be found in the question below:

Measure execution time in C (on Windows)

What you want to do is measure the time inside each thread's function: take a timestamp at the beginning and at the end of the function and subtract the first from the second. You can do the same for the whole program. Call join on every thread to make sure they have all finished before taking the final timestamp.

adenzila

Use a simple timer class as a stopwatch, then capture the elapsed time within each thread. Also, creating system threads directly is slower than using std::async (see the measurements after the code). In addition, std::async can both return values and propagate exceptions to the caller, whereas an exception that escapes a std::thread function terminates the program unless it is caught within the thread.

#include <thread>
#include <iostream>
#include <atomic>
#include <chrono>
#include <future>

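// Stopwatch. elapsed() returns the time in seconds since construction or
// since the previous call to elapsed(), then restarts the measurement.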
class timer {
public:
    std::chrono::time_point<std::chrono::high_resolution_clock> lastTime;
    timer() : lastTime(std::chrono::high_resolution_clock::now()) {}
    inline double elapsed() {
        std::chrono::time_point<std::chrono::high_resolution_clock> thisTime=std::chrono::high_resolution_clock::now();
        double deltaTime = std::chrono::duration<double>(thisTime-lastTime).count();
        lastTime = thisTime;
        return deltaTime;
    }
};

// Globals are used here for clarity of exposition; generally avoid global variables.
const int count = 1000000;

double timerResult1;
double timerResult2;

void f1() {
    volatile int i = 0; // volatile keeps the optimizer from removing the loop
    timer stopwatch;
    while (i++ < count);
    timerResult1=stopwatch.elapsed();
}
void f2() {
    volatile int i = 0; // volatile keeps the optimizer from removing the loop
    timer stopwatch;
    while (i++ < count);
    timerResult2=stopwatch.elapsed();
}

int main()
{
    std::cout.precision(6); std::cout << std::fixed;
    f1(); std::cout << "f1 execution time " << timerResult1 << std::endl;
    timer stopwatch;
    {
        std::thread thread1(f1);
        std::thread thread2(f2);
        thread1.join();
        thread2.join();
    }
    double elapsed = stopwatch.elapsed();
    std::cout << "f1 with f2 execution time " << elapsed << std::endl;
    std::cout << "thread f1 execution time " << timerResult1 << std::endl;
    std::cout << "thread f2 execution time " << timerResult2 << std::endl;
    {
        stopwatch.elapsed();    // reset stopwatch
        auto future1 = std::async(std::launch::async, f1); // runs f1 asynchronously; the future's destructor blocks until it finishes
        auto future2 = std::async(std::launch::async, f2);
    }
    elapsed = stopwatch.elapsed();
    std::cout << "async f1 with f2 execution time " <<  elapsed << std::endl;
    std::cout << "async thread f1 execution time " << timerResult1 << std::endl;
    std::cout << "async thread f2 execution time " << timerResult2 << std::endl;
}

On my machine, creating threads adds about 0.3 ms per thread, whereas std::async adds only about 0.05 ms per thread because Visual C++ implements it with a thread pool.

f1 execution time 0.002076
f1 with f2 execution time 0.002791
thread f1 execution time 0.002018
thread f2 execution time 0.002035
async f1 with f2 execution time 0.002131
async thread f1 execution time 0.002028
async thread f2 execution time 0.002018

[EDIT] Had incorrect f calls in front of statements (cut-and-paste error).

doug