I have two applications that perform the same task: one is multithreaded and the other is completely sequential. I am trying to measure the speedup of the multithreaded application, but it reports a longer wall time (147.19 us) than the sequential one (41 us). Is there another way to profile wall time?

#include <chrono>
#include <ctime>
#include <iomanip>
#include <iostream>
#include <thread>

#include <sys/time.h>

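// Elapsed time from *tv1 to *tv2 in microseconds.
// Returns long so that intervals longer than about 35 minutes do not overflow an int.
long deltaTime(const struct timeval *tv1, const struct timeval *tv2){
    return (tv2->tv_sec - tv1->tv_sec) * 1000000L + (tv2->tv_usec - tv1->tv_usec);
}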

// Busy-wait until at least wall_time microseconds have elapsed.
void execute_for_wallTime(int wall_time)
{
    struct timeval tvStart, tvNow;
    gettimeofday(&tvStart, NULL);

    for (;;) {
        gettimeofday(&tvNow, NULL);
        if (deltaTime(&tvStart, &tvNow) >= wall_time) {
            return;
        }
    }
}

int sc_main(int argc, char* argv[])
{
   std::clock_t c_start = std::clock();
   auto t_start = std::chrono::high_resolution_clock::now();


   //multi-thread code 

   // std::thread t1(execute_for_wallTime, 10);
   // std::thread t2(execute_for_wallTime, 13);
   // std::thread t3(execute_for_wallTime, 16);
   // t1.join();
   // t2.join();
   // t3.join();

   //Sequential Code

   execute_for_wallTime(10);
   execute_for_wallTime(13);
   execute_for_wallTime(16);

   std::clock_t c_end = std::clock();
   auto t_end = std::chrono::high_resolution_clock::now();
   std::cout << std::fixed << std::setprecision(2) << "CPU time used: "
             << 1000.0 * (c_end-c_start) / CLOCKS_PER_SEC << " ms\n"
             << "Wall clock time passed: "
             << std::chrono::duration<double, std::micro>(t_end - t_start).count()
             << " us\n";

    return 0;
}
Uwe Keim
Zeeshan Hayat
  • 6
    Have you considered that your multithreaded implementation maybe actually *is* slower than your single-threaded one? – Baum mit Augen Jun 11 '18 at 11:14
  • 1
    What optimization levels are you using, what compiler and what OS if any? – Ron Jun 11 '18 at 11:15
  • 1
    Since you don't show the actual "code to run" it's impossible for us to say. You have to create a [Minimal, Complete, and Verifiable Example](http://stackoverflow.com/help/mcve) to show us. – Some programmer dude Jun 11 '18 at 11:16
  • @Someprogrammerdude Kindly check now; it is a minimal, complete example. – Zeeshan Hayat Jun 11 '18 at 11:33
  • @Ron OS: Linux, Compiler: GCC 6.4.0 – Zeeshan Hayat Jun 11 '18 at 11:39
  • @BaummitAugen This might be the case, but I am not sure how to check that. I tried to calculate the wall time in each separate thread, but it gives a completely different value each time I run the application. If you check the above code, you will see that it gives more time for sequential than parallel code. – Zeeshan Hayat Jun 11 '18 at 11:41
  • 1
    You're simply waiting too short a time in your function. Try multiplying the time to wait by 1000 and you will see a difference in the other (and expected) direction. The extra time you have now for the threaded version is the time to set up, get running, and tear down the threads. – Some programmer dude Jun 11 '18 at 11:52
  • @Someprogrammerdude it makes sense now. Now, I can see some speedup. Thanks – Zeeshan Hayat Jun 11 '18 at 12:31
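
The point made in the comments (thread setup and teardown dominate when each task lasts only a few microseconds) can be checked with a small sketch like the one below. It is not the asker's code: the names `spin_for`, `time_sequential_us`, and `time_threaded_us` are hypothetical, and it swaps `gettimeofday` for `std::chrono::steady_clock`, which is monotonic and therefore a common choice for measuring intervals.

```cpp
#include <chrono>
#include <thread>
#include <vector>

// Hypothetical helper: busy-wait until `wall` has elapsed on a monotonic clock.
void spin_for(std::chrono::microseconds wall) {
    const auto start = std::chrono::steady_clock::now();
    while (std::chrono::steady_clock::now() - start < wall) {
        // spin
    }
}

// Run the waits one after another; return the elapsed wall time in microseconds.
double time_sequential_us(const std::vector<int>& durations_us) {
    const auto t0 = std::chrono::steady_clock::now();
    for (int d : durations_us) {
        spin_for(std::chrono::microseconds(d));
    }
    const auto t1 = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::micro>(t1 - t0).count();
}

// Run each wait in its own thread; the measurement includes thread
// creation and join, which is the overhead discussed in the comments.
double time_threaded_us(const std::vector<int>& durations_us) {
    const auto t0 = std::chrono::steady_clock::now();
    std::vector<std::thread> workers;
    workers.reserve(durations_us.size());
    for (int d : durations_us) {
        workers.emplace_back(spin_for, std::chrono::microseconds(d));
    }
    for (auto& w : workers) {
        w.join();
    }
    const auto t1 = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::micro>(t1 - t0).count();
}
```

With the question's durations scaled by 1000, i.e. `{10000, 13000, 16000}`, the sequential call cannot return in under 39 ms, while the threaded call can approach the longest single wait (about 16 ms) when enough cores are free. With the original `{10, 13, 16}`, the cost of starting and joining three threads is likely to exceed the work itself, which would match the numbers reported in the question.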

0 Answers