
I'm trying to measure thread context-switch overhead. I have two threads, a shared variable, a mutex, and two condition variables. The two threads take turns writing either a 1 or a 0 to the shared variable.

I'm assuming the pthread_cond_wait(&cond, &mutex) wait time is approximately equal to 2 x thread context switch time. My reasoning: when thread1 has to wait on a condition variable, it gives up the mutex to thread2. There is a context switch to thread2, thread2 performs its task and signals the condition variable to wake thread1, there is a context switch back to thread1, and thread1 reacquires the lock.

Is my assumption correct?

My code is below:

#include <stdio.h>
#include <time.h>
#include <pthread.h>


int var = 0;

int setToZero = 1;

int count = 5000;

pthread_mutex_t mutex = PTHREAD_MUTEX_INITIALIZER;

pthread_cond_t isZero = PTHREAD_COND_INITIALIZER;

pthread_cond_t isOne = PTHREAD_COND_INITIALIZER;


struct timespec firstStart; 

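/* Returns timeA - timeB in nanoseconds; timeA must not be earlier than timeB. */
unsigned long long timespecDiff(struct timespec *timeA_p, struct timespec *timeB_p)
{
  /* Do the arithmetic in unsigned long long so tv_sec * 1e9 cannot overflow a 32-bit long. */
  return ((unsigned long long)timeA_p->tv_sec * 1000000000ULL + timeA_p->tv_nsec) -
         ((unsigned long long)timeB_p->tv_sec * 1000000000ULL + timeB_p->tv_nsec);
}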

void* thread1(void* param)
{ 

  int rc;
  struct timespec previousStart;
  struct timespec start; //start timestamp
  struct timespec stop; //stop timestamp
  unsigned long long result;
  int idx = 0;
  unsigned long long measurements[count]; //same type as result, so long waits are not truncated
   clock_gettime(CLOCK_MONOTONIC, &stop);

   result = timespecDiff(&stop,&firstStart);

   printf("first context-switch time:%llu\n", result);

  clock_gettime(CLOCK_MONOTONIC, &previousStart);

  while(count > 0){

  //acquire lock
  rc = pthread_mutex_lock(&mutex);

  clock_gettime(CLOCK_MONOTONIC,&start);

  while(setToZero){
    pthread_cond_wait(&isOne,&mutex); // use condition variables so the threads don't busy wait inside local cache
  }

  clock_gettime(CLOCK_MONOTONIC,&stop);


   var = 0;

   count--;

   setToZero = 1;

   //printf("in thread1\n");

   pthread_cond_signal(&isZero);
    //end of critical section
   rc = pthread_mutex_unlock(&mutex); //release lock

    result = timespecDiff(&stop,&start);

    measurements[idx] = result;

    idx++;
 }

 result = 0;

 int i = 0;
while(i < idx)
 {
   result += measurements[i++];
 }

 result = result /(2*idx);

 printf("thread1 result: %llu\n",result);

 return NULL; //a void* thread function must return a value
}


void* thread2(void* param)
{
  int rc;
  struct timespec previousStart;
  struct timespec start; //start timestamp
  struct timespec stop; //stop timestamp
  unsigned long long result;
  int idx = 0;
  unsigned long long measurements[count]; //same type as result, so long waits are not truncated

  while(count > 0){

  //acquire lock
  rc = pthread_mutex_lock(&mutex);

  clock_gettime(CLOCK_MONOTONIC,&start);

  while(!setToZero){
    pthread_cond_wait(&isZero,&mutex);
  }

  clock_gettime(CLOCK_MONOTONIC,&stop);

   var = 1;

   count--;

   setToZero = 0;

   //printf("in thread2\n");

   pthread_cond_signal(&isOne);
    //end of critical section
   rc = pthread_mutex_unlock(&mutex); //release lock

   result = timespecDiff(&stop,&start);

   measurements[idx] = result;

   idx++;
  }

 result = 0;

 int i = 0;
while(i < idx)
 {
   result += measurements[i++];
 }

 result = result /(2*idx);

 printf("thread2 result: %llu\n",result);

 return NULL; //a void* thread function must return a value
}

int main(){
  pthread_t threads[2];

  pthread_attr_t attr;

  pthread_attr_init(&attr);

  clock_gettime(CLOCK_MONOTONIC,&firstStart);

  pthread_create(&threads[0],&attr,thread1,NULL);

  pthread_create(&threads[1],&attr,thread2,NULL);

  printf("waiting...\n");

  pthread_join(threads[0],NULL);

  pthread_join(threads[1],NULL);

  pthread_cond_destroy(&isOne);

  pthread_cond_destroy(&isZero);

  pthread_mutex_destroy(&mutex);

  pthread_attr_destroy(&attr);

  return 0;
}

I get the following times:

first context-switch time:144240
thread1 result: 3660
thread2 result: 3770
JJTO
  • Unless you only have a single CPU core, the threads may not need to be context switched *at all* in this case. – EOF Feb 15 '17 at 08:01
  • I'm using a single core machine – JJTO Feb 15 '17 at 08:02
  • And then there is the fact that other processes/threads are demanding CPU time. A context switch is the loading/unloading of state variables etc., and I don't think you can measure it; you might even suffer from the observer effect :) The OS is responsible for context switching, and in user space you have no control over it. It all happens while your process is sleeping. – Selçuk Cihan Feb 15 '17 at 08:21
  • @selcuk Cihan: I don't think that's necessarily true. There are other SO posts on this topic: http://stackoverflow.com/questions/304752/how-to-estimate-the-thread-context-switching-overhead?rq=1 – JJTO Feb 15 '17 at 18:31

1 Answer


You say:

I'm assuming the pthread_cond_wait(&cond, &mutex) wait time is approximately equal to 2 x thread context switch time.

This is not a valid assumption. Once the mutex is released, the kernel is notified and then has to wake the other thread. It may not do that immediately, for example if other threads are waiting to run. The mutex, as its name suggests, guarantees when things will not happen. It makes no guarantees about when they will.

You can't expect to reliably measure context switches from within a process, and certainly not with a POSIX API, because none of them promises to do that.

  • On Linux you can count context switches for a process or thread using /proc/[pid]/status.

  • On Windows this information is available from the Performance Monitor API.

Whether either of these will get you towards your goal I don't know. I suspect what you really want to know is how much using a multithreaded system affects performance, but that'll require you to measure the performance of the application as a whole.
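As a rough illustration of the Linux option above, here is a minimal sketch that pulls the two counters out of a status file. The helper name read_ctxt_switches is made up for this example, and it assumes the file contains the lines voluntary_ctxt_switches and nonvoluntary_ctxt_switches, as current Linux kernels provide. As far as I know the counters belong to the task the path names, so for a thread other than the main one you would probably pass its /proc/[pid]/task/[tid]/status path instead.

```c
#include <stdio.h>

/* Hypothetical helper: reads the context-switch counters from a
 * /proc status file such as "/proc/self/status" (Linux only).
 * Returns 0 on success, -1 if the file or either field is missing. */
int read_ctxt_switches(const char *path,
                       unsigned long *voluntary,
                       unsigned long *nonvoluntary)
{
    FILE *fp = fopen(path, "r");
    if (fp == NULL)
        return -1;

    char line[256];
    int found = 0; /* bit 0: voluntary seen, bit 1: nonvoluntary seen */

    while (fgets(line, sizeof line, fp) != NULL) {
        /* The space in each format also matches the tab the kernel writes. */
        if (sscanf(line, "voluntary_ctxt_switches: %lu", voluntary) == 1)
            found |= 1;
        else if (sscanf(line, "nonvoluntary_ctxt_switches: %lu", nonvoluntary) == 1)
            found |= 2;
    }

    fclose(fp);
    return found == 3 ? 0 : -1;
}
```

Calling it once before and once after the ping-pong loop and subtracting gives the number of switches the kernel actually performed, which can then be compared with the number of iterations.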

Ben
  • There are easier APIs to use for counting context switches: see EnableThreadProfiling on Windows and getrusage on Linux/BSD. – rdb Feb 22 '22 at 16:55
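A minimal sketch of the getrusage route mentioned in the comment above, assuming a Linux or BSD system where the ru_nvcsw and ru_nivcsw fields are filled in. The struct ctx_counts type and the helpers get_ctx_counts and ctx_delta are names invented for this example. RUSAGE_SELF covers the whole process, so the numbers include every thread.

```c
#include <sys/resource.h>

/* Hypothetical snapshot type: context-switch counters for the process. */
struct ctx_counts {
    long voluntary;   /* ru_nvcsw: thread gave up the CPU, e.g. by blocking */
    long involuntary; /* ru_nivcsw: thread was preempted by the scheduler */
};

/* Fills *out with the current counters. Returns 0 on success, -1 on error. */
int get_ctx_counts(struct ctx_counts *out)
{
    struct rusage ru;

    if (getrusage(RUSAGE_SELF, &ru) != 0)
        return -1;

    out->voluntary = ru.ru_nvcsw;
    out->involuntary = ru.ru_nivcsw;
    return 0;
}

/* Total number of context switches that happened between two snapshots. */
long ctx_delta(const struct ctx_counts *before, const struct ctx_counts *after)
{
    return (after->voluntary - before->voluntary) +
           (after->involuntary - before->involuntary);
}
```

Taking one snapshot before pthread_create and one after pthread_join, then dividing the measured elapsed time by ctx_delta, gives an average cost per switch that does not rely on guessing how many switches each pthread_cond_wait caused.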