I have a C program which creates two threads (apart from main), T1 and T2. T1 executes a function which issues an operation O1 and T2 executes a function which issues an operation O2.
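#include <pthread.h>
#include <stdio.h>

int var;  /* shared: written by both threads, read by main after the joins */

void *f1(void *arg) {
    O1();
    var = 0;
    return NULL;
}

void *f2(void *arg) {
    O2();
    var = 1;
    return NULL;
}

int main(int argc, char **argv) {
    pthread_t t1, t2;
    pthread_create(&t1, NULL, f1, NULL);
    pthread_create(&t2, NULL, f2, NULL);
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);
    printf("var = %d\n", var);
    return 0;
}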
t1 and t2 each get assigned to different physical cores. The objective of this program is to check which operation was faster by inspecting the value of var after both threads have finished executing (the last write wins, so var identifies the thread that finished later). This requires that O1() and O2() start at exactly the same time, or within a tolerance of a few cycles, in parallel on both cores. How can I ensure this?
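As an aside on the var mechanism: a minimal sketch, assuming C11 atomics are available, of recording the finish order explicitly instead of relying on which plain write lands last. The names finish_counter and take_finish_ticket are hypothetical and not part of my program.

```c
#include <stdatomic.h>

/* Hypothetical shared counter: number of threads that have finished so far. */
static atomic_int finish_counter = 0;

/* Returns 0 to the first caller, 1 to the second, and so on.
   Each thread would call this right after its operation completes. */
static int take_finish_ticket(void) {
    return atomic_fetch_add(&finish_counter, 1);
}
```

With this, f1 would call take_finish_ticket() right after O1() and f2 right after O2(); the thread that receives 0 finished first. It does nothing for the start-time alignment, which is the actual question.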
Edit: Based on Peter Cordes' suggestion, I've modified f1() and f2() to read the timestamp for synchronized execution of O1() and O2().
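void *f1(void *arg) {
    t1 = rdtsc();
    /* spins until rdtsc() returns exactly 0 */
    while (t1 != 0) {
        t1 = rdtsc();
    }
    printf("t1 = %d\n", t1);
    O1();
    var = 0;
    return NULL;
}

void *f2(void *arg) {
    t2 = rdtsc();
    /* spins until rdtsc() returns exactly 0 */
    while (t2 != 0) {
        t2 = rdtsc();
    }
    printf("t2 = %d\n", t2);
    O2();
    var = 1;
    return NULL;
}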
However, t2 gets printed on the console much later than t1. I guess this suggests that rdtsc has wrapped around to 0 in f2(), so this approach doesn't result in a synchronized execution of O1() and O2(). Thread barriers didn't offer the granularity of synchronization I require.
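For comparison, here is a minimal sketch of a deadline-style spin, where both threads wait until the counter reaches a shared future value (>=) rather than waiting for it to equal 0. This is my reading of the timestamp idea, not necessarily what was suggested. The names now_ticks, struct racer, racer_main and run_pair are hypothetical, and the sketch assumes the TSC is synchronized across cores (on non-x86 builds it falls back to a monotonic nanosecond clock).

```c
#include <pthread.h>
#include <stdint.h>

#if defined(__x86_64__) || defined(__i386__)
#include <x86intrin.h>
/* Assumption: the TSC ticks at a constant rate and agrees across cores. */
static inline uint64_t now_ticks(void) { return __rdtsc(); }
#else
#include <time.h>
/* Fallback for non-x86 builds: monotonic nanoseconds instead of TSC ticks. */
static inline uint64_t now_ticks(void) {
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (uint64_t)ts.tv_sec * 1000000000ull + (uint64_t)ts.tv_nsec;
}
#endif

struct racer {
    uint64_t deadline;  /* shared start time, in now_ticks() units */
    uint64_t started;   /* first reading at or past the deadline */
};

static void *racer_main(void *arg) {
    struct racer *r = arg;
    uint64_t t;
    /* Spin until the counter reaches the agreed deadline (>=, not ==). */
    while ((t = now_ticks()) < r->deadline) { }
    r->started = t;
    /* O1() or O2() would be issued here. */
    return NULL;
}

/* Runs two threads against one deadline; returns 0 on success. */
static int run_pair(uint64_t delay_ticks, struct racer *a, struct racer *b) {
    pthread_t ta, tb;
    uint64_t deadline = now_ticks() + delay_ticks;
    a->deadline = deadline;
    b->deadline = deadline;
    if (pthread_create(&ta, NULL, racer_main, a) != 0) return -1;
    if (pthread_create(&tb, NULL, racer_main, b) != 0) {
        pthread_join(ta, NULL);
        return -1;
    }
    pthread_join(ta, NULL);
    pthread_join(tb, NULL);
    return 0;
}
```

After run_pair returns, started - deadline for each racer shows how far past the deadline that thread actually left the spin loop. The delay has to be long enough for both threads to be created and scheduled before the deadline; otherwise the late thread simply starts as soon as it runs.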