0

Program utilizes threading to count to 1000000000. There are no mutex locks and such since every thread accesses the same global variable and simply increases it.

// This is just a function pointer used later. Just for increasing the 
// readability on following code.
typedef void (*fptr)(int);

void* count_to(void*);
double stopwatch(fptr, int);

// Number to sum to
const long long MAX_SUM = 1000000000;
// Global sum that threads accesses
long long sum = 0;

// Main function creates threads from 1 to 11 and count the time
// in between executions using `stopwatch` function. 
int
main() {
    for (int i = 1; i < 12; ++i) {
        printf("\n======================================\n");
        printf("Time it took when thread count is %d : %f",
                i, stopwatch(create_and_execute, i));
        printf("\n======================================\n");
    }
    return EXIT_SUCCESS;
}

// Takes a function pointer and thread count
double
stopwatch(fptr func, int count) {
    clock_t begin = clock();
    func(count);
    clock_t end = clock();
    return (double)(end - begin) / CLOCKS_PER_SEC;
}

void
create_and_execute(count) {
    pthread_t* threads = (pthread_t*) malloc(sizeof(pthread_t) * count);

    if (threads == NULL) {
        fprintf(stderr, "Out of memory\n");
        exit(EXIT_FAILURE);
    }

    for (int i = 0; i < count; ++i) {
        if (pthread_create(&threads[i], NULL, count_to, NULL) != 0){
                fprintf(stderr, "Can't create thread %d\n", i);
                exit(EXIT_FAILURE);
        }
    }

    for (int i = 0; i < count; i++) {
        pthread_join(threads[i], NULL);
    }
    printf("Threads finished their job. Sum is: %lld\n", sum );
    sum = 0;

    free(threads);
}
// counts to MAX_SUM
void*
count_to(void* i) {
    while(sum < MAX_SUM) {
        sum += 1;
    }
    pthread_exit(0);
}

Outputs

======================================
Threads finished their job. Sum is: 1000000000
Time it took when thread count is 1 : 2.044165
======================================

======================================
Threads finished their job. Sum is: 1000000000
Time it took when thread count is 2 : 5.137091
======================================

======================================
Threads finished their job. Sum is: 1000000001
Time it took when thread count is 3 : 7.673170
======================================

======================================
Threads finished their job. Sum is: 1000000001
Time it took when thread count is 4 : 20.567974
======================================

======================================
Threads finished their job. Sum is: 1000000000
Time it took when thread count is 5 : 25.316267
======================================

======================================
Threads finished their job. Sum is: 1000000000
Time it took when thread count is 6 : 14.618212
======================================

======================================
Threads finished their job. Sum is: 1000000000
Time it took when thread count is 7 : 34.998254
======================================

======================================
Threads finished their job. Sum is: 1000000002
Time it took when thread count is 8 : 40.366402
======================================

======================================
Threads finished their job. Sum is: 1000000000
Time it took when thread count is 9 : 38.578370
======================================

======================================
Threads finished their job. Sum is: 1000000000
Time it took when thread count is 10 : 30.201249
======================================

======================================
Threads finished their job. Sum is: 1000000000
Time it took when thread count is 11 : 42.017907
======================================

The problem is that, as you can see, increasing thread count just increases the execution time. With some exceptions. I was expecting it to be decrease the execution time greatly with increased thread count since threads work in parallel. I do understand that increasing the thread count after some thread count would just stay around the same as per law, as one can observe here too. But why is it that it increases instead of decreasing? Am I doing something wrong? Did I completely miss the point of threads? Any advice would be helpful.

To further illustrate my problem, my expected output would be, for example:

======================================
Threads finished their job. Sum is: 1000000000
Time it took when thread count is 1 : 2.044165
======================================

======================================
Threads finished their job. Sum is: 1000000000
Time it took when thread count is 2 : 1.137091
======================================

======================================
Threads finished their job. Sum is: 1000000001
Time it took when thread count is 3 : 1.073170
======================================

======================================
Threads finished their job. Sum is: 1000000001
Time it took when thread count is 4 : 1.007974
======================================

======================================
Threads finished their job. Sum is: 1000000000
Time it took when thread count is 5 : 0.316267
======================================

======================================
Threads finished their job. Sum is: 1000000000
Time it took when thread count is 6 : 0.618212
======================================

======================================
Threads finished their job. Sum is: 1000000000
Time it took when thread count is 7 : 0.998254
======================================

======================================
Threads finished their job. Sum is: 1000000002
Time it took when thread count is 8 : 0.366402
======================================

======================================
Threads finished their job. Sum is: 1000000000
Time it took when thread count is 9 : 0.578370
======================================

======================================
Threads finished their job. Sum is: 1000000000
Time it took when thread count is 10 : 0.201249
======================================

======================================
Threads finished their job. Sum is: 1000000000
Time it took when thread count is 11 : 0.017907
======================================

If you think something is unclear, let me know!

PS: I originally asked this in codereview but someone suggested it fits better here. So, here I am.

user4709059
  • 125
  • 1
  • 8
  • The problem with `sum += 1` is that thread X might read the `sum` and get the value 100. Then thread X gets preempted, and the other threads increase `sum` to 10000000. When thread X gets to run again, it adds 1 to 100, and writes 101 to `sum`. – user3386109 May 31 '18 at 22:28
  • Also note that Amdahl's law depends on the number of *processors*, not the number of *threads*. It does you no good to have more threads than processors. On the contrary, it actually wastes time to have more threads than processors. – user3386109 May 31 '18 at 22:34
  • 1
    without any thread synchronization/mutex protection, you have a data race and the output is not determined. To understand why, look at the assembly generated from `sum += 1;`. It will be a few instructions (even more with the loop). The scheduler can and will switch which thread is executing at any time, even in the midst of incrementing `sum`. You can see the results of this when your output is not 1000000000. – yano May 31 '18 at 23:03
  • You did completely miss the point of threads with this exercise. Threads work on the same memory space, and anytime they access the same memory, that access must be synchronized, or you introduce data races. If you synchronize access to `sum` here, then using threads will actually slow you down since they'll have to wait on each other,,, there are context switches involved. Simply incrementing a value is a serial task. To reap the benefits of threads, you need work that can truly operate in parallel. – yano May 31 '18 at 23:05
  • 1
    `clock()` measures the total CPU time consumed by all threads, not wallclock time. See the question I have linked this as a duplicate of. The other comments here referring to the unsynchronized access to `sum` are also correct. – caf Jun 01 '18 at 00:30

0 Answers0