
I am trying to create a multithreading library in C. Here is the link to the whole project (pasting all the code here would be too much text).

In the file tests/MultithreadingTests.c I am testing the functionality of lib/systems/multithreading/src/ThreadPool.c. The function add_work adds any routine function to the work queue, which uses lib/sds/lists/src/Queue.c and lib/sds/lists/src/LinkedList.c. In MultithreadingTests.c, NUM_TESTS defines the number of jobs I add to the work queue to be performed by NUM_THREADS threads.

I am facing a weird issue with the code. If NUM_TESTS is less than 349,261, the code works perfectly fine, but any number greater than or equal to 349,261 results in a segmentation fault. I tried to find where exactly the segmentation fault happens and found that it is in lib/sds/lists/src/Node.c at line 29, at memcpy(node->data, data, size);

The call chain leading to the error is:

  • tests/MultiThreadingTests.c line 95 at pool->add_work(pool, new_thread_job(routine, &arguments[i]));
  • lib/systems/multithreading/src/ThreadPool.c line 150 thread_pool->work.push(&thread_pool->work, &job, sizeof(job));
  • lib/sds/lists/src/Queue.c line 54 return q->list.insert(&q->list, q->list.length, data, size);
  • lib/sds/lists/src/LinkedLists.c line 107 Node *node_to_insert = new_node(data, size);
  • lib/sds/lists/src/Node.c line 29 memcpy(node->data, data, size);

I am not sure why this issue happens only when the number of jobs is greater than or equal to 349,261 but not when it is smaller.

Rusty
  • Probably `malloc()` is [being lazy](https://stackoverflow.com/questions/712683/what-is-lazy-allocation) and returning something other than `NULL` (you are checking `malloc()`'s return value, right?) even when there is no memory. – pmg Mar 06 '22 at 14:49
  • Run your code through valgrind. If you're mismanaging memory it will tell you where. – dbush Mar 06 '22 at 14:53
  • I advise you to check error codes first. Without that, do not be surprised to see silent weird bugs. There might be a race condition somewhere. Is the bug deterministic? – Jérôme Richard Mar 06 '22 at 14:57
  • @JérômeRichard The bug is deterministic. Also it doesn't occur while adding the 349,261st job in the work queue but while adding the very first job. – Rusty Mar 06 '22 at 15:21
  • I updated one typo in the question. Issue is happening when `number of jobs` is higher than or equal to 349,261 and not `number of threads`. – Rusty Mar 06 '22 at 15:30
  • @dbush I just installed valgrind and tried to run it through valgrind. When I run the executable directly with `./build/testsmultithreading_tests` it produces a segmentation fault, but running it through valgrind with `valgrind ./build/testsmultithreading_tests` makes the program work perfectly without any errors. – Rusty Mar 06 '22 at 15:35

1 Answer


In function new_thread_pool(), you neither

  • test for allocation failure in thread_pool.pool = malloc(sizeof(pthread_t) * num_threads); nor
  • test for thread creation failure in pthread_create(&thread_pool.pool[i], NULL, generic_thread_function, &thread_pool);

Trying to create 349261 or more threads on any system looks more like a stress test than a real-life use case. Test for errors and report them in a usable way.

new_node does not check for allocation failure either. Unless you instrument your code for this, you should use a wrapper around malloc() calls to detect allocation failure and abort the program with an error message.

The issue in your code is in the function mt_test_add_work(): you define an array of arguments with automatic storage:

Arguments arguments[NUM_TESTS];

This object is allocated on the stack, using 8382264 bytes of stack space. This is too much for your system and causes undefined behavior down the call chain, where further stack usage actually causes a segmentation fault: a typical case of Stack Overflow.

You should allocate this object from the heap and free it before exiting the function:

Arguments *arguments = malloc(sizeof(*arguments) * NUM_TESTS);
chqrlie
  • Fair suggestion on the wrappers. Let me implement those. But I checked both of those places and it doesn't seem to be the issue. Also, 349261 is not the number of threads but the number of jobs, in case that caused any confusion. The number of threads is only 10 (NUM_THREADS). – Rusty Mar 06 '22 at 15:14
  • @Rusty: You write *If `NUM_THREADS` any number is less than 349,261* which does cause confusion :) – chqrlie Mar 06 '22 at 15:26
  • My bad. Correcting it. – Rusty Mar 06 '22 at 15:27
  • OK I found the problem. You are in the right place on stackoverflow :) – chqrlie Mar 06 '22 at 16:05