3

I have this code below from a book, compiled on Ubuntu 18.4, with Eclipse -O0 -g3

#include <pthread.h>
#include <stdio.h>

void *myThread( void *arg )
{
  printf( "Thread %d started\n", (int)arg );

  pthread_exit( arg );
}

#define MAX_THREADS 5

int main()
{
  volatile int  i;
  int status;
  int ret;
  pthread_t threadIds[MAX_THREADS];

  for (i = 0 ; i < MAX_THREADS ; i++) {
    ret = pthread_create( &threadIds[i], NULL, myThread, (void *)i );
    if (ret != 0) {
      printf( "Error creating thread %d\n", (int)threadIds[i] );
    }
  }

  for (i = 0 ; i < MAX_THREADS ; i++) {
    printf("-----------  i = %d\n",i);
    ret = pthread_join( threadIds[i], (void **)&status );
    if (ret != 0) {
      printf( "Error joining thread %d\n", (int)threadIds[i] );
    } else {
      printf( "Status = %d\n", status );
    }
  }

  return 0;
}

This code worked correctly and produced the output

-----------  i = 0
Thread 3 started
Thread 0 started
Status = 0
-----------  i = 1
Thread 2 started
Thread 1 started
Status = 1
-----------  i = 2
Status = 2
-----------  i = 3
Status = 3
-----------  i = 4
Thread 4 started
Status = 4

However, if volatile keyword at variable i declaration is ignored, the following output is produced:

Thread 1 started
Thread 0 started
Thread 2 started
Thread 3 started
-----------  i = 0
Status = 0
-----------  i = 1
Status = 1
-----------  i = 1
Error joining thread -947054848
-----------  i = 2
Status = 2
-----------  i = 1
Error joining thread -947054848
-----------  i = 2
Error joining thread -955447552
-----------  i = 3
Status = 3
-----------  i = 1
Error joining thread -947054848
-----------  i = 2
Error joining thread -955447552
-----------  i = 3
Error joining thread -963840256
-----------  i = 4
Thread 4 started
Status = 4
-----------  i = 1
Error joining thread -947054848
-----------  i = 2
Error joining thread -955447552
-----------  i = 3
Error joining thread -963840256
-----------  i = 4
Error joining thread -1073744128

Please help me identify what the problem could be that led to this issue.

chqrlie
  • 131,814
  • 10
  • 121
  • 189
user1502776
  • 553
  • 7
  • 17
  • 1
    @Alerra you are wrong. volatile does not guarantee coherency, lock atomicity of the access. Only says that the variable is side effects prone which forces read before use and store after change, nothing else – 0___________ Aug 13 '18 at 02:21
  • `(void *)i` , shouldn't that be `(void*)&i` ? – erik258 Aug 13 '18 at 02:28
  • 1
    On an error return from `pthread_join()`, it would be more informative to print the value or `ret` (an error number) than to print the thread ID. However, that's no an explanation of what went wrong. – Jonathan Leffler Aug 13 '18 at 02:46
  • @DanFarrell: Passing the address of `i` to `pthread_create()` would lead to all sorts of synchronization problems — the different threads would not necessarily see the value in `i` that was current at the time when `pthread_create()` is called. – Jonathan Leffler Aug 13 '18 at 02:47
  • If the code is indeed verbatim from a book, perhaps you should reveal its name to us -- *and* then burn it. – Antti Haapala -- Слава Україні Aug 13 '18 at 04:09

1 Answers1

4

In your code:

int status;

// ...

pthread_join( threadIds[i], (void **)&status );

this is a strict aliasing violation. The pthread_join writes a value of type void * to the location, however you provided the location of an int. This causes undefined behaviour, meaning anything can happen.

Possibly what happened on your system would be a write out of bounds (int = 4 bytes, void * = 8 bytes), scrambling the stack leading to a mess.

You can fix this with:

void *status;
// ...
pthread_join( threadIds[i], &status );

and later on, if you want to print status with %d, use (int)status for the argument.

M.M
  • 138,810
  • 21
  • 208
  • 365
  • Not only a strict aliasing violation, but also possibly UB under C17 6.3.2.3 §5 and §6, in case `int` cannot represent a pointer or vice versa. Quite likely nowadays, with 64 bit computers. – Lundin Aug 13 '18 at 06:52
  • @Lundin In general yes, although OP specified gcc on Linux which I expect would define that operation (which is implementation-defined); and we would expect that a round trip `int` -> `void *` -> `int` works. The converse trip might fail obviously – M.M Aug 13 '18 at 06:55