I'm reading Kerrisk's book and came across the following note:

Caution is required when using a cast integer as the return value of a thread’s start function. The reason for this is that PTHREAD_CANCELED, the value returned when a thread is canceled (see Chapter 32), is usually some implementation-defined integer value cast to void *. If a thread’s start function returns the same integer value, then, to another thread that is doing a pthread_join(), it will wrongly appear that the thread was canceled. In an application that employs thread cancellation and chooses to return cast integer values from a thread’s start functions, we must ensure that a normally terminating thread does not return an integer whose value matches PTHREAD_CANCELED on that Pthreads implementation. A portable application would need to ensure that normally terminating threads don’t return integer values that match PTHREAD_CANCELED on any of the implementations on which the application is to run.

I don't understand the importance of this note. Could you show a simple code snippet that illustrates it? What is the issue in this case?

1 Answer

This is a typical definition of PTHREAD_CANCELED (quoted verbatim from /usr/include/pthread.h on the machine where I'm typing this, which runs Linux with GNU libc):

#define PTHREAD_CANCELED ((void *) -1)

So if you have code like this to check for cancellation:

void *thread_result;
int rv = pthread_join(child, &thread_result);
if (rv)
    error_exit("pthread_join failed", rv);
if (thread_result == PTHREAD_CANCELED)
    error_exit("thread canceled", 0);

you must not also have a thread procedure like this:

static void *appears_to_be_canceled(void *unused)
{
    return ((void *) -1);
}

because PTHREAD_CANCELED and ((void *) -1) are equal, so the joining thread cannot tell a normal return from a cancellation. Note that the value is not guaranteed to be −1: it can differ from system to system, and there's no good way to find out what it is at compile time, because ((void *)...) isn't usable in an #if expression.

There are two good ways to avoid this problem:

  • Don't use thread cancellation, so you don't have to check for PTHREAD_CANCELED and don't have to care what its numeric value is. This is a good idea for several other reasons, most importantly that cancellation makes it even harder to write robust multithreaded code than it already is.
  • Return only valid pointers from your thread procedures, not integers cast to pointers. A good idiom is:

    struct worker_data
    {
       // put _everything_ your thread needs to access in here
    };
    static void *worker_proc (void *data_)
    {
       struct worker_data *data = data_;
       // do stuff with `data` here 
       return data_;
    }
    

    Returning the worker_data object means the code that calls pthread_join doesn't have to track which worker_data object corresponds to which pthread_t. And it also means the return value of a successfully completed thread is guaranteed not to be equal to PTHREAD_CANCELED, because PTHREAD_CANCELED is guaranteed not to compare equal to any valid pointer.

zwol