When does the routine passed to pthread_create start?

Question

Given the following code

#include <pthread.h>
#include <stdio.h>

void *pt_routine(void *arg)
{
    pthread_t *tid;
    tid = (pthread_t *) arg;
    /* do something with tid , say printf?*/
    /*
    printf("The thread ID is %lu\n", *tid);
    */
    return NULL;
}

int main(int argc, char **argv)
{
    int rc;
    pthread_t tid;
    rc = pthread_create(&tid, NULL, pt_routine, &tid);
    if (rc)
    {
        return 1;
    }
    printf("The new thread is %lu\n", tid);
    pthread_join(tid, NULL);
    return 0;
}

Can the routine ALWAYS get the right tid?

Of course I could call pthread_self() to fetch the thread's own ID, but I just wonder when the routine actually starts running.

"When does the routine passed to pthread_create start?", after the call to `pthread_create()` ? Your question is unclear. "Can the routine ALWAYS get the right tid?" what do you mean ? — Stargateur, Jun 23 '17 at 04:45
You're passing `&tid` as both the first argument of `pthread_create` (the place where `pthread_create` should store the new thread's ID) and the fourth argument of `pthread_create` (an argument to be passed into `pt_routine`). It *sounds* like you're asking whether it's possible for `pt_routine` to run and dereference its `arg` before `pthread_create` has actually stored the thread's ID at that address. Is that correct? — Wyzard, Jun 23 '17 at 04:49
Possible duplicate of [pthread execution on linux](https://stackoverflow.com/questions/4991470/pthread-execution-on-linux), [Pthread Run a thread right after it's creation](https://stackoverflow.com/q/12536649/608639), etc. — jww, Aug 14 '19 at 12:55

Antti Haapala -- Слава Україні · Accepted Answer · 2017-06-23T05:17:55.470

Well, there are actually two questions:

1. Which thread will execute first?
2. Will the thread ID be saved before the new thread starts?

This answer concerns Linux, as I don't have any other platforms available. The answer to the first question can be found in the manuals:

Unless real-time scheduling policies are being employed, after a call to pthread_create(), it is indeterminate which thread—the caller or the new thread—will next execute.

So it is clear that in your case it is indeterminate which thread will actually run first. The other question is how pthread_create is implemented: could it somehow create a dormant thread, store its ID first, and only then start it?

Well, Linux creates the new thread using the clone system call:

clone(child_stack=0x7f7b35031ff0, 
      flags=CLONE_VM|CLONE_FS|CLONE_FILES|CLONE_SIGHAND|CLONE_THREAD|CLONE_SYSVSEM
          |CLONE_SETTLS|CLONE_PARENT_SETTID|CLONE_CHILD_CLEARTID,
      parent_tidptr=0x7f7b350329d0,
      tls=0x7f7b35032700,
      child_tidptr=0x7f7b350329d0) = 24009

Now, it seems that the thread ID is stored through a pointer passed to the clone call, but child_tidptr clearly doesn't refer to the address of tid: if I print the address of tid, it is different. The pointer refers to some internal variable within the pthread library, and tid would be updated after the clone system call returns in the parent thread.

And indeed, the pthread_self manual page says the following:

The thread ID returned by pthread_self() is not the same thing as the kernel thread ID returned by a call to gettid(2).

This confirms that kernel thread IDs are distinct from pthread_t values.
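To see the two kinds of ID side by side, here is a minimal Linux-only sketch. The names `kernel_tid`, `show_ids` and `compare_ids` are made up for illustration, and printing a `pthread_t` as an unsigned long assumes glibc, where `pthread_t` is an integer type; POSIX does not promise that.

```c
#define _GNU_SOURCE
#include <pthread.h>
#include <stdio.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* Linux-only: ask the kernel for the calling thread's ID. */
static pid_t kernel_tid(void)
{
    return (pid_t) syscall(SYS_gettid);
}

/* Thread routine: records its kernel tid and prints both IDs. */
static void *show_ids(void *arg)
{
    pid_t *out = arg;

    *out = kernel_tid();
    /* Assumption: pthread_t can be cast to unsigned long (true on glibc). */
    printf("pthread_self() = %lu, kernel tid = %ld\n",
           (unsigned long) pthread_self(), (long) *out);
    return NULL;
}

/* Hypothetical helper: runs show_ids in a new thread and stores that
 * thread's kernel tid in *child_tid. Returns 0 on success, -1 on error. */
int compare_ids(pid_t *child_tid)
{
    pthread_t t;

    if (pthread_create(&t, NULL, show_ids, child_tid) != 0)
        return -1;
    return pthread_join(t, NULL) == 0 ? 0 : -1;
}
```

Calling `compare_ids(&child)` from `main` (built with `-pthread`) prints one line for the new thread. On a typical glibc system the two numbers look nothing alike, which matches the manual's statement.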

Thus, in addition to this not being supported by the POSIX spec, there is no such guarantee on the Linux platform in practice. The tid needs to be set in the parent thread after clone returns, since otherwise the parent wouldn't immediately know the thread ID of the child. This also means that if the child is the first to execute after clone returns, the thread ID might not have been stored in tid yet.

score 2 · Answer 2 · answered Jun 23 '17 at 05:00

pt_routine() will begin execution at some arbitrary point after pthread_create() is called, and it might start running before pthread_create() returns to the calling code. There is also no guarantee that the pthread_create() implementation will update the tid variable before the thread starts execution.

So there is nothing in your code that ensures that pt_routine() will read the tid value properly. You would need to use some sort of synchronization to ensure that occurs properly without a data race. Or you could have the thread call pthread_self().
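One possible form of that synchronization, sketched under assumptions: the creator holds a mutex across the pthread_create() call, and the new thread takes the same mutex before reading the stored ID. The struct `start_info` and the function `start_with_handshake` are hypothetical names, not part of the question's code.

```c
#include <pthread.h>

/* Hypothetical bundle passed to the thread: the lock plus the ID slot. */
struct start_info {
    pthread_mutex_t lock;
    pthread_t tid;
};

static void *pt_routine(void *arg)
{
    struct start_info *info = arg;
    pthread_t seen;

    /* Blocks until the creator has returned from pthread_create()
     * and released the lock, so info->tid is already stored. */
    pthread_mutex_lock(&info->lock);
    seen = info->tid;
    pthread_mutex_unlock(&info->lock);

    /* Report success by returning a non-NULL pointer. */
    return pthread_equal(seen, pthread_self()) ? info : NULL;
}

/* Returns 1 if the thread saw its own ID, 0 if not, -1 on error. */
int start_with_handshake(void)
{
    struct start_info info;
    void *result = NULL;
    int rc;

    if (pthread_mutex_init(&info.lock, NULL) != 0)
        return -1;

    pthread_mutex_lock(&info.lock);
    rc = pthread_create(&info.tid, NULL, pt_routine, &info);
    pthread_mutex_unlock(&info.lock);   /* tid is stored by now */

    if (rc != 0 || pthread_join(info.tid, &result) != 0) {
        pthread_mutex_destroy(&info.lock);
        return -1;
    }
    pthread_mutex_destroy(&info.lock);
    return result != NULL ? 1 : 0;
}
```

The lock does two jobs here: it delays the read until pthread_create() has returned, and it orders the creator's write before the thread's read, which removes the data race. A semaphore or a condition variable would work as well; the mutex is just the shortest version.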

See the "Application Usage" section of the POSIX spec for pthread_create():

There is no requirement on the implementation that the ID of the created thread be available before the newly created thread starts executing. The calling thread can obtain the ID of the created thread through the return value of the pthread_create() function, and the newly created thread can obtain its ID by a call to pthread_self().
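Following that advice, a minimal sketch of the pthread_self() route might look like the code below. The thread never touches the creator's tid variable, so it does not matter which thread runs first. The `ran` flag and the function `run_with_self` are illustrative names, and the cast to unsigned long again assumes glibc's integer `pthread_t`.

```c
#include <pthread.h>
#include <stdio.h>

static void *pt_routine(void *arg)
{
    int *ran = arg;

    /* The thread asks for its own ID instead of reading the creator's
     * variable, so there is nothing to race on.
     * Assumption: pthread_t can be cast to unsigned long (true on glibc). */
    printf("The thread ID is %lu\n", (unsigned long) pthread_self());
    *ran = 1;
    return NULL;
}

/* Returns 1 if the thread ran to completion, -1 on error. */
int run_with_self(void)
{
    pthread_t tid;
    int ran = 0;

    if (pthread_create(&tid, NULL, pt_routine, &ran) != 0)
        return -1;
    if (pthread_join(tid, NULL) != 0)
        return -1;
    return ran;   /* safe to read: pthread_join() orders the write before us */
}
```

Calling `run_with_self()` from `main` prints the new thread's ID once and needs no extra synchronization for the ID itself.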

Another [quote](http://man7.org/linux/man-pages/man3/pthread_create.3.html) from linux man page: "See pthread_self(3) for further information on the thread ID returned in *thread by pthread_create(). Unless real-time scheduling policies are being employed, after a call to pthread_create(), it is indeterminate which thread—the caller or the new thread—will next execute." — Stargateur, Jun 23 '17 at 05:02
