Linux process ID and thread ID

Question

Suppose we have many user processes running on Linux. Each process has many threads running.

I can get process ID by calling getpid(), the return value of which is an integer.

I can get thread ID by calling pthread_self(), the return value of which is an opaque type called pthread_t.

Now I need to store the process ID (an int, typically 4 bytes) and the thread ID (pthread_t, need to figure out how many bytes) in shared memory so that I can later use the two pieces of ID information to identify that specific thread and to check if the thread is still running or not.

I've found many online sources that cast pthread_t to either unsigned int or unsigned long. Since I don't want any data loss during the cast, how should I handle the pthread_t data so that it's a fixed-size piece of data? (As mentioned, I need to store the thread information in shared memory.)

Also, how should I identify that specific thread by combination of process ID and thread ID later? How to check if the thread is still running or not?

Process IDs and thread IDs can be reused, so I believe your entire approach is flawed... — Nemo, Dec 19 '11 at 18:50
@Nemo You are right. Let's assume different processes and threads always have different IDs to separate the concerns. — Terry Li, Dec 19 '11 at 18:52

score 4 · Accepted Answer · answered Dec 19 '11 at 18:53

If you want to store pid_t and pthread_t anywhere, you should use their respective types (i.e. "pid_t" and "pthread_t"). So if you want to store them in shared memory somewhere, do a memcpy() to get them there.

As far as identifying specific threads by combinations of PID and TID, see Nemo's comment.

If you do assume that they still exist, your program can look in /proc for the appropriate PID's directory, and then in /proc/<pid>/task for that process's threads.
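As a rough illustration of the /proc lookup described above, here is a minimal sketch, assuming Linux with /proc mounted. The helper names `current_tid` and `thread_exists` are hypothetical, and the entries under /proc/<pid>/task are kernel thread IDs (obtained here through the `gettid` system call), not `pthread_t` values.

```c
#define _GNU_SOURCE
#include <stdio.h>
#include <sys/stat.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* Kernel thread ID of the calling thread. This is the number that shows
 * up under /proc/<pid>/task, unlike the value of pthread_self(). */
static pid_t current_tid(void)
{
    return (pid_t)syscall(SYS_gettid);
}

/* Hypothetical helper: returns 1 if /proc/<pid>/task/<tid> exists,
 * 0 otherwise. Because IDs can be reused, "exists" does not prove that
 * it is the same thread that was recorded earlier. */
static int thread_exists(pid_t pid, pid_t tid)
{
    char path[64];
    struct stat st;

    snprintf(path, sizeof path, "/proc/%ld/task/%ld", (long)pid, (long)tid);
    return stat(path, &st) == 0;
}
```

A writer thread could record `getpid()` and `current_tid()` in shared memory, and a reader could later pass both values to `thread_exists`, keeping the ID-reuse caveat from the comments in mind.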

answered Dec 19 '11 at 18:53

Dan Fego


They're opaque; you shouldn't try to use them in any way other than "some block I don't know the specifics of," since their implementation could change at any time. – Dan Fego Dec 19 '11 at 18:56
So memcpy() can figure out how many bytes to copy for me? Then what I need to do is reserve a block of memory that is big enough to hold any pthread_t type? By the way, checking /proc//task is the only way to get the status of that specific thread? – Terry Li Dec 19 '11 at 19:25
`memcpy(dest, src, sizeof(pthread_t))` should be able to copy any `pthread_t`, and the same for `pid_t`. As far as using the proc filesystem, I do believe that's the Linux way to do it. I'm not sure if there are any other (or better) ways. I would recommend using a struct if you want them together, as Matteo Italia suggested. – Dan Fego Dec 19 '11 at 19:31
I'd like to reserve the same amount of memory for each process and thread ID pair. On the same system, would sizeof(pthread_t) and sizeof(pid_t) always give us the same size? – Terry Li Dec 19 '11 at 19:37
@TerryLiYifeng: Yes, they should always yield the same size (to themselves, not necessarily to each other). sizeof() values are determined at compile-time, and `pthread_t` and `pid_t` are just types like any other, with a fixed size for a given system. – Dan Fego Dec 19 '11 at 19:41
Types in C are static: sizeof is evaluated at compile time, so it can't give different results on the same platform. By the way, any change in the size of `pthread_t` is an ABI-breaking change, which means that all the applications that use pthread would have to be recompiled (i.e. it surely doesn't change behind your back). – Matteo Italia Dec 19 '11 at 19:46
You don't even need `memcpy` for this. Just casting the shared address to `pthread_t*` and dereferencing it should do, something like `*(pthread_t*)where = pthread_self()`. No need for `sizeof`. – Jens Gustedt Dec 19 '11 at 21:09
Threads also exist at /proc/, they are just not shown in /proc (readdir) by default. – jørgensen Dec 19 '11 at 21:17
Suppose threads of process 24532 are running. I check /proc/24532/task, only to find tasks identified by 24532, 24533, 24534, etc. I assume these are all thread IDs; however, they are very different from the thread IDs returned by pthread_self(). The return value of that call is something like 139777191794432. I'm a little confused. – Terry Li Dec 21 '11 at 15:51
@TerryLiYifeng: You've got to avoid looking at pthread_t as an integer, because it's not (necessarily) one. It's an opaque type that only the system is supposed to know the structure of and how to operate on. So the number you get is meaningless as a number; it's likely a pointer or a struct or something. – Dan Fego Dec 21 '11 at 16:01
Then what are possible ways to identify specific threads in that directory? Are there any system routines for that purpose? If not, how should we compare pthread_t values to identify the thread? – Terry Li Dec 21 '11 at 16:06
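@TerryLiYifeng: At least on Linux, the man page for pthreads(7) states: "Calls to getpid(2) return a different value in each thread." Note that this sentence describes the obsolete LinuxThreads implementation; under NPTL, getpid() returns the same value in every thread of a process, and the per-thread kernel ID comes from gettid(2) instead. – Dan Fego Dec 21 '11 at 16:07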
That's most strange that all threads in a single process always get the same value by calling getpid(). – Terry Li Dec 21 '11 at 16:10
@TerryLiYifeng: On Linux, IIRC, threads are implemented as actual processes, so it makes at least some sense. :) – Dan Fego Dec 21 '11 at 16:12
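To make the difference between the IDs discussed in these comments concrete, here is a minimal sketch, assuming Linux with NPTL. The struct `ids` and the helpers `record_ids` and `ids_in_new_thread` are hypothetical names. It records `getpid()` and the kernel thread ID (via the `gettid` system call) from inside a second thread; the kernel thread ID is the number that appears in /proc/<pid>/task.

```c
#define _GNU_SOURCE
#include <pthread.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* What a thread observes about itself. */
struct ids {
    pid_t pid;  /* getpid(): shared by all threads of the process (NPTL) */
    pid_t tid;  /* gettid(): unique per thread, matches /proc/<pid>/task */
};

static void *record_ids(void *arg)
{
    struct ids *out = arg;

    out->pid = getpid();
    out->tid = (pid_t)syscall(SYS_gettid);
    return NULL;
}

/* Runs record_ids in a new thread and waits for it.
 * Returns 0 on success, -1 if the thread could not be created or joined. */
static int ids_in_new_thread(struct ids *out)
{
    pthread_t t;

    if (pthread_create(&t, NULL, record_ids, out) != 0)
        return -1;
    return pthread_join(t, NULL) == 0 ? 0 : -1;
}
```

Called from the main thread, the recorded `pid` should equal the caller's `getpid()`, while the recorded `tid` should differ from it, since only the main thread's kernel ID coincides with the process ID.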

score 3 · Answer 2 · edited May 23 '17 at 11:48

You can use pthread_join as a crude way of detecting completion, but I am sure that is not what you want. Instead, you must handle this yourself by creating a thread-complete flag. A nice place to set this flag is in the pthread cleanup handlers. See this related post.
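The flag idea could look roughly like the sketch below. The names `worker_done`, `mark_done`, `worker`, and `run_worker` are hypothetical, and the flag lives in ordinary process memory here; for the cross-process case in the question it would presumably have to be placed in the shared memory region instead. A cleanup handler runs on `pthread_exit`, on cancellation, or when popped with a non-zero argument, but not if the whole process is killed.

```c
#include <pthread.h>
#include <stdatomic.h>

/* Completion flag: 0 while the worker runs, 1 once it has finished. */
static atomic_int worker_done;

/* Cleanup handler: marks the flag passed in arg as done. */
static void mark_done(void *arg)
{
    atomic_store((atomic_int *)arg, 1);
}

static void *worker(void *arg)
{
    (void)arg;
    /* push and pop must appear as a pair in the same lexical scope */
    pthread_cleanup_push(mark_done, &worker_done);
    /* ... the thread's real work would go here ... */
    pthread_cleanup_pop(1);  /* non-zero: run the handler on normal exit too */
    return NULL;
}

/* Starts the worker and waits for it. Returns 0 on success, -1 on error. */
static int run_worker(void)
{
    pthread_t t;

    if (pthread_create(&t, NULL, worker, NULL) != 0)
        return -1;
    return pthread_join(t, NULL) == 0 ? 0 : -1;
}
```

Other threads can poll `atomic_load(&worker_done)` instead of blocking in `pthread_join`.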

answered Dec 19 '11 at 18:57

Chris Mansley


score 2 · Answer 3 · answered Dec 19 '11 at 18:54

Why don't you just pack them in a struct?

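#include <pthread.h>    /* pthread_t */
#include <sys/types.h>  /* pid_t */

typedef struct
{
    pid_t procID;        /* pid_t rather than int, so nothing is truncated */
    pthread_t threadID;  /* opaque; compare only with pthread_equal() */
} ProcThreadID;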

without worrying about the specific underlying type of pthread_t (after all we are in C, so everything is POD and can be copied blindly with memcpy).

You can get its size easily using the sizeof operator:

size_t ptIDSize = sizeof(ProcThreadID);

and you can copy it wherever you want with a simple memcpy.
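As a minimal sketch of that copy, the helpers below write the struct into a slot of a byte region and read it back. The helper names `store_id`, `load_id`, and `is_current_thread` are hypothetical, the struct uses `pid_t` for the process ID as the accepted answer recommends, and any suitably sized buffer stands in for the shared memory here. Note that a `pthread_t` is generally only meaningful inside the process that produced it, so the comparison is restricted to that case.

```c
#include <pthread.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

typedef struct {
    pid_t     procID;
    pthread_t threadID;
} ProcThreadID;

/* Copies the caller's IDs into slot, which must point to at least
 * sizeof(ProcThreadID) bytes (for example inside a shared memory region). */
static void store_id(void *slot)
{
    ProcThreadID id;

    memset(&id, 0, sizeof id);  /* also zero any padding bytes */
    id.procID = getpid();
    id.threadID = pthread_self();
    memcpy(slot, &id, sizeof id);
}

/* Reads a record back out of a slot written by store_id. */
static ProcThreadID load_id(const void *slot)
{
    ProcThreadID id;

    memcpy(&id, slot, sizeof id);
    return id;
}

/* Returns non-zero if the record names the calling thread of this process.
 * pthread_t values are compared with pthread_equal, never with ==. */
static int is_current_thread(const ProcThreadID *id)
{
    return id->procID == getpid() &&
           pthread_equal(id->threadID, pthread_self());
}
```

Because every record is exactly `sizeof(ProcThreadID)` bytes, an array of such slots gives each process/thread pair the same amount of shared memory.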

answered Dec 19 '11 at 18:54

Matteo Italia


I have to serialise pthread_t data into shared memory, so I'm not sure how it would work out for me. – Terry Li Dec 19 '11 at 18:57
Since `pthread_t` is a `typedef` for some C data type, it is copyable with `memcpy` *by definition*. If it were C++ (where objects can have copy constructors, overloaded assignment operators, vtable pointers and other hidden stuff) you would have to be careful, but here there's no danger at all. – Matteo Italia Dec 19 '11 at 19:00
I'm gonna use this. By the way, any idea how I can identify a specific thread by its process ID and thread ID and check if it's still running? – Terry Li Dec 19 '11 at 19:30

score 0 · Answer 4 · edited Oct 14 '21 at 14:28

Command to list the IDs of the threads running in a process:

$ ps -eLf | grep 14965 
UID        PID              PPID     LWP           C  NLWP STIME TTY      TIME     CMD 
root       14965            14732    14965         0  201  15:28 pts/10   00:00:00 ./a.out 
root       14965            14732    14966         0  201  15:28 pts/10   00:00:00 ./a.out 
root       14965            14732    14967         0  201  15:28 pts/10   00:00:00 ./a.out 
root       14965            14732    14968         0  201  15:28 pts/10   00:00:00 ./a.out 
root       14965            14732    14969         0  201  15:28 pts/10   00:00:00 ./a.out 
root       14965            14732    14970         0  201  15:28 pts/10   00:00:00 ./a.out 
root       14965            14732    14971         0  201  15:28 pts/10   00:00:00 ./a.out 
root       14965            14732    14972         0  201  15:28 pts/10   00:00:00 ./a.out

Here the 4th column (LWP) shows the IDs of all the threads running in the process with ID 14965.
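The same information can be read from within a program. The sketch below, assuming Linux with /proc mounted, counts the entries of /proc/<pid>/task, which correspond to the LWP column shown by ps. The function name `count_threads` is hypothetical.

```c
#include <ctype.h>
#include <dirent.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Returns the number of threads (LWPs) of process pid, taken from the
 * numeric entries in /proc/<pid>/task, or -1 if the directory cannot
 * be opened (for example because the process no longer exists). */
static int count_threads(pid_t pid)
{
    char path[64];
    DIR *dir;
    struct dirent *entry;
    int count = 0;

    snprintf(path, sizeof path, "/proc/%ld/task", (long)pid);
    dir = opendir(path);
    if (dir == NULL)
        return -1;

    while ((entry = readdir(dir)) != NULL) {
        /* each thread is a directory named after its ID; skip "." and ".." */
        if (isdigit((unsigned char)entry->d_name[0]))
            count++;
    }
    closedir(dir);
    return count;
}
```

For the process in the ps listing above, the count would be expected to match the NLWP column, as long as no thread starts or exits between the two observations.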
