Here is my code:

#include <stdlib.h>
#include <unistd.h>
#include <stdio.h>
#include <pthread.h>

pthread_t ntid;

void
printids(const char *s) {
  printf("%s \n", s);
}

void *
thr_fn(void *arg)   {
  printids("new thread: ");
  return((void *)0);
}

int
main(void)  {
  pthread_create(&ntid, NULL, thr_fn, NULL);
  printids("main thread:");
}

I'm running it on Red Hat Enterprise Linux Workstation release 6.5.
Here is my compile command:

gcc -ansi -g -std=c99 -Wall -DLINUX -D_GNU_SOURCE threadid.c -o threadid -pthread -lrt -lbsd

Here is the output:

main thread:
new thread:
new thread:

Why has "new thread" been printed twice? I suspect this may be related to the buffering mechanism in Linux. But after I added fflush(stdout) and fsync(1) at the end of each function, the output was almost the same.

If you run the program several times, the output differs:

main thread:
new thread:

or

main thread:
new thread:
new thread:

or

main thread:

Yuan Wen
  • What do you mean with "almost the same"? If there are differences, what are they? – Some programmer dude Aug 14 '17 at 07:09
  • This doesn't compile. Please make it a [mcve]. I don't think the doubled output can be caused by the lines you provide here. –  Aug 14 '17 at 07:10
  • adding a declaration for `ntid` and a `pthread_join()` gives exactly the expected output, btw. –  Aug 14 '17 at 07:18
  • Cannot reproduce (after having added a `pthread_t ntid` definition). – Stephan Lechner Aug 14 '17 at 07:24
  • Did you actually try the code you posted here (which is at least complete now after the edit)? It gives the expected output, nothing is printed twice. It **might** not print `new thread:` **at all** as you don't do a `pthread_join()`, so the program could end before the new thread had a chance to run. But it's impossible this is printed twice. –  Aug 14 '17 at 07:25
  • You can run it multiple times. @StephanLechner – Yuan Wen Aug 14 '17 at 07:27
  • Gave it about 20 tries; at least no double output of `new thread:`. – Stephan Lechner Aug 14 '17 at 07:29
  • I can expect a scenario where **only** main thread gets printed, because you don't wait for the created thread. But I cannot reproduce double new thread. – Ajay Brahmakshatriya Aug 14 '17 at 07:34
  • By the way, there's only one buffer *per process*. The worst that could happen is that the output from the main and child threads become intermingled, but you should not get duplication. Duplication similar to what you show can happen when creating a child *process* though, but should not happen with threads. – Some programmer dude Aug 14 '17 at 07:45
  • @Someprogrammerdude I believe you are referring to the buffer associated with `stdout` from the operating system, but the libc implementation could have a per thread output buffer. – Ajay Brahmakshatriya Aug 14 '17 at 07:46
  • @AjayBrahmakshatriya No I mean the C standard I/O buffering used by `printf`. And no matter if it's implemented using one buffer per process, or one per thread, this duplication should not happen. Especially if it's one buffer per thread. – Some programmer dude Aug 14 '17 at 07:49
  • @Someprogrammerdude what do you think about the scenario I mentioned in my answer below? I agree the case I mentioned would be a bug in the implementation of libc, but could be possible. – Ajay Brahmakshatriya Aug 14 '17 at 07:52
  • Can you post the output from `ldd threadid`? If you're linking in some library with a non-thread-safe `stdout` stream, that could explain your inconsistent results. – Andrew Henle Aug 14 '17 at 09:05
  • linux-vdso.so.1 => (0x00007fff6efff000) librt.so.1 => /lib64/librt.so.1 (0x0000003f77c00000) libpthread.so.0 => /lib64/libpthread.so.0 (0x0000003f77400000) libc.so.6 => /lib64/libc.so.6 (0x0000003f76c00000) /lib64/ld-linux-x86-64.so.2 (0x0000003f76400000) @AndrewHenle – Yuan Wen Aug 15 '17 at 01:55

2 Answers

Most libc implementations buffer output, as you mentioned. At the end of the program (when the main thread exits), they flush all the buffers and exit.

There is a slight possibility that your new thread had flushed the output, but before it could update the state of the buffer, the main program exited and the cleanup code flushed the same buffer again. Since these buffers are local to the thread, I am sure they won't have a concurrency mechanism, so in this rare case the buffer state might get messed up.

You can try the following at the end of the main function and check whether the problem is solved:

int err = pthread_create(&ntid, NULL, thr_fn, NULL);
printids("main thread:");
pthread_join(ntid, NULL);

This will cause your main function to wait till the new thread is finished (including the flushing operation it does).
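
As a minimal sketch of that suggestion (not the asker's exact program), the function below creates one thread, prints from the caller, and joins before returning. The names `run_joined_thread` and `runs` are hypothetical and exist only for illustration; the counter is there so the caller can check that the thread function ran exactly once.

```c
#include <pthread.h>
#include <stdio.h>

/* Hypothetical counter (not in the original post): how many times
   the thread function has run. */
static int runs = 0;

static void *thr_fn(void *arg) {
    (void)arg;               /* unused */
    runs++;                  /* only read by the caller after pthread_join */
    printf("new thread:\n");
    return NULL;
}

/* Creates one thread, prints from the calling thread, then waits for
   the new thread to finish. Returns the number of times thr_fn ran,
   or -1 if thread creation or joining failed. */
int run_joined_thread(void) {
    pthread_t tid;

    runs = 0;
    if (pthread_create(&tid, NULL, thr_fn, NULL) != 0)
        return -1;

    printf("main thread:\n");

    /* Without this join, returning from main could end the process
       before thr_fn has had a chance to run. */
    if (pthread_join(tid, NULL) != 0)
        return -1;

    return runs;
}
```

Calling `run_joined_thread()` from `main` and building with `-pthread` should print each line once, although the order of the two lines is not guaranteed.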

Ajay Brahmakshatriya
  • Given Linux's conflation of threads and processes (they're similar, but different), I suspect that this is the cause of the trouble when the main thread exits without waiting for the child thread to complete. – Jonathan Leffler Aug 14 '17 at 07:41
  • @JonathanLeffler: I disagree. GLIBC does have internal locking for multithreading, as the various interfaces (including `fflush()`) are marked MT-Safe. I believe the underlying cause is the fact that `return` from `main()` in Linux immediately kills all threads (both that and `exit()` actually invoke the `exit_group()` syscall). Simply put, I don't believe the newly started thread(s) always get(s) to run before it is killed. – Nominal Animal Aug 14 '17 at 07:49
  • @NominalAnimal how can we explain the duplication? (if it indeed is happening, I cannot reproduce). – Ajay Brahmakshatriya Aug 14 '17 at 07:51
  • @NominalAnimal and I agree that the implementation of `fflush` must be having locks but on the side that it deals with the stdout fd. Are there locks for updating the per thread buffer states? – Ajay Brahmakshatriya Aug 14 '17 at 07:56
  • @AjayBrahmakshatriya: The locks are [per stream](https://sourceware.org/git/?p=glibc.git;a=blob;f=libio/iofflush.c;hb=HEAD), not per thread, and the final flush at end of process has an [explicit lock](https://sourceware.org/git/?p=glibc.git;a=blob;f=libio/genops.c;hb=HEAD). Unless the OP has a C library compiled for !_IO_MTSAFE_IO, or OP is using _unlocked() versions of the I/O functions, I don't see any way you could get a duplicate output using only one thread (in addition to the main one). – Nominal Animal Aug 14 '17 at 08:11

Double output is possible on glibc-based Linux systems due to a nasty bug in glibc: if the FILE lock is already held at the time `exit` tries to flush, the lock is simply ignored and the buffer access is performed with no synchronization. This would be a great test case to report to glibc to pressure them to fix it, if you can reproduce it reliably.

R.. GitHub STOP HELPING ICE
  • Looks like there are even more threading bugs in glibc's exit handling: https://sourceware.org/bugzilla/show_bug.cgi?id=14333 – Andrew Henle Aug 14 '17 at 11:33