Currently, the logic of perror in the glibc source is as follows:
If stderr is already oriented, use it as is; otherwise dup() it and do the perror() output on the dup()'ed fd.
If stderr is wide-oriented, the following logic from stdio-common/fxprintf.c is used:
size_t len = strlen (fmt) + 1;
wchar_t wfmt[len];
for (size_t i = 0; i < len; ++i)
  {
    assert (isascii (fmt[i]));
    wfmt[i] = fmt[i];
  }
res = __vfwprintf (fp, wfmt, ap);
The format string is converted to wide-character form by the following code, which I do not understand:
wfmt[i] = fmt[i];
It also asserts isascii on every character:
assert (isascii(fmt[i]));
But the format string is not always ASCII in wide-character programs, because we may use a UTF-8 format string, which can contain bytes with the high bit set. Why is there no assertion failure when we run the following code (assuming a UTF-8 locale and a compiler using UTF-8 encoding)?
#include <stdio.h>
#include <errno.h>
#include <wchar.h>
#include <locale.h>

int main(void)
{
    setlocale(LC_CTYPE, "en_US.UTF-8");
    fwide(stderr, 1);        /* make stderr wide-oriented */
    errno = EINVAL;
    perror("привет мир");    /* note that the string is multibyte */
    return 0;
}
$ ./a.out
привет мир: Invalid argument
Can we use dup() on a wide-oriented stderr to get a stream that is not wide-oriented? In that case the code could be rewritten without this mysterious conversion, since perror() takes only multibyte strings (const char *s) and locale messages are all multibyte anyway.
Turns out we can. The following code demonstrates this:
#include <stdio.h>
#include <wchar.h>
#include <unistd.h>

int main(void)
{
    fwide(stdout, 1);    /* make stdout wide-oriented */

    FILE *fp;
    int fd = -1;
    if ((fd = fileno (stdout)) == -1) return 1;
    if ((fd = dup (fd)) == -1) return 1;
    if ((fp = fdopen (fd, "w+")) == NULL) return 1;

    /* fwide(stream, 0) only queries the orientation:
       > 0 wide, < 0 byte, 0 not yet oriented */
    wprintf(L"stdout: %d, dup: %d\n", fwide(stdout, 0), fwide(fp, 0));
    return 0;
}
$ ./a.out
stdout: 1, dup: 0
BTW, is it worth posting an issue about this improvement to glibc developers?
NOTE
Using dup() has a limitation with respect to buffering. I wonder whether the implementation of perror() in glibc takes it into account. The following example demonstrates the issue.
The output appears not in the order in which the streams were written to, but in the order in which their buffers are flushed.
Note that the order of values in the output differs from the order in the program: the output of fprintf is flushed first (because of the "\n"), while the output of fwprintf is flushed only when the program exits.
#include <wchar.h>
#include <stdio.h>
#include <unistd.h>

int main(void)
{
    wint_t wc = L'b';
    fwprintf(stdout, L"%lc", wc);
    /* --- */
    FILE *fp;
    int fd = -1;
    if ((fd = fileno (stdout)) == -1) return 1;
    if ((fd = dup (fd)) == -1) return 1;
    if ((fp = fdopen (fd, "w+")) == NULL) return 1;
    char c = 'h';
    fprintf(fp, "%c\n", c);
    return 0;
}
$ ./a.out
h
b
But if we add \n to the fwprintf format, the output order matches the order in the program:
$ ./a.out
b
h
perror() manages to get away with that, because in GNU libc stderr is unbuffered. But will it work safely in programs where stderr is manually set to buffered mode?
This is the patch that I would propose to the glibc developers:
diff -urN glibc-2.24.orig/stdio-common/perror.c glibc-2.24/stdio-common/perror.c
--- glibc-2.24.orig/stdio-common/perror.c 2016-08-02 09:01:36.000000000 +0700
+++ glibc-2.24/stdio-common/perror.c 2016-10-10 16:46:03.814756394 +0700
@@ -36,7 +36,7 @@
errstring = __strerror_r (errnum, buf, sizeof buf);
- (void) __fxprintf (fp, "%s%s%s\n", s, colon, errstring);
+ (void) _IO_fprintf (fp, "%s%s%s\n", s, colon, errstring);
}
@@ -55,7 +55,7 @@
of the stream. What is supposed to happen when the stream isn't
oriented yet? In this case we'll create a new stream which is
using the same underlying file descriptor. */
- if (__builtin_expect (_IO_fwide (stderr, 0) != 0, 1)
+ if (__builtin_expect (_IO_fwide (stderr, 0) < 0, 1)
|| (fd = __fileno (stderr)) == -1
|| (fd = __dup (fd)) == -1
|| (fp = fdopen (fd, "w+")) == NULL)