GCC 4.8, 5.1, 6.2 and Clang 3.8.1 on Ubuntu 16.10 with -std=c11, -std=c++11, -std=c++14, and -std=c++17 all exhibit this weird behaviour when using fgetws(buf, (int) bufsize, stdin) after setlocale(LC_ALL, "any_THING.utf8").
Example program:
#include <locale.h>
#include <wchar.h>
#include <stdlib.h>
#include <stdio.h>
int main(const int argc, const char* const * const argv) {
    (void) argc;
    setlocale(LC_ALL, argv[1]);
    const size_t len = 3;
    wchar_t *buf = (wchar_t *) malloc(sizeof (wchar_t) * len),
            *stat = fgetws(buf, (int) len, stdin);
    wprintf(L"[%ls], [%ls]\n", stat, buf);
    free(buf);
    return EXIT_SUCCESS;
}
Casting malloc is just for C++ compatibility.
Compile it like this: cc -std=c11 fg.c -o fg
Run it under Valgrind with argv[1] = "C", piping ten '5' characters to stdin, and we find:
$ python3 -c 'print("5" * 10)' | \
valgrind --leak-check=full --track-origins=yes --show-leak-kinds=all ./f C
==1775== Memcheck, a memory error detector
==1775== Copyright (C) 2002-2015, and GNU GPL'd, by Julian Seward et al.
==1775== Using Valgrind-3.12.0.SVN and LibVEX; rerun with -h for copyright info
==1775== Command: ./f C
==1775==
[55], [55]
==1775==
==1775== HEAP SUMMARY:
==1775== in use at exit: 0 bytes in 0 blocks
==1775== total heap usage: 5 allocs, 5 frees, 25,612 bytes allocated
==1775==
==1775== All heap blocks were freed -- no leaks are possible
==1775==
==1775== For counts of detected and suppressed errors, rerun with: -v
==1775== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
The program works perfectly, and there are no memory errors.
If it is run with a UTF-8 locale as argv[1], then we get the right output, but also an invalid read at address 0x18 and a fatal segmentation fault.
$ python3 -c 'print("5" * 10)' | \
valgrind --leak-check=full --track-origins=yes --show-leak-kinds=all ./f en_US.utf8
==1934== Memcheck, a memory error detector
==1934== Copyright (C) 2002-2015, and GNU GPL'd, by Julian Seward et al.
==1934== Using Valgrind-3.12.0.SVN and LibVEX; rerun with -h for copyright info
==1934== Command: ./f en_US.utf8
==1934==
[55], [55]
==1934== Invalid read of size 8
==1934== at 0x4EAF575: _IO_wfile_sync (wfileops.c:534)
==1934== by 0x4EB6DB1: _IO_default_setbuf (genops.c:523)
==1934== by 0x4EB2FC8: _IO_file_setbuf@@GLIBC_2.2.5 (fileops.c:459)
==1934== by 0x4EB79B5: _IO_unbuffer_all (genops.c:921)
==1934== by 0x4EB79B5: _IO_cleanup (genops.c:966)
==1934== by 0x4E73282: __run_exit_handlers (exit.c:96)
==1934== by 0x4E73339: exit (exit.c:105)
==1934== by 0x4E593F7: (below main) (libc-start.c:325)
==1934== Address 0x18 is not stack'd, malloc'd or (recently) free'd
==1934==
==1934==
==1934== Process terminating with default action of signal 11 (SIGSEGV)
==1934== Access not within mapped region at address 0x18
==1934== at 0x4EAF575: _IO_wfile_sync (wfileops.c:534)
==1934== by 0x4EB6DB1: _IO_default_setbuf (genops.c:523)
==1934== by 0x4EB2FC8: _IO_file_setbuf@@GLIBC_2.2.5 (fileops.c:459)
==1934== by 0x4EB79B5: _IO_unbuffer_all (genops.c:921)
==1934== by 0x4EB79B5: _IO_cleanup (genops.c:966)
==1934== by 0x4E73282: __run_exit_handlers (exit.c:96)
==1934== by 0x4E73339: exit (exit.c:105)
==1934== by 0x4E593F7: (below main) (libc-start.c:325)
==1934== If you believe this happened as a result of a stack
==1934== overflow in your program's main thread (unlikely but
==1934== possible), you can try to increase the size of the
==1934== main thread stack using the --main-stacksize= flag.
==1934== The main thread stack size used in this run was 8388608.
==1934==
==1934== Process terminating with default action of signal 11 (SIGSEGV)
==1934== Access not within mapped region at address 0x18
==1934== at 0x4EAF575: _IO_wfile_sync (wfileops.c:534)
==1934== by 0x4EB6DB1: _IO_default_setbuf (genops.c:523)
==1934== by 0x4EB2FC8: _IO_file_setbuf@@GLIBC_2.2.5 (fileops.c:459)
==1934== by 0x4EB79B5: _IO_unbuffer_all (genops.c:921)
==1934== by 0x4EB79B5: _IO_cleanup (genops.c:966)
==1934== by 0x4FAA93B: __libc_freeres (in /lib/x86_64-linux-gnu/libc-2.24.so)
==1934== by 0x4A276EC: _vgnU_freeres (vg_preloaded.c:77)
==1934== by 0x1101: ???
==1934== by 0x3805234F: ??? (mc_malloc_wrappers.c:483)
==1934== by 0x51FA8BF: ??? (in /lib/x86_64-linux-gnu/libc-2.24.so)
==1934== If you believe this happened as a result of a stack
==1934== overflow in your program's main thread (unlikely but
==1934== possible), you can try to increase the size of the
==1934== main thread stack using the --main-stacksize= flag.
==1934== The main thread stack size used in this run was 8388608.
==1934==
==1934== HEAP SUMMARY:
==1934== in use at exit: 35,007 bytes in 149 blocks
==1934== total heap usage: 233 allocs, 84 frees, 46,936 bytes allocated
==1934==
==1934== 11 bytes in 1 blocks are still reachable in loss record 1 of 24
==1934== at 0x4C2CB3F: malloc (vg_replace_malloc.c:299)
==1934== by 0x4E6396B: new_composite_name (setlocale.c:167)
==1934== by 0x4E63F91: setlocale (setlocale.c:378)
==1934== by 0x108806: main (in /home/cat/projects/c/misc/fgetws/f)
==1934==
==1934== 32 bytes in 1 blocks are still reachable in loss record 2 of 24
==1934== at 0x4C2EB55: calloc (vg_replace_malloc.c:711)
==1934== by 0x4EF288B: __wcsmbs_load_conv (wcsmbsload.c:168)
==1934== by 0x4EF2B83: get_gconv_fcts (wcsmbsload.h:75)
==1934== by 0x4EF2B83: __wcsmbs_clone_conv (wcsmbsload.c:223)
==1934== by 0x4EAFC58: _IO_fwide (iofwide.c:124)
==1934== by 0x4EAB1A4: _IO_getwline_info (iogetwline.c:58)
==1934== by 0x4EAAC4A: fgetws (iofgetws.c:53)
==1934== by 0x10883D: main (in /home/cat/projects/c/misc/fgetws/f)
==1934==
==1934== 42 bytes in 1 blocks are still reachable in loss record 3 of 24
==1934== at 0x4C2CB3F: malloc (vg_replace_malloc.c:299)
==1934== by 0x4E6BAE0: _nl_make_l10nflist (l10nflist.c:166)
==1934== by 0x4E6BE94: _nl_make_l10nflist (l10nflist.c:295)
==1934== by 0x4E6BDC6: _nl_make_l10nflist (l10nflist.c:285)
==1934== by 0x4E6BDC6: _nl_make_l10nflist (l10nflist.c:285)
==1934== by 0x4E64A05: _nl_find_locale (findlocale.c:218)
==1934== by 0x4E63B7B: setlocale (setlocale.c:340)
==1934== by 0x108806: main (in /home/cat/projects/c/misc/fgetws/f)
==1934==
==1934== 50 bytes in 1 blocks are still reachable in loss record 4 of 24
==1934== at 0x4C2CB3F: malloc (vg_replace_malloc.c:299)
==1934== by 0x4E6BAE0: _nl_make_l10nflist (l10nflist.c:166)
==1934== by 0x4E6BE94: _nl_make_l10nflist (l10nflist.c:295)
==1934== by 0x4E64A05: _nl_find_locale (findlocale.c:218)
==1934== by 0x4E63B7B: setlocale (setlocale.c:340)
==1934== by 0x108806: main (in /home/cat/projects/c/misc/fgetws/f)
==1934==
==1934== 56 bytes in 1 blocks are still reachable in loss record 5 of 24
==1934== at 0x4C2CB3F: malloc (vg_replace_malloc.c:299)
==1934== by 0x4E6BC70: _nl_make_l10nflist (l10nflist.c:241)
==1934== by 0x4E6BE94: _nl_make_l10nflist (l10nflist.c:295)
==1934== by 0x4E6BDC6: _nl_make_l10nflist (l10nflist.c:285)
==1934== by 0x4E6BDC6: _nl_make_l10nflist (l10nflist.c:285)
==1934== by 0x4E64A05: _nl_find_locale (findlocale.c:218)
==1934== by 0x4E63B7B: setlocale (setlocale.c:340)
==1934== by 0x108806: main (in /home/cat/projects/c/misc/fgetws/f)
==1934==
==1934== 92 bytes in 2 blocks are still reachable in loss record 6 of 24
==1934== at 0x4C2CB3F: malloc (vg_replace_malloc.c:299)
==1934== by 0x4E6BAE0: _nl_make_l10nflist (l10nflist.c:166)
==1934== by 0x4E6BE94: _nl_make_l10nflist (l10nflist.c:295)
==1934== by 0x4E6BDC6: _nl_make_l10nflist (l10nflist.c:285)
==1934== by 0x4E64A05: _nl_find_locale (findlocale.c:218)
==1934== by 0x4E63B7B: setlocale (setlocale.c:340)
==1934== by 0x108806: main (in /home/cat/projects/c/misc/fgetws/f)
==1934==
==1934== 104 bytes in 1 blocks are still reachable in loss record 7 of 24
==1934== at 0x4C2CB3F: malloc (vg_replace_malloc.c:299)
==1934== by 0x4E6BC70: _nl_make_l10nflist (l10nflist.c:241)
==1934== by 0x4E6BE94: _nl_make_l10nflist (l10nflist.c:295)
==1934== by 0x4E64A05: _nl_find_locale (findlocale.c:218)
==1934== by 0x4E63B7B: setlocale (setlocale.c:340)
==1934== by 0x108806: main (in /home/cat/projects/c/misc/fgetws/f)
==1934==
==1934== 132 bytes in 12 blocks are still reachable in loss record 8 of 24
==1934== at 0x4C2CB3F: malloc (vg_replace_malloc.c:299)
==1934== by 0x4EC5C49: strndup (strndup.c:43)
==1934== by 0x4E64AB4: _nl_find_locale (findlocale.c:315)
==1934== by 0x4E63B7B: setlocale (setlocale.c:340)
==1934== by 0x108806: main (in /home/cat/projects/c/misc/fgetws/f)
==1934==
==1934== 132 bytes in 12 blocks are still reachable in loss record 9 of 24
==1934== at 0x4C2CB3F: malloc (vg_replace_malloc.c:299)
==1934== by 0x4EC5BF9: strdup (strdup.c:42)
==1934== by 0x4E63BCE: setlocale (setlocale.c:369)
==1934== by 0x108806: main (in /home/cat/projects/c/misc/fgetws/f)
==1934==
==1934== 144 bytes in 2 blocks are still reachable in loss record 10 of 24
==1934== at 0x4C2CB3F: malloc (vg_replace_malloc.c:299)
==1934== by 0x4E6BC70: _nl_make_l10nflist (l10nflist.c:241)
==1934== by 0x4E6BE94: _nl_make_l10nflist (l10nflist.c:295)
==1934== by 0x4E6BDC6: _nl_make_l10nflist (l10nflist.c:285)
==1934== by 0x4E64A05: _nl_find_locale (findlocale.c:218)
==1934== by 0x4E63B7B: setlocale (setlocale.c:340)
==1934== by 0x108806: main (in /home/cat/projects/c/misc/fgetws/f)
==1934==
==1934== 208 bytes in 1 blocks are still reachable in loss record 11 of 24
==1934== at 0x4C2CB3F: malloc (vg_replace_malloc.c:299)
==1934== by 0x4E631C9: __gconv_lookup_cache (gconv_cache.c:372)
==1934== by 0x4E5B34B: __gconv_find_transform (gconv_db.c:752)
==1934== by 0x4EF296A: __wcsmbs_getfct (wcsmbsload.c:91)
==1934== by 0x4EF296A: __wcsmbs_load_conv (wcsmbsload.c:186)
==1934== by 0x4EF2B83: get_gconv_fcts (wcsmbsload.h:75)
==1934== by 0x4EF2B83: __wcsmbs_clone_conv (wcsmbsload.c:223)
==1934== by 0x4EAFC58: _IO_fwide (iofwide.c:124)
==1934== by 0x4EAB1A4: _IO_getwline_info (iogetwline.c:58)
==1934== by 0x4EAAC4A: fgetws (iofgetws.c:53)
==1934== by 0x10883D: main (in /home/cat/projects/c/misc/fgetws/f)
==1934==
==1934== 208 bytes in 1 blocks are still reachable in loss record 12 of 24
==1934== at 0x4C2CB3F: malloc (vg_replace_malloc.c:299)
==1934== by 0x4E630EB: __gconv_lookup_cache (gconv_cache.c:372)
==1934== by 0x4E5B34B: __gconv_find_transform (gconv_db.c:752)
==1934== by 0x4EF2A0D: __wcsmbs_getfct (wcsmbsload.c:91)
==1934== by 0x4EF2A0D: __wcsmbs_load_conv (wcsmbsload.c:189)
==1934== by 0x4EF2B83: get_gconv_fcts (wcsmbsload.h:75)
==1934== by 0x4EF2B83: __wcsmbs_clone_conv (wcsmbsload.c:223)
==1934== by 0x4EAFC58: _IO_fwide (iofwide.c:124)
==1934== by 0x4EAB1A4: _IO_getwline_info (iogetwline.c:58)
==1934== by 0x4EAAC4A: fgetws (iofgetws.c:53)
==1934== by 0x10883D: main (in /home/cat/projects/c/misc/fgetws/f)
==1934==
==1934== 365 bytes in 12 blocks are still reachable in loss record 13 of 24
==1934== at 0x4C2CB3F: malloc (vg_replace_malloc.c:299)
==1934== by 0x4E6BAE0: _nl_make_l10nflist (l10nflist.c:166)
==1934== by 0x4E6BDC6: _nl_make_l10nflist (l10nflist.c:285)
==1934== by 0x4E6BDC6: _nl_make_l10nflist (l10nflist.c:285)
==1934== by 0x4E64A05: _nl_find_locale (findlocale.c:218)
==1934== by 0x4E63B7B: setlocale (setlocale.c:340)
==1934== by 0x108806: main (in /home/cat/projects/c/misc/fgetws/f)
==1934==
==1934== 461 bytes in 12 blocks are still reachable in loss record 14 of 24
==1934== at 0x4C2CB3F: malloc (vg_replace_malloc.c:299)
==1934== by 0x4E6BAE0: _nl_make_l10nflist (l10nflist.c:166)
==1934== by 0x4E64A05: _nl_find_locale (findlocale.c:218)
==1934== by 0x4E63B7B: setlocale (setlocale.c:340)
==1934== by 0x108806: main (in /home/cat/projects/c/misc/fgetws/f)
==1934==
==1934== 672 bytes in 12 blocks are still reachable in loss record 15 of 24
==1934== at 0x4C2CB3F: malloc (vg_replace_malloc.c:299)
==1934== by 0x4E6BC70: _nl_make_l10nflist (l10nflist.c:241)
==1934== by 0x4E6BDC6: _nl_make_l10nflist (l10nflist.c:285)
==1934== by 0x4E6BDC6: _nl_make_l10nflist (l10nflist.c:285)
==1934== by 0x4E64A05: _nl_find_locale (findlocale.c:218)
==1934== by 0x4E63B7B: setlocale (setlocale.c:340)
==1934== by 0x108806: main (in /home/cat/projects/c/misc/fgetws/f)
==1934==
==1934== 826 bytes in 24 blocks are still reachable in loss record 16 of 24
==1934== at 0x4C2CB3F: malloc (vg_replace_malloc.c:299)
==1934== by 0x4E6BAE0: _nl_make_l10nflist (l10nflist.c:166)
==1934== by 0x4E6BDC6: _nl_make_l10nflist (l10nflist.c:285)
==1934== by 0x4E64A05: _nl_find_locale (findlocale.c:218)
==1934== by 0x4E63B7B: setlocale (setlocale.c:340)
==1934== by 0x108806: main (in /home/cat/projects/c/misc/fgetws/f)
==1934==
==1934== 1,024 bytes in 1 blocks are still reachable in loss record 17 of 24
==1934== at 0x4C2CB3F: malloc (vg_replace_malloc.c:299)
==1934== by 0x4EA7381: _IO_file_doallocate (filedoalloc.c:101)
==1934== by 0x4EA890C: _IO_wfile_doallocate (wfiledoalloc.c:70)
==1934== by 0x4EAD159: _IO_wdoallocbuf (wgenops.c:390)
==1934== by 0x4EAF39C: _IO_wfile_overflow (wfileops.c:441)
==1934== by 0x4EACA12: __woverflow (wgenops.c:226)
==1934== by 0x4EACA12: _IO_wdefault_xsputn (wgenops.c:331)
==1934== by 0x4EAF7A0: _IO_wfile_xsputn (wfileops.c:1033)
==1934== by 0x4E925EB: vfwprintf (vfprintf.c:1320)
==1934== by 0x4EABA98: wprintf (wprintf.c:32)
==1934== by 0x10885D: main (in /home/cat/projects/c/misc/fgetws/f)
==1934==
==1934== 1,248 bytes in 12 blocks are still reachable in loss record 18 of 24
==1934== at 0x4C2CB3F: malloc (vg_replace_malloc.c:299)
==1934== by 0x4E6BC70: _nl_make_l10nflist (l10nflist.c:241)
==1934== by 0x4E64A05: _nl_find_locale (findlocale.c:218)
==1934== by 0x4E63B7B: setlocale (setlocale.c:340)
==1934== by 0x108806: main (in /home/cat/projects/c/misc/fgetws/f)
==1934==
==1934== 1,600 bytes in 1 blocks are still reachable in loss record 19 of 24
==1934== at 0x4C2CA6F: malloc (vg_replace_malloc.c:298)
==1934== by 0x4C2EDEF: realloc (vg_replace_malloc.c:785)
==1934== by 0x4E6B692: extend_alias_table (localealias.c:397)
==1934== by 0x4E6B692: read_alias_file (localealias.c:319)
==1934== by 0x4E6B8B0: _nl_expand_alias (localealias.c:203)
==1934== by 0x4E648D7: _nl_find_locale (findlocale.c:161)
==1934== by 0x4E63B7B: setlocale (setlocale.c:340)
==1934== by 0x108806: main (in /home/cat/projects/c/misc/fgetws/f)
==1934==
==1934== 1,728 bytes in 24 blocks are still reachable in loss record 20 of 24
==1934== at 0x4C2CB3F: malloc (vg_replace_malloc.c:299)
==1934== by 0x4E6BC70: _nl_make_l10nflist (l10nflist.c:241)
==1934== by 0x4E6BDC6: _nl_make_l10nflist (l10nflist.c:285)
==1934== by 0x4E64A05: _nl_find_locale (findlocale.c:218)
==1934== by 0x4E63B7B: setlocale (setlocale.c:340)
==1934== by 0x108806: main (in /home/cat/projects/c/misc/fgetws/f)
==1934==
==1934== 2,048 bytes in 1 blocks are still reachable in loss record 21 of 24
==1934== at 0x4C2ED5F: realloc (vg_replace_malloc.c:785)
==1934== by 0x4E6B61C: read_alias_file (localealias.c:331)
==1934== by 0x4E6B8B0: _nl_expand_alias (localealias.c:203)
==1934== by 0x4E648D7: _nl_find_locale (findlocale.c:161)
==1934== by 0x4E63B7B: setlocale (setlocale.c:340)
==1934== by 0x108806: main (in /home/cat/projects/c/misc/fgetws/f)
==1934==
==1934== 3,344 bytes in 12 blocks are still reachable in loss record 22 of 24
==1934== at 0x4C2CB3F: malloc (vg_replace_malloc.c:299)
==1934== by 0x4E64F09: _nl_intern_locale_data (loadlocale.c:95)
==1934== by 0x4E64F09: _nl_load_locale (loadlocale.c:266)
==1934== by 0x4E649B9: _nl_find_locale (findlocale.c:234)
==1934== by 0x4E63B7B: setlocale (setlocale.c:340)
==1934== by 0x108806: main (in /home/cat/projects/c/misc/fgetws/f)
==1934==
==1934== 4,096 bytes in 1 blocks are still reachable in loss record 23 of 24
==1934== at 0x4C2CB3F: malloc (vg_replace_malloc.c:299)
==1934== by 0x4EA7381: _IO_file_doallocate (filedoalloc.c:101)
==1934== by 0x4EA890C: _IO_wfile_doallocate (wfiledoalloc.c:70)
==1934== by 0x4EB6875: _IO_doallocbuf (genops.c:398)
==1934== by 0x4EAE493: _IO_wfile_underflow (wfileops.c:197)
==1934== by 0x4EAC431: _IO_wdefault_uflow (wgenops.c:213)
==1934== by 0x4EAB0E5: _IO_getwline_info (iogetwline.c:65)
==1934== by 0x4EAAC4A: fgetws (iofgetws.c:53)
==1934== by 0x10883D: main (in /home/cat/projects/c/misc/fgetws/f)
==1934==
==1934== 16,384 bytes in 1 blocks are still reachable in loss record 24 of 24
==1934== at 0x4C2CB3F: malloc (vg_replace_malloc.c:299)
==1934== by 0x4EA88D8: _IO_wfile_doallocate (wfiledoalloc.c:79)
==1934== by 0x4EB6875: _IO_doallocbuf (genops.c:398)
==1934== by 0x4EAE493: _IO_wfile_underflow (wfileops.c:197)
==1934== by 0x4EAC431: _IO_wdefault_uflow (wgenops.c:213)
==1934== by 0x4EAB0E5: _IO_getwline_info (iogetwline.c:65)
==1934== by 0x4EAAC4A: fgetws (iofgetws.c:53)
==1934== by 0x10883D: main (in /home/cat/projects/c/misc/fgetws/f)
==1934==
==1934== LEAK SUMMARY:
==1934== definitely lost: 0 bytes in 0 blocks
==1934== indirectly lost: 0 bytes in 0 blocks
==1934== possibly lost: 0 bytes in 0 blocks
==1934== still reachable: 35,007 bytes in 149 blocks
==1934== suppressed: 0 bytes in 0 blocks
==1934==
==1934== For counts of detected and suppressed errors, rerun with: -v
==1934== ERROR SUMMARY: 2 errors from 1 contexts (suppressed: 0 from 0)
My question boils down to: is this a bug in libc6 or libstdc++6? Or does fgetws after setting a UTF-8 locale exhibit some sort of undefined behaviour (according to the glibc docs or the C standard), or is my code somehow wrong?
Note that Valgrind's stack trace makes it look as though this may be a bug in Valgrind, but the program also segfaults when run without Valgrind, and when run with AddressSanitizer (libasan) instead.
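On the "is my code somehow wrong" point: the example program does not check its return values. setlocale returns NULL when the requested locale cannot be installed, fgetws returns NULL on error or end of file, and passing a null pointer to %ls is undefined. A more defensive variant, as a minimal sketch, would rule those out as contributing factors. The helper name read_wide and the REPRO_MAIN macro are made up for this illustration and are not part of the original program. Whether this variant changes the crash is not something the sketch establishes, since the stack trace points at stream cleanup during exit rather than at these calls.

```c
#include <assert.h>
#include <limits.h>
#include <locale.h>
#include <stdio.h>
#include <stdlib.h>
#include <wchar.h>

/* Hypothetical helper, not part of the original program: read up to
 * len - 1 wide characters from `in` into a newly allocated buffer.
 * Returns NULL if len is unusable, if allocation fails, or if fgetws
 * reports an error or end of file. The caller frees the result. */
wchar_t *read_wide(FILE *in, size_t len) {
    assert(in != NULL);
    if (len < 2 || len > INT_MAX) {
        return NULL;
    }
    /* The cast is only there for C++ compatibility, as in the original. */
    wchar_t *buf = (wchar_t *) malloc(sizeof *buf * len);
    if (buf == NULL) {
        return NULL;
    }
    if (fgetws(buf, (int) len, in) == NULL) {
        free(buf);
        return NULL;
    }
    return buf;
}

#ifdef REPRO_MAIN /* hypothetical macro: build with -DREPRO_MAIN */
int main(int argc, char **argv) {
    if (argc < 2) {
        fprintf(stderr, "usage: %s LOCALE\n", argv[0]);
        return EXIT_FAILURE;
    }
    /* setlocale returns NULL if the locale cannot be installed. */
    if (setlocale(LC_ALL, argv[1]) == NULL) {
        fprintf(stderr, "setlocale failed for %s\n", argv[1]);
        return EXIT_FAILURE;
    }
    wchar_t *buf = read_wide(stdin, 3);
    if (buf == NULL) {
        fprintf(stderr, "nothing could be read from stdin\n");
        return EXIT_FAILURE;
    }
    /* Only wide output goes to stdout, so its orientation stays wide. */
    wprintf(L"[%ls]\n", buf);
    free(buf);
    return EXIT_SUCCESS;
}
#endif
```

Built with -DREPRO_MAIN, this can be run exactly like the original (for example with C and then en_US.utf8 as the argument). If the segfault still occurs with every call checked, that makes a mistake in the calling code a less likely explanation.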