I'm working on a project which uses a structure to represent various different flavours of event. The structure has some common fields shared by all events (e.g. event type); and then has some event-specific fields which are represented by a union; something like this:
struct nested_event {
int type;
union {
struct {
char weeble;
char string[0];
} event1;
} d;
};
(This is of course a simplified example: in real life, we have other structures in the union d
which represent different events).
In use this structure is dynamically allocated with sufficient space to hold sizeof(struct nested_event)
plus a variable-length string which is copied into .d.event1.string
.
I've recently discovered that this construct causes runtime crashes when compiling at -O1 or above due to StackGuard checking when copying data into the string member of the array using strcpy.
Here's a full test program illustrating the crash. Use of struct nested_event
triggers the crash, while use of struct flat_event
avoids it:
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
#include <ctype.h>
struct nested_event {
int type;
union {
struct {
char weeble;
char string[0];
} event1;
} d;
};
struct flat_event {
int type;
char weeble;
char string[0];
};
int main(int argc, char **argv)
{
if (argc < 2) {
printf("Usage: %s <string>\n", argv[0]);
printf("An upper case string will cause a crash, lower case will not\n");
return 1;
} else {
if (isupper(argv[1][0])) {
struct nested_event *s = NULL;
size_t len = strlen(argv[1]);
s = calloc(1, len + 1 + sizeof(struct nested_event));
if (s) {
strcpy(s->d.event1.string, argv[1]);
printf("%s: %p %s\n", __func__, s, s->d.event1.string);
free(s);
} else {
return 1;
}
} else {
struct flat_event *s = NULL;
size_t len = strlen(argv[1]);
s = calloc(1, len + 1 + sizeof(struct flat_event));
if (s) {
strcpy(s->string, argv[1]);
printf("%s: %p %s\n", __func__, s, s->string);
free(s);
} else {
return 1;
}
}
}
return 0;
}
Compiled using gcc 4.8.4 on Ubuntu 14.04 as follows:
gcc -Werror -Wall -O1 test.c
The test program then crashes when passed a string with an upper-case first letter:
$ ./a.out Foobar
*** buffer overflow detected ***: ./a.out terminated
======= Backtrace: =========
/lib/x86_64-linux-gnu/libc.so.6(+0x7329f)[0x7f3d2c55d29f]
/lib/x86_64-linux-gnu/libc.so.6(__fortify_fail+0x5c)[0x7f3d2c5f883c]
/lib/x86_64-linux-gnu/libc.so.6(+0x10d710)[0x7f3d2c5f7710]
./a.out[0x4007bd]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf5)[0x7f3d2c50bf45]
./a.out[0x400669]
======= Memory map: ========
00400000-00401000 r-xp 00000000 08:11 55318006 /home/tom/src/prol2tp/l2tp2/a.out
00600000-00601000 r--p 00000000 08:11 55318006 /home/tom/src/prol2tp/l2tp2/a.out
00601000-00602000 rw-p 00001000 08:11 55318006 /home/tom/src/prol2tp/l2tp2/a.out
022a5000-022c6000 rw-p 00000000 00:00 0 [heap]
7f3d2c2d4000-7f3d2c2ea000 r-xp 00000000 08:02 2364699 /lib/x86_64-linux-gnu/libgcc_s.so.1
7f3d2c2ea000-7f3d2c4e9000 ---p 00016000 08:02 2364699 /lib/x86_64-linux-gnu/libgcc_s.so.1
7f3d2c4e9000-7f3d2c4ea000 rw-p 00015000 08:02 2364699 /lib/x86_64-linux-gnu/libgcc_s.so.1
7f3d2c4ea000-7f3d2c6a8000 r-xp 00000000 08:02 2363138 /lib/x86_64-linux-gnu/libc-2.19.so
7f3d2c6a8000-7f3d2c8a8000 ---p 001be000 08:02 2363138 /lib/x86_64-linux-gnu/libc-2.19.so
7f3d2c8a8000-7f3d2c8ac000 r--p 001be000 08:02 2363138 /lib/x86_64-linux-gnu/libc-2.19.so
7f3d2c8ac000-7f3d2c8ae000 rw-p 001c2000 08:02 2363138 /lib/x86_64-linux-gnu/libc-2.19.so
7f3d2c8ae000-7f3d2c8b3000 rw-p 00000000 00:00 0
7f3d2c8b3000-7f3d2c8d6000 r-xp 00000000 08:02 2363612 /lib/x86_64-linux-gnu/ld-2.19.so
7f3d2caa4000-7f3d2caa7000 rw-p 00000000 00:00 0
7f3d2cad2000-7f3d2cad5000 rw-p 00000000 00:00 0
7f3d2cad5000-7f3d2cad6000 r--p 00022000 08:02 2363612 /lib/x86_64-linux-gnu/ld-2.19.so
7f3d2cad6000-7f3d2cad7000 rw-p 00023000 08:02 2363612 /lib/x86_64-linux-gnu/ld-2.19.so
7f3d2cad7000-7f3d2cad8000 rw-p 00000000 00:00 0
7ffff7e5a000-7ffff7e7b000 rw-p 00000000 00:00 0 [stack]
7ffff7ee4000-7ffff7ee6000 r-xp 00000000 00:00 0 [vdso]
ffffffffff600000-ffffffffff601000 r-xp 00000000 00:00 0 [vsyscall]
Aborted (core dumped)
Passing a string with a lower-case first letter triggers use of the flat event structure, and that doesn't trigger a crash:
$ ./a.out foobar
main: 0x1dc8010 foobar
This crash is similar to the one described by this question, although not exactly the same since the code in the linked question doesn't have a zero-length array in the structure.
The use of a zero-length array as the last member of a structure is not atypical in C code (although I understand the usage has changed through the versions of the C standard), and is supported by GCC as an extension, in particular:
A structure containing a flexible array member, or a union containing such a structure (possibly recursively), may not be a member of a structure or an element of an array. (However, these uses are permitted by GCC as extensions.)
So on the face of it, this use is reasonable. But clearly something about it is off from StackGuard's perspective!
If I build without optimisation, I don't hit this crash (and valgrind doesn't complain about out-of-bounds access). If I build with clang I can use various different optimisation levels with no problems.
I can of course work around this problem by tweaking compiler options, or by refactoring the code using the nested event structure. However, I'd like to better understand the root cause of the issue before doing so. Our project is Linux-specific, and will usually be built using gcc, although I'd prefer not to be tied to a specific toolchain.
Is the definition of struct nested_event
valid C? By adding the nested structure, are we accessing beyond the end of the object when writing to .d.event1.string and thereby causing undefined behaviour? Or alternatively, is this crash just a GCC oddity which we can work around and ignore?