3

I am writing a parser (for NMEA sentences) which splits a string on commas using strsep. When compiled with clang (Apple LLVM version 10.0.1), the code segfaults when splitting a string which has an even number of tokens. When compiled with clang (version 7.0.1) or gcc (9.1.1) on Linux the code works correctly.

A stripped down version of the code which exhibits the issue is as follows:

#include <stdio.h>
#include <stdint.h>
#include <string.h>

static void gnss_parse_gsa (uint8_t argc, char **argv)
{

}

/**
 *  Desciptor for a NMEA sentence parser
 */
struct gps_parser_t {
    void (*parse)(uint8_t, char**);
    const char *type;
};

/**
 *  List of avaliable NMEA sentence parsers
 */
static const struct gps_parser_t nmea_parsers[] = {
    {.parse = gnss_parse_gsa, .type = "GPGSA"}
};

static void gnss_line_callback (char *line)
{
    /* Count the number of comma seperated tokens in the line */
    uint8_t num_args = 1;
    for (uint16_t i = 0; i < strlen(line); i++) {
        num_args += (line[i] == ',');
    }

    /* Tokenize the sentence */
    char *args[num_args];
    for (uint16_t i = 0; (args[i] = strsep(&line, ",")) != NULL; i++);

    /* Run parser for received sentence */
    uint8_t num_parsers = sizeof(nmea_parsers)/sizeof(nmea_parsers[0]);
    for (int i = 0; i < num_parsers; i++) {
        if (!strcasecmp(args[0] + 1, nmea_parsers[i].type)) {
            nmea_parsers[i].parse(num_args, args);
            break;
        }
    }
}

int main (int argc, char **argv)
{
    char pgsa_str[] = "$GPGSA,A,3,02,12,17,03,19,23,06,,,,,,1.41,1.13,0.85*03";
    gnss_line_callback(pgsa_str);
}

The segfault occurs at on the line if (!strcasecmp(args[0] + 1, nmea_parsers[i].type)) {, the index operation on args attempts to deference a null pointer.

Increasing the size of the stack, either by manually editing the assembly or adding a call to printf("") anywhere in the function makes it no longer segfault, as does making the args array bigger (eg. adding one to num_args).

In summary, any of the following items prevent the segfault:
- Using a compiler other than clang 10
- Modifying the assembly to make the stack size before dynamic allocation 80 bytes or more (compiles to 64)
- Using an input string with an odd number of tokens
- Allocating args as a fixed length array with the correct number of tokens (or more)
- Allocating args as a variable length array with at least num_args + 1 elements
Note that when compiled with clang 7 on Linux the stack size before dynamic allocation is still 64 bytes, but the code does not segfault.

I'm hoping that someone might be able to explain why this happens, and if there is any way I can get this code to compile correctly with clang 10.

  • Are you sure you don't want to do char `*args[num_args+1];`? Precision is good, but I find it often saves time to give myself a bit of slop in case I've got an off-by-one error somewhere. And in fact I'm pretty sure you *do* have an off-by-one error: after making N trips (that is, `num_args` trips) through the `strsep` loop, you start making an N+1st loop, the one where you discover that `strsep` returned NULL. But you've stored that NULL pointer in `args[num_args]`, thus overflowing it. – Steve Summit Jul 12 '19 at 04:42
  • 1
    On a side note: Thanks very much for presenting a stripped-down version of your code! Many new contributors overlook this step. – Steve Summit Jul 12 '19 at 11:07

1 Answers1

4

When all sorts of barely-relevant factors like the specific version of the compiler seem to make a difference, it's a pretty sure sign you've got undefined behavior somewhere.

You correctly count the commas to predetermine the exact number of fields, num_args. You allocate an array just barely big enough to hold those fields:

char *args[num_args];

But then you run this loop:

for (uint16_t i = 0; (args[i] = strsep(&line, ",")) != NULL; i++);

There are going to be num_args number of trips through this loop where strsep returns non-NULL pointers that get filled in to args[0] through args[num_args-1], which is what you intended, and which is fine. But then there's one more call to strsep, the one that returns NULL and terminates the loop -- but that null pointer also gets stored into the args array also, specifically into args[num_args], which is one cell off the end. Array overflow, in other words.

There are two ways to fix this. You can use an additional variable so you can capture and test strsep's return value before storing it into the args array:

char *p;
for (uint16_t i = 0; (p = strsep(&line, ",")) != NULL; i++)
    args[i] = p;

This also has the side benefit that you have a more conventional loop, with an actual body.

Or, you can declare the args array one bigger than it strictly has to be, meaning that it's got room for that last, NULL pointer stored in args[num_args]:

char *args[num_args+1];

This has the side benefit that you always pass a "NULL terminated array" to the parsing functions, which can be handy for them (and which ends up matching, as it happens, the way main gets called).

Steve Summit
  • 45,437
  • 7
  • 70
  • 103
  • Thank you! I've use this code in several projects, not sure how it was never an issue until now, but it seems that writing beyond the end of the array was the problem. I still don't see how it managed to fail it quite the way it did though. Do you have any idea how it might have caused `args` to become a NULL pointer? – Samuel Dewan Jul 13 '19 at 00:29
  • @SamuelDewan Yes, it's amazing how long you can sometimes, miraculously get away with *totally* broken code -- this has happened to me as well. But, no, I have no idea why you might have seen the particular failure mode you did -- as compilers have gotten more sophisticated, their choices (especially in the face of undefined behavior, where they're allowed to do anything) can be incredibly inscrutable. – Steve Summit Jul 13 '19 at 21:55