I am writing a parser (for NMEA sentences) which splits a string on commas using strsep. When compiled with clang (Apple LLVM version 10.0.1), the code segfaults when splitting a string which has an even number of tokens. When compiled with clang (version 7.0.1) or gcc (9.1.1) on Linux the code works correctly.
A stripped-down version of the code that exhibits the issue is as follows:
#include <stdio.h>
#include <stdint.h>
#include <string.h>

static void gnss_parse_gsa (uint8_t argc, char **argv)
{
}

/**
 *  Descriptor for a NMEA sentence parser
 */
struct gps_parser_t {
    void (*parse)(uint8_t, char**);
    const char *type;
};

/**
 *  List of available NMEA sentence parsers
 */
static const struct gps_parser_t nmea_parsers[] = {
    {.parse = gnss_parse_gsa, .type = "GPGSA"}
};

static void gnss_line_callback (char *line)
{
    /* Count the number of comma separated tokens in the line */
    uint8_t num_args = 1;
    for (uint16_t i = 0; i < strlen(line); i++) {
        num_args += (line[i] == ',');
    }

    /* Tokenize the sentence */
    char *args[num_args];
    for (uint16_t i = 0; (args[i] = strsep(&line, ",")) != NULL; i++);

    /* Run parser for received sentence */
    uint8_t num_parsers = sizeof(nmea_parsers)/sizeof(nmea_parsers[0]);
    for (int i = 0; i < num_parsers; i++) {
        if (!strcasecmp(args[0] + 1, nmea_parsers[i].type)) {
            nmea_parsers[i].parse(num_args, args);
            break;
        }
    }
}

int main (int argc, char **argv)
{
    char pgsa_str[] = "$GPGSA,A,3,02,12,17,03,19,23,06,,,,,,1.41,1.13,0.85*03";
    gnss_line_callback(pgsa_str);
}
The segfault occurs on the line if (!strcasecmp(args[0] + 1, nmea_parsers[i].type)) {, where the index operation on args attempts to dereference a null pointer.
Increasing the size of the stack frame, either by manually editing the assembly or by adding a call to printf("") anywhere in the function, makes the segfault go away, as does making the args array bigger (e.g. adding one to num_args).
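Since making args one element larger hides the crash, it may be worth checking how many times the tokenizing loop actually assigns to args[i]. The sketch below uses two hypothetical helpers (count_tokens and count_strsep_stores are not part of the original code) that mirror the two loops in gnss_line_callback. Because the assignment happens before the comparison against NULL, the strsep loop performs one more store than there are tokens, and that last store would land one element past the end of an array sized to num_args, which is undefined behaviour in C. Whether this fully accounts for the clang 10 result is not established here.

```c
#define _DEFAULT_SOURCE /* expose strsep() on glibc under strict -std= modes */
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: count comma-separated tokens the same way the
 * first loop in gnss_line_callback does (number of commas + 1). */
static size_t count_tokens(const char *line)
{
    size_t n = 1;
    for (; *line != '\0'; line++) {
        n += (*line == ',');
    }
    return n;
}

/* Hypothetical helper: same loop shape as the strsep() loop in the
 * question, but storing into a scratch pointer instead of args[i] and
 * counting the stores. Modifies `line` in place, as strsep() does. */
static size_t count_strsep_stores(char *line)
{
    size_t stores = 0;
    char *slot;
    do {
        slot = strsep(&line, ",");
        stores++; /* one store per iteration, including the final NULL */
    } while (slot != NULL);
    return stores;
}
```

For any input, count_strsep_stores returns count_tokens plus one, provided count_tokens is called first (strsep overwrites the commas with NUL bytes).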
In summary, any of the following items prevent the segfault:
- Using a compiler other than clang 10
- Modifying the assembly so that the stack frame before dynamic allocation is 80 bytes or more (it compiles to 64 bytes)
- Using an input string with an odd number of tokens
- Allocating args as a fixed length array with the correct number of tokens (or more)
- Allocating args as a variable length array with at least num_args + 1 elements
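As an alternative to the array-size workarounds in the list above, the tokenizing loop could be bounded so that it never writes beyond the slots it was given. This is a minimal sketch, not a confirmed fix for the clang 10 behaviour: split_commas is a hypothetical helper, and it assumes the caller sizes args from the comma count as the original code does.

```c
#define _DEFAULT_SOURCE /* expose strsep() on glibc under strict -std= modes */
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: split `line` on commas into at most `max_args`
 * pointers. The bound is checked before each strsep() call, so the
 * terminating NULL is never stored and nothing is written past
 * args[max_args - 1]. Returns the number of tokens stored. */
static size_t split_commas(char *line, char **args, size_t max_args)
{
    size_t n = 0;
    char *tok;
    while (n < max_args && (tok = strsep(&line, ",")) != NULL) {
        args[n++] = tok;
    }
    return n;
}
```

In gnss_line_callback this would replace the for loop with a call such as split_commas(line, args, num_args), keeping char *args[num_args] unchanged. Empty fields are preserved as empty strings, matching what strsep already does in the original loop.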
Note that when compiled with clang 7 on Linux the stack size before dynamic allocation is still 64 bytes, but the code does not segfault.
I'm hoping that someone might be able to explain why this happens, and if there is any way I can get this code to compile correctly with clang 10.