Project is in C. I need to parse strings that are always formatted the following way: integer, whitespace, plus sign, multi-word string, plus sign, whitespace, integer, whitespace, integer, end-of-line.

Example: 10 +This is 1 string+ 2 -1

I'm having a hard time figuring out what format string to give sscanf so that the string surrounded by the '+' signs gets parsed correctly, without including the + signs (assuming sscanf can be used for this case at all).

I tried "%d +%s+ %d %d" and that didn't work.

1 Answer

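You use %s, but that stops at the first white space character, so with your example it reads only This, and the literal + that follows in the format then fails to match. You want to read a string of not-plus-signs, so say that's what sscanf() should do: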

"%d +%[^+]+ %d %d"

That's a scan set — see POSIX sscanf(). You should also protect yourself from buffer overflow. If you have:

char buffer[256];

use:

"%d +%255[^+]+ %d %d"
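Put together, a minimal sketch of the call might look like this. The function name parse_record and its parameter names are made up for illustration, and the caller is assumed to supply a text buffer of at least 256 bytes:

```c
#include <stdio.h>

/* Hypothetical helper: parses "int +text+ int int" from line.
 * Assumes text points to at least 256 bytes.
 * Returns the sscanf() result; 4 means every field was converted. */
int parse_record(const char *line, int *a, char *text, int *b, int *c)
{
    /* %255[^+] stores up to 255 non-'+' characters plus a terminating null. */
    return sscanf(line, "%d +%255[^+]+ %d %d", a, text, b, c);
}
```

Any return value other than 4 means the line did not match the expected layout, so the outputs should not be trusted in that case.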

Note the off-by-one in the lengths: the width counts the characters stored and excludes the terminating null byte, so a 256-byte buffer takes a width of 255. This is a design feature of the scanf() family of functions. You could skip leading spaces by putting a space after the first + in the format string. It is not possible to skip trailing spaces before the second + in the data; you'll have to remove those separately.
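One way to handle both kinds of padding is sketched below: a space after the first + in the format deals with leading spaces, and a separate pass removes trailing ones. The names rtrim and parse_trimmed are hypothetical, and the text buffer is again assumed to hold at least 256 bytes:

```c
#include <ctype.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: strips trailing white space from s in place. */
static void rtrim(char *s)
{
    size_t len = strlen(s);
    while (len > 0 && isspace((unsigned char)s[len - 1]))
        s[--len] = '\0';
}

/* Hypothetical helper: like the plain parse, but tolerant of padding
 * inside the '+' delimiters. Assumes text points to at least 256 bytes.
 * Returns the sscanf() result; 4 means every field was converted. */
int parse_trimmed(const char *line, int *a, char *text, int *b, int *c)
{
    /* The space after the first '+' skips leading white space in the data. */
    int n = sscanf(line, "%d + %255[^+]+ %d %d", a, text, b, c);

    /* The scan set cannot drop trailing spaces, so trim them afterwards,
     * but only if the string was actually converted. */
    if (n >= 2)
        rtrim(text);
    return n;
}
```

Be aware that with the extra space in the format, a field consisting only of spaces no longer matches the scan set at all, so the call returns 1 instead of storing an empty string.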

You ask for 'end of line' after the 3rd number. That's fairly hard. You might use:

"%d +%255[^+]+ %d %d %n"

passing an extra pointer to int argument to hold the number of characters consumed up to that point, which is the offset of the first character not yet parsed. The blank before the %n skips white space, including newlines, so if you read into int nbytes; (passing &nbytes), then you'd check if (buffer[nbytes] != '\0') { …handle trailing garbage… }, where buffer here is the string being scanned rather than the array receiving the text (but only after checking that you had four successful conversion specifications; %n conversion specifications are not counted in the return value from sscanf() et al). There are other solutions to that; they're all grubby to some extent.
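A sketch of that check, assuming you want to reject trailing material outright (ignoring it would be just as defensible). The name parse_strict is hypothetical, and text is assumed to point to at least 256 bytes:

```c
#include <stdio.h>

/* Hypothetical helper: returns 1 if line matches the whole layout with
 * nothing but white space after the third integer, 0 otherwise. */
int parse_strict(const char *line, int *a, char *text, int *b, int *c)
{
    int nbytes = 0;
    int n = sscanf(line, "%d +%255[^+]+ %d %d %n", a, text, b, c, &nbytes);

    /* %n is not counted, so four conversions is still the success value.
     * nbytes is only meaningful once all four have succeeded. */
    if (n != 4)
        return 0;

    /* Anything left at the offset stored by %n is trailing garbage. */
    return line[nbytes] == '\0';
}
```

With this, "10 +This is 1 string+ 2 -1\n" is accepted, because the blank before %n consumes the newline, while "10 +This is 1 string+ 2 -1 junk" is rejected.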

Jonathan Leffler
  • The way I read it the relevance of end-of-line is just what comes after the last integer, but I could be reading it wrong. Addressing it regardless adds additional learning for all. – David C. Rankin Jan 11 '20 at 05:34
  • It isn't clear what should happen if there is extra material after the third integer. Ignore would be a possibility; gripe is another. It's a reminder that input is often not as clean as you'd like or expect. – Jonathan Leffler Jan 11 '20 at 05:42
  • Ye old *"Hobson's Choice"* of answering on SO `:)` – David C. Rankin Jan 11 '20 at 06:07