Normally, I compare the value returned by sscanf with the number of fields in the format string to detect a scanning failure (full or partial):

   /* %[^:] stops at the colon; a plain %s would swallow it and the match would fail */
   if (sscanf(line, "%[^:]: %d", name, &value) != 2)
       errx(EX_USAGE, "cannot parse line %s", line);

However, what is one to do when the format string itself is not known until run time (say, it is read from a config file)?

In the example below, the format can specify up to six integers. Specifying fewer is fine, as long as every field the format does specify is actually filled in:

    int fields[6], fieldCount;

    fieldCount = sscanf(line, format, fields, fields + 1, fields + 2,
        fields + 3, fields + 4, fields + 5);

How can one detect that fieldCount does not match the format, meaning that the input line is invalid in some way?

I suppose I could count the %-characters in the format (taking care to skip %% sequences), but that seems quite tedious. Is there another way?
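For reference, here is a minimal sketch of that counting approach. The names `count_assignments` and `parse_fields` are hypothetical helpers, not library functions. It assumes the format contains only standard conversions and that every counted conversion stores an `int` (such as `%d`). Besides `%%`, it also has to skip suppressed conversions such as `%*d`, `%n` (which stores a value but is not included in the return value), and any `%` that appears inside a `%[...]` scanset.

```c
#include <ctype.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: number of conversions in a scanf-style format
 * that are counted in sscanf's return value. */
static int count_assignments(const char *fmt)
{
    int count = 0;

    for (const char *p = fmt; *p != '\0'; p++) {
        int suppressed = 0;

        if (*p != '%')
            continue;
        p++;                               /* character after '%' */
        if (*p == '%')
            continue;                      /* "%%" matches a literal '%' */
        if (*p == '*') {                   /* "%*d": matched but not stored */
            suppressed = 1;
            p++;
        }
        while (isdigit((unsigned char)*p)) /* optional field width */
            p++;
        while (*p != '\0' && strchr("hljztL", *p) != NULL)
            p++;                           /* optional length modifier */
        if (*p == '\0')
            break;                         /* truncated specification */
        if (*p == '[') {                   /* scanset: skip to closing ']' */
            p++;
            if (*p == '^')
                p++;
            if (*p == ']')                 /* a leading ']' is a literal */
                p++;
            while (*p != '\0' && *p != ']')
                p++;
            if (*p == '\0')
                break;                     /* unterminated scanset */
        } else if (*p == 'n') {
            continue;                      /* %n is not counted */
        }
        if (!suppressed)
            count++;
    }
    return count;
}

/* Hypothetical wrapper: returns the number of fields filled in,
 * or -1 if the line did not satisfy the whole format.
 * Assumes every counted conversion in the format stores an int. */
static int parse_fields(const char *line, const char *format, int fields[6])
{
    int expected = count_assignments(format);

    if (expected > 6)
        return -1;                         /* more conversions than pointers */
    int got = sscanf(line, format, fields, fields + 1, fields + 2,
                     fields + 3, fields + 4, fields + 5);
    return got == expected ? got : -1;
}
```

With this, `parse_fields("1,2", "%d,%d", fields)` yields 2, while `parse_fields("1,x", "%d,%d", fields)` yields -1. The sketch only counts conversions. It does not check that each conversion's type matches the `int *` arguments, so a format containing `%s` or `%lf` would still be unsafe.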

Mikhail T.
  • Add another variable containing the number of format fields, which the provider of the format string must provide as well. – Barmar Dec 07 '21 at 21:59
  • Where do the strings actually come from? Wherever that is, have the number of expected matches also be supplied. – William Pursell Dec 07 '21 at 22:00
  • But that's quite redundant -- and prone to mismatches... Is there nothing better? – Mikhail T. Dec 07 '21 at 22:00
  • You should not be passing `scanf` a string you do not know the contents of. If it is being computed by your program or is coming from a **trusted** file, that can be made safe with good design, and you can include the number of assignments to expect with it. If it is coming from an untrusted source, you should not pass it to `scanf` without checking it up, down, sideways, backwards, and in alternate dimensions, during which you will discover the number of assignments anyway. Untrusted format strings can be exploited to misuse software. – Eric Postpischil Dec 07 '21 at 22:01
  • You need an actual configuration parser rather than some `scanf` contraption. – Cheatah Dec 07 '21 at 22:02
  • The `format` is coming from the application's configuration file, I suppose, that's trusted enough -- but let's not get sidetracked by the security argument... – Mikhail T. Dec 07 '21 at 22:03
  • @MikhailT. Let's get sidetracked by the security argument. A config file is *not trusted* by any means unless it is authenticated by some cryptographic protocol. Never ever pass a format string that can be tampered with by a potential attacker. Other than that, there is no way other than supplying the number or having a side-parser which can interpret the string and derive the number of specifiers. – Eugene Sh. Dec 07 '21 at 22:06
  • Code could process the format string to determine the proper `*scanf` return value (and do type checks) - not an easy thing - yet doable with lots of code. Example [How to check that two format strings are compatible?](https://stackoverflow.com/q/28947528/2410359) – chux - Reinstate Monica Dec 07 '21 at 22:31
  • This is simply not a good application for `scanf`. If the lines being read have a regular syntax (with all N fields being whitespace-separated, say, or separated by delimiters such as `,` or `|`), you can easily read lines using `fgets`, and split each line into fields using `strtok` or the like. But `scanf`? Please no. It's already fatally crippled by numerous other shortcomings, and can only be used properly if the calling code knows exactly what the format string is. Calling `scanf` (or any of its variants) with a run-time format string is, frankly, madness. – Steve Summit Dec 07 '21 at 22:54
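
The `strtok`-based alternative suggested in the last comment could look roughly like the sketch below. The function name `split_ints` and its signature are made up for illustration. It assumes each line holds only integer fields separated by a known set of delimiter characters, and that the caller has already read the line into a writable buffer (for example with `fgets`).

```c
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: splits `line` in place on any character in `delims`
 * and converts each token to a long. Returns the number of fields stored,
 * or -1 if a token is not a complete integer or there are more than `max`
 * fields. Note that strtok modifies `line` and treats a run of adjacent
 * delimiters as one, so empty fields are silently skipped. */
static int split_ints(char *line, const char *delims, long *out, int max)
{
    int n = 0;

    for (char *tok = strtok(line, delims); tok != NULL;
         tok = strtok(NULL, delims)) {
        char *end;

        if (n == max)
            return -1;                 /* too many fields */
        errno = 0;
        long v = strtol(tok, &end, 10);
        if (end == tok || *end != '\0' || errno == ERANGE)
            return -1;                 /* not a number, trailing junk, or overflow */
        out[n++] = v;
    }
    return n;
}
```

Here the only run-time configuration is the delimiter set, for example `"|\n"` for pipe-separated lines read by `fgets`, so no format string ever reaches `scanf`.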
