1

Is it possible to replace the number 9 with a macro to make this line of code more maintainable?

scanf("%9[^\n]s", str);

I tried to read the documentation, but I can't find the exact names of these operations:

  1. "[^\n]s"

  2. "%ns"

I tried these alternatives, but CLion marks the first occurrence of str as an error in both lines:

scanf("%" str(MAX_LENGTH) "%[^\n]s", str);

scanf("%" str(MAX_LENGTH) "[^\n]%*c", str);
Jonathan Leffler
JNeruda
  • Given that you're reading the entire line, you should just use `fgets`. – user3386109 Aug 17 '23 at 21:58
  • You shouldn't have `s` after `%[^\n]`. It's not a modification of `%s` as many beginners seem to assume. – Barmar Aug 17 '23 at 22:21
  • You shouldn't have the length between two `%`. It should be `%9[...]` not `%9%[...]` – Barmar Aug 17 '23 at 22:21
  • The `%[…]` conversion specification is a "scan set" — see [§7.21.6.2 The `fscanf` function](http://port70.net/~nsz/c/c11/n1570.html#7.21.6.2) in the C standard. The `s` after it is a (common beginner's) mistake. A conversion such as `%9s` is not given a special name; it skips white space and reads a non-empty sequence of non-white-space characters. – Jonathan Leffler Aug 17 '23 at 22:25
  • My guess is that it is not so much a beginner's mistake to use `%[]s` as a mistake by an *author* of a poor quality reference, and by the nature of today's media is very hard to eradicate. Beginners can only follow material that is accessible by them. – Weather Vane Aug 17 '23 at 22:43
  • @WeatherVane — it's hard to tell where the mistake originates. You can only spot that it is a mistake reliably if you have a format string with multiple conversion specifications and you test the result. For example, `"%10[^-:]s%*[-:]s$%d"` will only match if the string has exactly 10 characters without a dash or colon, with an `s` next, and then a colon, an `s`, and a number — e.g. `abcdefghijs:s10` would work as an input, but most would fail. At this point, using `scanf()` is an exercise in perversity — it would be better to read a line and parse that (with something other than `sscanf()`). – Jonathan Leffler Aug 17 '23 at 23:00
  • The correct answer is likely to forget about this and use `fgets` instead. – Lundin Aug 18 '23 at 08:48

2 Answers

4

You get an error because str is neither a built-in function nor a predefined macro. You can define str as a macro and use the stringizing operator # to perform the substitution, but it is tricky and confusing: str must be defined as a macro that invokes another macro xstr, which in turn stringizes its argument with #x:

#define xstr(x)  #x
#define str(x)  xstr(x)

Note however that both of your examples have problems:

  • scanf("%" str(MAX_LENGTH) "%[^\n]s", str); has an extra s at the end of the format, which is useless and indicates a confusion between the %[...] conversion and %s, both of which require a character count prefix to prevent buffer overflow. The second % is also incorrect. Furthermore, you should not use the same identifier str for the macro and the destination array: while not an error, it makes the code unnecessarily confusing. The code should be written:

      char buf[MAX_LENGTH + 1];
      scanf("%" str(MAX_LENGTH) "[^\n]", buf);
    
  • scanf("%" str(MAX_LENGTH) "[^\n]%*c", str); has the correct form but will unconditionally consume the next byte after the match, which is not the newline character if the line has more than MAX_LENGTH bytes before the newline. No indication of this is returned to the caller.

  • %9[^\n] will fail on empty input lines because no characters match the conversion specification. scanf() will return 0 and leave the destination array in an undetermined state.
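The three pitfalls listed above can be observed without typing any input by running the same formats through sscanf(), which follows the same conversion rules as scanf(). This is only a minimal sketch: the helper names, buffer sizes and the %3[a-z] scan set are assumptions made for illustration and do not come from the question.

```c
#include <stdio.h>

/* Hypothetical helper: the 's' after the scan set is an ordinary character
   that must be present in the input, it is not part of the conversion. */
int scan_with_stray_s(const char *input, char word[4], int *n) {
    return sscanf(input, "%3[a-z]s %d", word, n);
}

/* Hypothetical helper: read at most 9 characters of a line.
   The conversion fails (return value 0) if the line is empty. */
int scan_line(const char *input, char buf[10]) {
    return sscanf(input, "%9[^\n]", buf);
}

/* Hypothetical helper: %*c consumes whatever byte follows the match,
   so the tenth character of a long line is silently lost. */
int scan_line_then_rest(const char *input, char buf[10], char rest[10]) {
    return sscanf(input, "%9[^\n]%*c%9[^\n]", buf, rest);
}
```

With these helpers, scan_with_stray_s("abc 42", ...) returns 1 because the literal s does not match the space, whereas "abcs 42" returns 2. scan_line("\nnext", buf) returns 0 and stores nothing. scan_line_then_rest("0123456789AB\n", ...) stores "012345678" and "AB": the character 9 was consumed by %*c without any indication.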

Here is a short example:

#include <stdio.h>

#define MAX_LENGTH  9

#define xstr(x)  #x
#define str(x)  xstr(x)

int main(void) {
    char buf[MAX_LENGTH + 1];
    if (scanf("%" str(MAX_LENGTH) "[^\n]", buf) == 1) {
        printf("got |%s|\n", buf);
    } else {
        printf("invalid input\n");
    }
    return 0;
}

If str were defined directly as #define str(x) #x, str(MAX_LENGTH) would expand to "MAX_LENGTH", because the operand of # is not macro-expanded. With the two-level definition, the argument of str is fully macro-expanded before it is substituted into the replacement list, hence str(MAX_LENGTH) expands to xstr(9), which expands to "9".
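The difference between the one-level and the two-level definition can be checked with a small sketch. The upper-case macro names and the line_format() helper are hypothetical, chosen here only to avoid clashing with the str and xstr macros used in this answer.

```c
#define MAX_LENGTH 9

/* Hypothetical one-level macro: stringizes the argument exactly as written. */
#define STR_DIRECT(x)  #x

/* Two-level version: the argument of STR is macro-expanded first,
   then XSTR stringizes the result. */
#define XSTR(x)        #x
#define STR(x)         XSTR(x)

/* Hypothetical helper: adjacent string literals are concatenated
   at compile time into a single scanf format. */
const char *line_format(void) {
    return "%" STR(MAX_LENGTH) "[^\n]";
}
```

Here STR_DIRECT(MAX_LENGTH) is the string "MAX_LENGTH", STR(MAX_LENGTH) is "9", and line_format() returns the same format as the literal "%9[^\n]".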

Note also that MAX_LENGTH is not the length of the destination array: you must add an extra character for the null terminator. There is no consistency check in the macro invocation: the consistency between MAX_LENGTH and the definition of buf relies entirely on the programmer.

Furthermore, this macro expansion trick only produces a correct scanf conversion specifier if MAX_LENGTH is defined as a plain decimal integer constant without a suffix: a definition such as 9U, 0x9 or (4 + 5) is stringized verbatim and yields an invalid format.
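One way to catch such a definition early is to verify the stringized text in a test or a debug assertion. The following is only a sketch under stated assumptions: is_plain_width() and the LEN_* macros are hypothetical names introduced for this illustration, and the check runs at run time, not at compile time.

```c
#include <ctype.h>

#define XSTR(x) #x
#define STR(x)  XSTR(x)

/* Hypothetical definitions used to illustrate the problem. */
#define LEN_PLAIN   9
#define LEN_SUFFIX  9U
#define LEN_EXPR    (4 + 5)

/* Hypothetical helper: returns 1 if s consists only of decimal digits
   and does not start with 0, which is what a scanf field width needs. */
int is_plain_width(const char *s) {
    if (*s == '\0' || *s == '0')
        return 0;
    for (; *s != '\0'; s++) {
        if (!isdigit((unsigned char)*s))
            return 0;
    }
    return 1;
}
```

is_plain_width(STR(LEN_PLAIN)) returns 1, whereas the check fails for STR(LEN_SUFFIX), which is "9U", and for STR(LEN_EXPR), which is "(4 + 5)". A leading zero is rejected as well because an octal constant such as 010 would be read by scanf as a decimal width of 10.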

A more reliable approach would use snprintf to construct the scanf() format string:

#include <stdio.h>

#define MAX_LENGTH  9

int main(void) {
    char buf[MAX_LENGTH + 1];
    char format[20];
    snprintf(format, sizeof format, "%%%zu[^\n]", sizeof(buf) - 1);
    if (scanf(format, buf) == 1) {
        printf("got |%s|\n", buf);
    } else {
        printf("invalid input\n");
    }
    return 0;
}

This version works better but has its own shortcomings: because the format is no longer a string literal, the compiler cannot check the consistency between the format string and the remaining scanf() arguments, a check that is quite useful, and it may issue a warning about the non-literal format at stricter warning levels (for example -Wformat-nonliteral with gcc and clang). The format string used to construct the format string is also easy to get wrong.
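Since the format used to build the format is easy to get wrong, it can help to isolate that step in a small function that also checks the result of snprintf(). This is a sketch only: make_line_format() is a hypothetical name, not a standard function, and the choice to reject buffers shorter than 2 bytes is an assumption made here because a field width of 0 is not valid.

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical helper: build "%<buf_size - 1>[^\n]" into fmt.
   Returns 0 on success, -1 if the destination buffer is too small to
   hold even one character or if the format does not fit into fmt. */
int make_line_format(char *fmt, size_t fmt_size, size_t buf_size) {
    int n;

    if (buf_size < 2)
        return -1;
    /* %% produces a literal %, %zu prints the size_t field width */
    n = snprintf(fmt, fmt_size, "%%%zu[^\n]", buf_size - 1);
    if (n < 0 || (size_t)n >= fmt_size)
        return -1;   /* encoding error or truncated format */
    return 0;
}
```

A caller would invoke make_line_format(format, sizeof format, sizeof buf) once and pass format to scanf() only if the function returned 0. For a 10-byte buffer the resulting format is "%9[^\n]".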

In the end, both approaches are cumbersome and error-prone. It is much more reliable to use fgets() for your purpose and manually remove the trailing newline:

#include <stdio.h>
#include <string.h>

#define MAX_LENGTH  9

int main(void) {
    char buf[MAX_LENGTH + 2];
    if (fgets(buf, sizeof buf, stdin)) {
        buf[strcspn(buf, "\n")] = '\0';
        printf("got |%s|\n", buf);
    } else {
        printf("no input\n");
    }
    return 0;
}

The behavior is slightly different: fgets will consume the newline unless the line is too long, in which case the rest of the line is left in the input stream, which makes error recovery more difficult.
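Recovery from an overlong line is possible with fgets() if the caller discards the remainder of the line itself. The following is a minimal sketch of that idea: read_line_fgets() and its return convention (1 for a complete line, 0 for a truncated one, -1 for no input) are assumptions made for this illustration.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: read one line with fgets, strip the newline and
   discard any characters that did not fit into buf.
   Returns 1 if the whole line was stored, 0 if it was truncated,
   -1 at end of file, on read error or if size is too small. */
int read_line_fgets(char *buf, size_t size, FILE *fp) {
    size_t len;
    int c;
    int truncated = 0;

    if (size < 2 || fgets(buf, (int)size, fp) == NULL)
        return -1;
    len = strcspn(buf, "\n");
    if (buf[len] == '\n') {
        buf[len] = '\0';   /* complete line: remove the newline */
        return 1;
    }
    /* no newline stored: skip the rest of the line, if any */
    while ((c = getc(fp)) != EOF && c != '\n')
        truncated = 1;
    return truncated ? 0 : 1;
}
```

With this helper the buffer only needs MAX_LENGTH + 1 bytes, because a newline that does not fit is consumed by the loop and not by fgets().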

A better solution overall seems to be a custom function:

#include <stdio.h>

#define MAX_LENGTH  9

/* read a line from a stream and truncate excess characters */
int get_line(char *dest, int size, FILE *fp) {
    int c;
    int i = 0, j = 0;

    while ((c = getc(fp)) != EOF && c != '\n') {
        if (j + 1 < size)
            dest[j++] = c;
        i++;
    }
    if (j < size) {
        dest[j] = '\0';
    }
    return (i == 0 && c == EOF) ? EOF : i;
}

int main(void) {
    char buf[MAX_LENGTH + 1];

    if (get_line(buf, sizeof buf, stdin) == EOF) {
        printf("invalid input\n");
    } else {
        printf("got |%s|\n", buf);
    }
    return 0;
}

Note that the behavior is still subtly different from the original scanf() call, but potentially closer to your goals:

  • get_line reads a full line; the newline and any excess characters are discarded.
  • get_line always stores a C string into the destination array if size is not 0, even at end of file where buf will be an empty string. scanf() would return EOF at end of file and leave buf unchanged.
  • get_line will accept empty lines, whereas scanf() would fail, return 0 and leave buf in an undetermined state, a limitation you probably were not aware of.
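Because get_line returns the number of characters in the line, not the number stored, the caller can also detect truncation. The sketch below repeats get_line in essentially the form given above so that it is self-contained. The wrapper read_field() and its return values are hypothetical, introduced only for this illustration.

```c
#include <stdio.h>

/* read a line from a stream and truncate excess characters */
int get_line(char *dest, int size, FILE *fp) {
    int c;
    int i = 0, j = 0;

    while ((c = getc(fp)) != EOF && c != '\n') {
        if (j + 1 < size)
            dest[j++] = (char)c;
        i++;
    }
    if (j < size)
        dest[j] = '\0';
    return (i == 0 && c == EOF) ? -1 : i;
}

/* Hypothetical wrapper: returns 1 if the line fit into dest,
   0 if it was truncated, -1 at end of file. */
int read_field(char *dest, int size, FILE *fp) {
    int n = get_line(dest, size, fp);

    if (n < 0)
        return -1;
    /* at most size - 1 characters can be stored */
    return n < size;
}
```

An empty line yields 1 with an empty string in dest, a line of 13 characters read into a 10-byte buffer yields 0 with the first 9 characters stored, and the end of file yields -1.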

Conclusion: scanf() is full of quirks and pitfalls. Trying to avoid buffer overflows with an explicit character count is a good idea, but scanf() will cause other problems that are not easily handled. Writing custom code is often required to get precise and consistent semantics.

chqrlie
  • It's a pity that C does not provide an easy way to limit the string input for `scanf` that doesn't hard code it into the format string. MS VC's `scanf_s` tries to address this, but only leads to more confusion (because the length constraint is mandatory for `%c` and `%s` and `%[]`). – Weather Vane Aug 17 '23 at 22:53
  • @WeatherVane: a pity indeed and `scanf_s` add another layer of complexity without a practical solution for the most common problems, not to mention portability issues. – chqrlie Aug 17 '23 at 22:57
1

You can only use stringification if the macro to be expanded is a simple number (not a more general expression involving addition or subtraction, for example).

You then have to use the normal double-macro dance to get the macro expanded properly:

#define MAX_LENGTH 32
#define EXPAND_STR(x) #x
#define STR(x) EXPAND_STR(x)

and then:

char buffer[MAX_LENGTH + 1];

if (scanf("%" STR(MAX_LENGTH) "[^\n]", buffer) == 1)
    …success…
Jonathan Leffler