1

I'm reading the introduction section of the K&R book on C. To see what code format generates errors, I tried splitting printf("hello world!"); into different lines, as shown below. The problem is, I don't know if my results are implementation-independent. I used the GCC compiler.

What does the C standard say about expressions split across multiple lines? How do compilers handle them?

/*
    printf("hello wor
    ld!\n");
*/

/*
    printf("hello world!
    \n");
*/

    printf("hello world!\
    n");

/*
    printf("hello world!\n
    ");
*/

    printf("hello world!\n"
    );

    printf("hello world!\n")
    ;

The commented-out expressions generate errors, while the remaining ones do not.

The behavior of the third expression was unexpected. Usually the closing " has to appear on the same line as the opening ", yet the third expression compiles.

Third expression:

    printf("hello world!\
    n");

Output to console:

hello world!    n

It seems like \ can be used to split a string across multiple lines, but the spaces before n"); are included as part of the string. Is this a standard rule?

David Ranieri
Icelightz
  • 1
    If \ is the last character on a line (no whitespace after it) then it is treated as a continuation character. It is generally used for macros. [Multi line preprocessor macros](https://stackoverflow.com/questions/10419530/multi-line-preprocessor-macros) – Retired Ninja Aug 23 '23 at 18:18
  • 1
    Note that GCC has, for better or worse (mostly worse, IMO), relaxed rules about white space after a trailing backslash — see [Extensions: Slightly Looser Rules for Escaped Newlines](https://gcc.gnu.org/onlinedocs/gcc-13.1.0/gcc/Escaped-Newlines.html). This doesn't seem to be controllable via a command line option, though some combination of `-Werror`, `-pedantic` or `-pedantic-errors` might convert the warning into an error. – Jonathan Leffler Aug 23 '23 at 18:32

3 Answers

4

C 2018 5.1.1.2, translation phase 2:

2. Each instance of a backslash character (\) immediately followed by a new-line character is deleted, splicing physical source lines to form logical source lines.

So basically writing

    printf("hello world!\
    n");

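is identical to writing the following, because the four spaces that indent the second line end up inside the string literal: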

    printf("hello world!    n");
Jonathan Leffler
dave
2

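After line splicing, the compiler splits each logical line (which ends with a new-line character) into tokens. Tokens are the smallest lexical units of the language, such as identifiers, punctuators, and string literals. A string literal must begin and end on the same logical line.

The compiler issues an error for this code snippet, for example: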

printf("hello wor
ld!\n");

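because the string literal is not closed on the logical line where it starts. Each logical line (which corresponds to one or more physical lines) is split into tokens on its own, so the compiler sees the unterminated literal "hello wor on the first line and another unterminated literal starting at the " in ld!\n"); on the second.

You can combine these two physical lines into one logical line by placing a backslash (\) immediately before the new-line character: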

printf("hello wor\
ld!\n");

The second physical line must start in column 0 (no leading whitespace) to form the string literal "hello world!\n"; any indentation would become part of the string.

Also, after splitting lines into tokens, the compiler concatenates adjacent string literal tokens. So you may write, for example:

printf("hello " "world!\n");

or

printf("hello wor"
"ld!\n");
Jonathan Leffler
Vlad from Moscow
  • The sentence “The compiler splits each logical line … into tokens” is meaningless to a student before “token” is explained. – Eric Postpischil Aug 23 '23 at 20:04
1

Any two consecutive string literals will be combined. Therefore, this is completely valid C:

printf("hello world!\n"
    ""
    ""
    "next line\n"
);

...or, to split a single line of output across several source lines:

printf("h" "e"
    "llo " 
    "world!\n"
);
Jason
  • 2
    String concatenation is not a preprocessor function — it is done by the compiler proper after the preprocessor has finished. See [§5.1.1.2 Translation phases](http://port70.net/~nsz/c/c11/n1570.html#5.1.1.2) — string concatenation is phase 6 but the preprocessing is complete at the end of phase 4. – Jonathan Leffler Aug 23 '23 at 18:20
  • 1
    hehe [language-lawyer] is leaking =] – Jason Aug 23 '23 at 18:21
  • @JonathanLeffler Preprocessing directives are indeed carried out (in phase 4) before string concatenation. But pedantically, 5.1.1.2 says "After preprocessing, a preprocessing translation unit is called a _translation unit_". This happens in phase 7. String literal concatenation happens in phase 6. It is a bit blurry where the "preprocessor" ends, but I don't think it is correct to say that string concatenation isn't done as part of the preprocessing. What implementations like `gcc -E` may or may not spit out is another story. – Lundin Aug 24 '23 at 07:57