I am referring to the question "Why should text files end with a newline?". One of its answers quotes the C89 standard, which in brief dictates that a file must end with a newline that is not immediately preceded by a backslash.

Does that apply to the most recent C++ standard?

```cpp
#include <iostream>
using namespace std;

int main()
{
  cout << "Hello World!" << endl;
  return 0;
}
//\
```

Is the above valid? (Assuming there is a newline after //\, which I've been unable to display)
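
Since the trailing newline cannot be shown on the page, the end of the file can be spelled out as a string literal instead. The sketch below is only an illustration: `make_newlines_visible` and `question_tail` are hypothetical names, and the literal assumes Unix-style `\n` line endings.

```cpp
#include <string>

// Hypothetical helper: spell out each new-line as "<NL>" so that a trailing
// backslash/new-line pair becomes visible when the text is printed.
std::string make_newlines_visible(const std::string& text) {
    std::string out;
    for (char c : text) {
        if (c == '\n') {
            out += "<NL>";
        } else {
            out += c;
        }
    }
    return out;
}

// The last two lines of the file from the question, written as a C++ literal:
// '}' new-line '/' '/' '\' new-line (assumed "\n" line endings).
const std::string question_tail = "}\n//\\\n";
```

Printing `make_newlines_visible(question_tail)` would show the closing brace, a new-line marker, the `//\` comment, and a final new-line marker.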

Viktor
  • "For consistency, it’s very helpful to follow this rule". [Historically](http://stackoverflow.com/questions/72271/no-newline-at-end-of-file-compiler-warning), the main problem was failing to add a newline after a .h file with an #endif header guard. – paulsm4 Jul 22 '15 at 22:09
  • Why not test it yourself? – Jashaszun Jul 22 '15 at 22:09
  • Hmm, I haven't ever heard anything about that. – Evan Carslake Jul 22 '15 at 22:09
  • @Jashaszun how would you test that? – iheanyi Jul 22 '15 at 22:26
  • @iheanyi Try to compile. – Jashaszun Jul 22 '15 at 22:26
  • @Jashaszun umm, so what if I happen to have a compiler that allows files that end with and without newlines? Trying to compile proves nothing. – iheanyi Jul 22 '15 at 22:27
  • @iheanyi Then either (1) the compiler follows the standard, in which case now you know that the standard doesn't care about these newlines, or (2) the compiler does not follow the standard, in which case you know nothing more. – Jashaszun Jul 22 '15 at 22:28
  • @Jashaszun Exactly. So your test does not answer the question "Does that apply to the most recent C++ standard?" since there are two possible explanations for whatever result I get. The point (by asking you how you'd test it) is that there is no possible test, unless you have a compiler certified by a standard body that it correctly implements that part of the standard, to test your way to the right answer. – iheanyi Jul 22 '15 at 22:30
  • @Jashaszun: The usual rule of thumb with standards is "be lax in what you accept. Be strictly conformant in what you generate". So a compiler that accepted non-compliant input when there was no ambiguity wouldn't be doing a "bad job". Although it ideally should warn about any non-standard anything it accepted. – Peter Cordes Jul 22 '15 at 22:31
  • @iheanyi Well, I guess I just assumed that all of the major compilers adhered to the standard. – Jashaszun Jul 22 '15 at 22:32
  • @PeterCordes I didn't realize that, sorry. I thought that all of the major compilers exactly adhere to the standards. – Jashaszun Jul 22 '15 at 22:33
  • @Jashaszun Umm. . . that should be a weak assumption. As in, most of the time, I can assume if there is a mistake, it is me who made it and not the compiler. But, I don't think there is a single compiler that has ever 100% adhered to the standard. – iheanyi Jul 22 '15 at 22:33
  • see also [Backslash newline at end of file warning](http://stackoverflow.com/a/26127812/1708801) – Shafik Yaghmour Jul 23 '15 at 00:14

2 Answers

9

The given code is legal in C++, but a file ending like this is not legal in C.

Indeed, the C (N1570) standard says:

> Each instance of a backslash character (\) immediately followed by a new-line character is deleted, splicing physical source lines to form logical source lines. Only the last backslash on any physical source line shall be eligible for being part of such a splice. A source file that is not empty shall end in a new-line character, which shall not be immediately preceded by a backslash character before any such splicing takes place.
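
The last sentence of that C wording can be read as a simple check on the raw file contents. The following is a minimal sketch, not part of any compiler: the function name `satisfies_c_ending_rule` is hypothetical, and it assumes the file has already been read into a string with `\n` as the new-line character.

```cpp
#include <string>

// Hypothetical helper illustrating the C (N1570) wording quoted above:
// a non-empty source file must end in a new-line character, and that
// new-line must not be immediately preceded by a backslash.
bool satisfies_c_ending_rule(const std::string& source) {
    if (source.empty()) {
        return true;   // the rule only applies to non-empty files
    }
    if (source.back() != '\n') {
        return false;  // file does not end in a new-line
    }
    if (source.size() >= 2 && source[source.size() - 2] == '\\') {
        return false;  // final new-line is preceded by a backslash
    }
    return true;
}
```

Under these assumptions, a file ending in `//\` followed by a new-line fails the check, which matches the statement that the ending is not allowed in C.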

The C++ standard (N3797) formulates it a bit differently (emphasis mine):

> Each instance of a backslash character (\) immediately followed by a new-line character is deleted, splicing physical source lines to form logical source lines. Only the last backslash on any physical source line shall be eligible for being part of such a splice. If, as a result, a character sequence that matches the syntax of a universal-character-name is produced, the behavior is undefined. A source file that is not empty and that does not end in a new-line character, or that ends in a new-line character immediately preceded by a backslash character before any such splicing takes place, shall be processed as if an additional new-line character were appended to the file.
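
To make the difference concrete, here is a rough sketch of how the N3797 wording could be modeled on a string. It is an illustration under assumptions, not how any real compiler implements this phase: `splice_lines_cpp` is a hypothetical name, new-lines are assumed to be `\n`, and the universal-character-name case (undefined behavior) is not modeled at all.

```cpp
#include <cstddef>
#include <string>

// Hypothetical sketch of the N3797 wording quoted above.
// Deletes every backslash that is immediately followed by a new-line
// (together with that new-line), then appends a new-line if the original
// file was non-empty and either did not end in a new-line or ended in a
// backslash/new-line pair.
std::string splice_lines_cpp(const std::string& source) {
    std::string out;
    bool ended_in_splice = false;

    for (std::size_t i = 0; i < source.size(); ++i) {
        const bool is_splice = source[i] == '\\' &&
                               i + 1 < source.size() &&
                               source[i + 1] == '\n';
        if (is_splice) {
            ++i;  // skip the new-line as well
            ended_in_splice = (i + 1 == source.size());
            continue;
        }
        out += source[i];
    }

    // "shall be processed as if an additional new-line character were appended"
    if (!source.empty() && (source.back() != '\n' || ended_in_splice)) {
        out += '\n';
    }
    return out;
}
```

With this model, the ending from the question (`//\` plus a new-line) becomes `//` followed by the appended new-line, so the file is treated as if it ended normally.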

Jonathan Leffler
AlexD
  • TL;DR - In C, each source file must end in a newline. In C++, the compiler will imagine the final newline even if the programmer hasn't provided one. – Jirka Hanika Jun 19 '19 at 11:52
1

As per [lex.phases] p2 and p3, your particular case is also ill-formed under the C++ standard. [lex.phases] p2 says:

> Each sequence of a backslash character (\) immediately followed by zero or more whitespace characters other than new-line followed by a new-line character is deleted, splicing physical source lines to form logical source lines. Only the last backslash on any physical source line shall be eligible for being part of such a splice. Except for splices reverted in a raw string literal, if a splice results in a character sequence that matches the syntax of a universal-character-name, the behavior is undefined. A source file that is not empty and that does not end in a new-line character, or that ends in a splice, shall be processed as if an additional new-line character were appended to the file.

You said:

> Assuming there is a newline after //\, which I've been unable to display

So the last visible \ is eligible for a splice, and the sequence consisting of \ and the new-line character is deleted. This means the last character in the source file is /, with no new-line following it. The characters // start a comment according to [lex.comment] p1:

> The characters // start a comment, which terminates immediately before the next new-line character.

As per [lex.phases] p3:

> The source file is decomposed into preprocessing tokens ([lex.pptoken]) and sequences of whitespace characters (including comments). A source file shall not end in a partial preprocessing token or in a partial comment.

In your case, the characters // start a comment but have no new line to terminate it. Hence, it's a partial comment. The program is ill-formed.

xmh0511