
I have come across some code which looks like this:

if (condition)
    // Someone added a comment here
    return "something";

return "something else";

I thought that "something" would always be returned, but that's not true. Despite the comment, the if condition works as the developer intended.

I've been trying to find the rules for this. Is the rule that, for an if without braces, the next actual statement is executed as if it were in braces? And can I put as many empty lines and comments as I like before that statement?

jonnarosey

2 Answers


The C# Language Specification says:

Conceptually speaking, a program is compiled using three steps:

  1. Transformation, which converts a file from a particular character repertoire and encoding scheme into a sequence of Unicode characters.
  2. Lexical analysis, which translates a stream of Unicode input characters into a stream of tokens.
  3. Syntactic analysis, which translates the stream of tokens into executable code.

...

This specification presents the syntax of the C# programming language using two grammars. The lexical grammar (§2.2.2) defines how Unicode characters are combined to form line terminators, white space, comments, tokens, and pre-processing directives. The syntactic grammar (§2.2.3) defines how the tokens resulting from the lexical grammar are combined to form C# programs.

Tokens are what get combined to form the program, so whatever tokens remain after the earlier steps are what gets compiled. Your question is about lexical analysis, specifically how comments, white space, and new lines affect which tokens are generated. The answer is that they don't affect them at all, other than serving to separate tokens:

Five basic elements make up the lexical structure of a C# source file: Line terminators (§2.3.1), white space (§2.3.3), comments (§2.3.2), tokens (§2.4), and pre-processing directives (§2.5). Of these basic elements, only tokens are significant in the syntactic grammar of a C# program (§2.2.3).

The lexical processing of a C# source file consists of reducing the file into a sequence of tokens which becomes the input to the syntactic analysis. Line terminators, white space, and comments can serve to separate tokens, and pre-processing directives can cause sections of the source file to be skipped, but otherwise these lexical elements have no impact on the syntactic structure of a C# program.

So your program can separate tokens with new line characters, white space characters, or comments, and it will compile the same as if they weren't there. Here are two examples that I compiled separately, each followed by its Intermediate Language output as shown by ILSpy:

        static void Main(string[] args)
        {
            if
(true

                )

                /* comment separating `)` token from the `Console` token */

                Console.WriteLine("something") /* another comment, semicolon token to the right */;
            else                                          // bunch of white space to the left
                Console.
WriteLine("something else")



;

        }

ILSpy output for the Main() method:

.method private hidebysig static
    void Main (
        string[] args
    ) cil managed
{
    // Method begins at RVA 0x2088
    // Code size 17 (0x11)
    .maxstack 1
    .entrypoint
    .locals init (
        [0] bool
    )

    IL_0000: nop
    IL_0001: ldc.i4.1
    IL_0002: stloc.0
    IL_0003: ldstr "something"
    IL_0008: call void [mscorlib]System.Console::WriteLine(string)
    IL_000d: nop
    IL_000e: br.s IL_0010

    IL_0010: ret
} // end of method Program::Main

And the cleaner one showing identical ILSpy output:

static void Main(string[] args)
{        
    if (true) Console.WriteLine("something"); else Console.WriteLine("something else");
}

ILSpy output for second version:

.method private hidebysig static
    void Main (
        string[] args
    ) cil managed
{
    // Method begins at RVA 0x2088
    // Code size 17 (0x11)
    .maxstack 1
    .entrypoint
    .locals init (
        [0] bool
    )

    IL_0000: nop
    IL_0001: ldc.i4.1
    IL_0002: stloc.0
    IL_0003: ldstr "something"
    IL_0008: call void [mscorlib]System.Console::WriteLine(string)
    IL_000d: nop
    IL_000e: br.s IL_0010

    IL_0010: ret
} // end of method Program::Main

Quantic

I thought that "something" would always be returned, but that's not true

No. If the condition is true then "something" is returned, otherwise "something else" is returned. It works as expected, because comments are not included in the evaluation of that code. See this fiddle: https://dotnetfiddle.net/gHtu5L
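
In case the fiddle link stops working, here is a minimal sketch of that kind of check. It is not necessarily the fiddle's exact code: the Describe method and the Program class are hypothetical names I made up to wrap the snippet from the question. The comment and the blank line between the if and the return do not change which statement the if controls.

```csharp
using System;

public static class Program
{
    // Hypothetical helper wrapping the snippet from the question.
    public static string Describe(bool condition)
    {
        if (condition)
            // Someone added a comment here

            return "something";

        return "something else";
    }

    public static void Main()
    {
        Console.WriteLine(Describe(true));   // prints "something"
        Console.WriteLine(Describe(false));  // prints "something else"
    }
}
```

Both branches are reachable, so the first return is still the body of the if, even with the comment and the empty line in between.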

Igor