5

Well, I want to know the order in which a compiler "reads" the code. For example:

Suppose I have the following code snippet:

int N, M;

N = M = 0;

In this case, the compiler would set aside a piece of memory (int, 4 bytes) for each of N and M, and then, at the second line (where my doubt comes in), it would do one of two things:

The compiler "reads" N equals M, and sets both equal to zero.

OR

The compiler "reads" the zero, puts it in the memory of M, then gets the value of M, which is zero, and puts it in the memory of N.

In other words, does it go from right to left, or from left to right?

I don't know if my doubt is clear, but here is a test that I made:

int i=0; /*I declared the variable i, and assign zero value to it*/

printf("%d", i++); /*Prints 0*/

printf("%d", i); /*Prints 1*/

I understand the above code: at the second line, the compiler seems (from what I understood) to "read" from left to right, passing the value of i to %d, and after printing, the variable i is incremented, because at the third line it is printed as 1.

The code snippet below reverses the position of the ++:

int i=0; /*I declared i variable to zero*/

printf("%d", ++i); /*Prints 1*/

printf("%d", i); /*Prints 1*/

In this case, at the second line, (from what I understood) the compiler "reads" from left to right, and when it reads what will be printed (the part after the comma; what is the name of this part?), it first "reads" the ++ and increments the variable that follows, i in this case, and then passes it to %d to be printed.

So, in which order does a compiler "read"? I had some teachers who told me the compiler "reads" from right to left, starting from the semicolon (;), but does the compiler actually have an order? And if anything I said above is wrong, please correct me. (I don't speak English very well.)

Thanks!

vitaut
ViniciusArruda
  • I can't think of any possible functional difference between N=M=0 and M=N=0. Is this "just curious" what the compiled code looks like? – danh Dec 20 '13 at 23:52
  • Ok, my doubt is about C; I use GCC as my compiler. Do you know whether the compiler has some order in which it reads? – ViniciusArruda Dec 20 '13 at 23:54
  • i presume that you want to know how the runtime does it? – Sam I am says Reinstate Monica Dec 20 '13 at 23:54
  • assignment associates right to left, the opposite of most other binary operators. so `m=0` happens first, and `n=m` after that – Sam I am says Reinstate Monica Dec 20 '13 at 23:55
  • 1
    As far as I understand your examples, you're not actually asking about how the compiler reads/compiles things, but how a statement like `N=M=0;` is interpreted. The assignment-operator groups right-to-left, i.e., this statement is interpreted as `N=(M=0);` – dyp Dec 20 '13 at 23:57
  • there is also _the as-if rule_ to consider, I don't think that a compiler can show a deterministic behaviour in relation to the source code that it's processing . – user2485710 Dec 20 '13 at 23:58
  • The second+third examples, `++i` vs. `i++` is special. `++i` is a so-called *preincrement*, which increments `i` and yields the incremented value; whereas `i++` is a *postincrement*, which also increments `i` but yields the non-incremented value. Those are two different operators. – dyp Dec 21 '13 at 00:02
  • 1
    The effective order of processing is defined (or left undefined) by the language spec. (And the rules for C are sufficiently vague and full of pitfalls that it's generally best to not do multiple assignments and the like.) – Hot Licks Dec 21 '13 at 00:19
  • Thank you for the answers, but if I have a code snippet like this: "for(i=0; i<10; i++)", the compiler reads the i=0 once, then checks if i is under 10, then increments i, but doesn't use the incremented value. Is that right? – ViniciusArruda Dec 21 '13 at 00:30
  • Or does the compiler read the "for()" operation in a different order? – ViniciusArruda Dec 21 '13 at 00:33
  • 1
    @X0R40: Some of what various comments and answers are trying to point out to you is that your questions are phrased as “What does the **compiler** do?” However, the compiler is only a translator; it does not execute the program. The compiler **reads** and **analyzes** the program to figure out how to translate it into machine language. This analysis is performed in a way that makes sense for understanding the program. When the program **executes**, it performs operations in a different order. In spite of your phrasing, it seems like you are asking more about execution than about translation. – Eric Postpischil Dec 21 '13 at 00:40
  • @X0R40: are you assuming a compiler compiles "what it sees" directly into immediate code? That may have been true for the very first generation(s) of C compilers. See e.g. http://blogs.msdn.com/b/vcblog/archive/2006/08/16/thoughts-on-the-visual-c-abstract-syntax-tree-ast.aspx for a modern take. – Jongware Dec 21 '13 at 00:41
  • 2
    @X0R40: Another complication is that the meaning of C programs is that they specify how execution proceeds in an abstract computer. When the compiler translates the program, it may produce a program that executes operations in a different way than the abstract computer, as long as it gets the same results. So there are differences between what operations a program performs in the abstract computer and what operations it performs in a real computer. – Eric Postpischil Dec 21 '13 at 00:41
  • Note that the *compiler* effectively reads everything only once. The source code is read in, checked for correctness, and converted to data structures, and then those data structures are rearranged to make the code more efficient. Finally, the compiler "walks" the structures and generates machine code to accomplish what the structures represent. It is that machine code that executes your program, long after the compiler has left the scene. – Hot Licks Dec 21 '13 at 01:24

4 Answers

3
I understand the above code: at the second line, the compiler seems (from what I understood) to "read" from left to right, passing the value of i to %d, and after printing, the variable i is incremented, because at the third line it is printed as 1.

That is not the order things are being done in.

When i++ is evaluated, i is incremented by 1. The value from before the increment is then returned.

So your: printf("%d", i++)

is actually doing:

int i = somevalue;
int temp = i;
i = i + 1;
printf("%d", temp);

i is NOT incremented after the printing. It is actually incremented before the printing.

and when you do: printf("%d", ++i)

it does:

int i = somevalue;
i = i + 1;
printf("%d", i);

The difference is the temp variable. Your teacher is right in this case that it is going from right to left from the semicolon, because it has to compute the value of i first; only then can it print it. In reality, if you break it all up into a separate line per operation or instruction, it's actually going from top to bottom as shown above.

Brandon
  • Thank you for the answers, but if I have a code snippet like this: "for(i=0; i<10; i++)", the compiler reads the i=0 once, then checks if i is under 10, then increments i, but doesn't use the incremented value. Is that right? Or does the compiler read the "for()" operation in a different order? – ViniciusArruda Dec 21 '13 at 00:40
  • At the beginning of the loop, it will set i to a value of 0. It will check if i < 10. If it is, it will run the body of the loop. At the end of the loop, it will increment i. The loop will then start from the top again and it will check if i < 10. In a for-loop header, i++ vs ++i makes no difference because the increment runs at the end of each iteration and its result is discarded. If you are using variables other than integers, it is better to use ++i. This will get rid of the unused temporary variable. – Brandon Dec 21 '13 at 01:02
2

According to the C++ Standard (and the C Standard)

The assignment operator (=) and the compound assignment operators all group right-to-left.

So in this statement

N = M = 0;

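the compiler assigns 0 to M and, according to the Standard, the assignment will

return an lvalue referring to the left operand

That is, in your example the result is M, now holding 0, which in turn is assigned to N. (The "lvalue" wording comes from the C++ Standard; in C, an assignment expression has the value of the left operand after the assignment, but is not an lvalue.)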

Vlad from Moscow
  • It should be noted in this case that because of the allowances for compiler optimization, the value of M does not have to be read and assigned to N. 0 can be assigned to both M and N simultaneously or in any order, because the end result is precisely the same as if 0 was assigned to M and then M was assigned to N. – OmnipotentEntity Dec 21 '13 at 00:18
  • @OmnipotentEntity, You are right, but it is important to understand the sequence of operations. – Vlad from Moscow Dec 21 '13 at 00:26
  • @OmnipotentEntity: In this case, the ordering and precise semantics do not matter. But note, for example, that in `float f; int i; f = i = 3.5;`, the C standard specifies that, in the C model of computation, 3.5 is converted to `int`, assigned to `i`, and then this value, 3, not 3.5, is assigned to `f`. The value of an assignment is the value that was assigned, not the value on the right-hand side. – Eric Postpischil Dec 21 '13 at 00:31
  • @EricPostpischil, that's certainly true! The important part of the comment, which I should have bolded, was the as if clause. :) – OmnipotentEntity Dec 21 '13 at 00:34
2

Only the lexical analyzer reads the source code left to right. The parser builds ASTs, which are then traversed in a variety of ways, depending on the particular node encountered.

For expressions, the AST holding the expression may be read in post-order, to generate a postfix expression, which is more suitable for a stack-based evaluator (the easiest kind to implement).

For assignments (which are indeed expressions), code is generated first for the RHS, then for the LHS, and then the write-to-memory instruction is emitted.

For a function call, the AST may be traversed from the last node containing an argument expression to the first one (if using the C calling convention to perform the call).

mcleod_ideafix
  • Do you have some material or a website link where I can study what you said? I had never heard of AST, RHS and LHS; I am researching them now. Thanks! – ViniciusArruda Dec 21 '13 at 01:51
  • The reference that comes to my mind now is the book "Compilers: Principles, Techniques and Tools." – mcleod_ideafix Dec 21 '13 at 08:51
2

For empirical evidence, I compiled: int M,N; N = M = 0;

with no compiler options, on my Mac, and disassembled it:

0000000100000f10    pushq   %rbp
0000000100000f11    movq    %rsp,%rbp
0000000100000f14    movl    %edi,0xfc(%rbp)
0000000100000f17    movq    %rsi,0xf0(%rbp)
0000000100000f1b    movl    $0x00000000,0xe4(%rbp)
0000000100000f22    movl    0xe4(%rbp),%eax
0000000100000f25    movl    %eax,0xe0(%rbp)
0000000100000f28    movl    $0x00000000,0xe8(%rbp)
0000000100000f2f    movl    0xe8(%rbp),%eax
0000000100000f32    movl    %eax,0xec(%rbp)
0000000100000f35    movl    0xec(%rbp),%eax
0000000100000f38    popq    %rbp
0000000100000f39    ret

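So it looks like the compiler has compiled it much as written: it stores 0 into one variable (0xe4(%rbp)), loads that value back into %eax, and stores it into the other (0xe0(%rbp)), i.e. M = 0; N = M; (the later stores to 0xe8(%rbp) and 0xec(%rbp) appear to be the function's return value).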

David
  • Hmm... I expected a `xor 0xe4(%rbp), 0xe4(%rbp)` is it not possible for xor to operate on the stack like that? – OmnipotentEntity Dec 21 '13 at 00:41
  • How did you do it? How can I see these instructions? I use gcc on Ubuntu and Cygwin on Windows. I only know C, but I think this is assembly? – ViniciusArruda Dec 21 '13 at 00:49
  • @X0R40 I used otool on my mac. I refer you to here for your linux options: http://stackoverflow.com/questions/840321/how-can-i-see-the-assembly-code-for-a-c-program – David Dec 22 '13 at 14:27
  • @OmnipotentEntity I don't have a definitive answer to your question, but as far as I know x86 `xor` accepts at most one memory operand, so it can't take two memory operands like that. – David Dec 22 '13 at 14:28
  • Thanks, I will try it, but one doubt: does the compiler always generate assembly code and then machine code (I mean 1s and 0s), or does the compiler never translate to assembly code, just to machine code? – ViniciusArruda Dec 22 '13 at 18:23
  • @X0R40 The compiler will, under normal circumstances, put machine code, not assembly, in your executable files. GCC does generate assembly internally and hands it to the assembler, and you can see it with `gcc -S`; assembly is a human-readable form of the machine code. I refer you to, e.g., the ELF wiki, http://en.wikipedia.org/wiki/Executable_and_Linkable_Format, to understand better how your programs are presented to the operating system for execution. – David Dec 26 '13 at 02:09