9

For example: In the following code, how and where is the number '10' used for the comparison stored?

#include<stdio.h>
#include<conio.h>

int main()
{
    int x = 5;
    if (x > 10)
        printf("X is greater than 10");
    else if (x < 10)
        printf("X is lesser than 10");
    else
        printf("x = 10");
    getch();
    return 0;
}

Pardon me for not giving enough details. If, instead of initializing 'x' directly with '5', we read it from the user, we know how memory is allocated for 'x'. But how is memory allocated for the literal number '10', which is not stored in any variable?

Vivek A
    Not related to your code, but "X is less than 10" is the grammatically-correct way to say that. – cjm Sep 04 '15 at 21:27
  • Aside: "By the *language*", not at all. That's a job for *implementors* to figure out. –  Sep 05 '15 at 04:11
  • Aside #2: Your question should probably be present in your question, even if it happens to be in the title too. –  Sep 05 '15 at 04:12
  • `x` **is** declared, what do you actually mean? And what do you mean with "entity"? Please read about _declaration_ and _definition_. These have well-defined meanings, not only in C. – too honest for this site Sep 05 '15 at 13:17
  • Why have this question and its answers been migrated from Programmers? I feel they should have stayed there... – Basile Starynkevitch Sep 05 '15 at 13:28
  • 1
    @BasileStarynkevitch I would assume it's because this is asking about the technical implementation details of a language, rather than about code quality or design trade-offs for software written in that language. – Ixrec Sep 05 '15 at 13:42

2 Answers

33

In your particular code, x is initialized to 5 and is never changed. An optimizing compiler can apply constant folding and constant propagation to that information, so it would probably generate the equivalent of

int main() {
    printf("X is lesser than 10");
    getch();
    return 0;
}

Notice that the compiler would also have performed dead code elimination.

So both constants 5 and 10 would have disappeared.
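To make the constant 10 survive into the generated code, the value of x must not be known at compile time. The following is only a minimal sketch under that assumption: the function name compare_to_ten is hypothetical (it is not in the question), and x arrives as a parameter instead of being initialized to 5.

```c
/* Hypothetical helper, not part of the original question.
   x is a parameter, so its value is unknown when this function is
   compiled on its own and the comparisons against 10 cannot be
   folded away. */
const char *compare_to_ten(int x)
{
    if (x > 10)
        return "x is greater than 10";
    else if (x < 10)
        return "x is less than 10";
    else
        return "x = 10";
}
```

Calling it with a value obtained from scanf and compiling with the gcc options mentioned further down will typically show the 10 as an immediate operand of a compare instruction, although the exact instructions depend on the compiler and the target.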

BTW, <conio.h> and getch are not in standard C99 or C11. My Linux system doesn't have them.

In general (and depending upon the target processor's instruction set and the ABI), small constants are often embedded in a single machine code instruction (as an immediate operand), as Kilian answered. Larger constants (e.g. floating point numbers, literal strings, most const global or static arrays and aggregates) might be emitted as read-only data in the code segment (the constant inside the register-load instruction would then be an address, or an offset relative to the PC for PIC); see also this. Some architectures (e.g. SPARC, RISC-V, ARM, and other RISC) can load a wide constant into a register with two consecutive instructions (loading the constant in two parts), and this impacts the relocation format for the linker (e.g. in object files and executables, often in ELF).
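As a rough illustration of the larger constants mentioned above, here is a small sketch. All names in it (primes, greeting, circle_area, nth_prime) are made up for the example, and where each constant actually ends up is an implementation choice; with GCC on x86-64 Linux the string, the floating point value, and the array would typically land in a read-only data section, but that is an assumption about one toolchain, not a guarantee.

```c
#include <stddef.h>

/* A const static array: usually emitted as read-only data. */
static const int primes[] = {2, 3, 5, 7, 11, 13};

/* A string literal: the function returns the address of the literal. */
const char *greeting(void)
{
    return "hello, constants";
}

/* A floating point constant: often loaded from memory, since many
   instruction sets have no floating point immediate operands. */
double circle_area(double r)
{
    return 3.141592653589793 * r * r;
}

/* Indexing the array needs its address; returns -1 when out of range. */
int nth_prime(size_t i)
{
    return i < sizeof primes / sizeof primes[0] ? primes[i] : -1;
}
```

Compiling this with -S and looking for the section directives is a quick way to check what your own compiler does.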

I suggest asking your compiler to emit assembler code, and having a glance at that assembler code. If using GCC (e.g. on Linux, or with Cygwin or MinGW), try compiling with gcc -Wall -O -fverbose-asm -S; on my Debian/Linux system, if I replace getch with getchar in your code, I get:

        .section        .rodata.str1.1,"aMS",@progbits,1
.LC0:
        .string "X is lesser than 10"
        .text
        .globl  main
        .type   main, @function
main:
.LFB11:
        .cfi_startproc
        subq    $8, %rsp        #,
        .cfi_def_cfa_offset 16
        movl    $.LC0, %edi     #,
        movl    $0, %eax        #,
        call    printf  #
        movq    stdin(%rip), %rdi       # stdin,
        call    _IO_getc        #
        movl    $0, %eax        #,
        addq    $8, %rsp        #,
        .cfi_def_cfa_offset 8
        ret
        .cfi_endproc
.LFE11:
        .size   main, .-main
        .ident  "GCC: (Debian 4.9.2-10) 4.9.2"
        .section        .note.GNU-stack,"",@progbits

If you are using a 64-bit Windows system, your architecture is very likely to be x86-64. There is plenty of documentation describing the ISA (see answers to this) and the x86 calling conventions (and also the Linux x86-64 ABI; you'll find the equivalent document for Windows).

BTW, you should not really care how such constants are implemented. The semantics of your code should not change, whatever the compiler chooses to do to implement them. So leave the optimizations (and such low level choices) to the compiler (i.e. your implementation of C).

Basile Starynkevitch
  • I find https://gcc.godbolt.org often useful for looking at the source and different compilers online - a gcc fiddle site in essence. With no optimizations: https://gcc.godbolt.org and with `-O2`: https://goo.gl/GQ7YFn –  Sep 04 '15 at 15:23
  • @MichaelT: I know about `gccbot` but I prefer to copy code in the answer. – Basile Starynkevitch Sep 04 '15 at 17:38
  • ...and some compilers are *special* and [spill their constants onto the stack](https://lkml.org/lkml/2014/7/24/584). – Kevin Sep 04 '15 at 18:05
  • 1
    @Kevin I'm not sure something that appears to be a compiler bug is relevant here. – Dan Is Fiddling By Firelight Sep 04 '15 at 19:20
  • +1 for advising (A) to not care/to trust the compiler and (B) to check the generated machine code if you still have questions. Though I suppose the latter might be a daunting prospect to new learners, but there's no better time to start :-) – underscore_d Sep 04 '15 at 21:25
  • Worth pointing out, I think? The string `"X is lesser than 10"` is itself a “non-declared entity,” too large to be folded into an immediate constant, and as we can see, it goes into a section of the program whose name starts with `.rodata` for read-only data. When the program needs to refer to that, it loads the address of that string, which has the label `.LC0`, into a register. The memory’s not really managed because it’s a constant that never has to be allocated or freed. – Davislor Sep 05 '15 at 09:49
13

The constant 10 is probably stored as an immediate constant in the opcode stream. Issuing a CMP AX,10, with the constant included in the opcode, is usually both smaller and faster than a CMP AX, [BX], where the comparison value must be loaded from memory.

If the constant is too large to fit into the opcode, the alternative is to store it in memory like a static variable, but if the instruction set allows embedded constants, a good compiler should use them - after all, that addressing mode was presumably added because it has advantages over the others.
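To compare a constant that fits in the instruction with one that may not, a pair of functions like the following can be compiled and inspected. This is a sketch only: the function names are hypothetical, and the remark about x86-64 (where a compare instruction takes at most a 32-bit immediate, so a wider constant is typically first moved into a register) describes one instruction set, not what every compiler or target will do.

```c
#include <stdint.h>

/* Small constant: usually fits as an immediate in the compare itself. */
int exceeds_small(uint64_t v)
{
    return v > 10u;
}

/* Wide constant: on x86-64 this does not fit a compare immediate, so
   the compiler typically materializes it in a register first. */
int exceeds_wide(uint64_t v)
{
    return v > 0x123456789ABCDEF0ULL;
}
```

Either way, the program's observable behaviour is the same; only the instruction sequence differs.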

Kilian Foth
  • 4
    "the alternative is to store it in memory like a static variable" - there are more alternatives. You might emit one or two instructions to construct the correct value in a register (for example as 7+3, if 10 is too big for an immediate because the limit is 7), followed by the comparison. – Steve Jessop Sep 04 '15 at 11:20