80

I'm trying to compile and run the following C program, which has no main() function. I compiled it with the following command:

gcc -nostartfiles nomain.c

The linker gives this warning:

/usr/bin/ld: warning: cannot find entry symbol _start; defaulting to 0000000000400340

OK, no problem. Then I ran the executable (a.out): both printf statements print successfully, and then I get a segmentation fault.

So my question is: why does the segmentation fault occur after the print statements have executed successfully?

My code:

#include <stdio.h>

void nomain()
{
        printf("Hello World...\n");
        printf("Successfully run without main...\n");
}

Output:

Hello World...
Successfully run without main...
Segmentation fault (core dumped)

Note:

Here, the -nostartfiles gcc flag prevents the compiler from using the standard startup files when linking.

msc
  • 38
    I'm surprised this works at all. Frankly, I consider this treatment by the linker to be erroneous (or at least a Bad Thing): there was no entry point, so the linker just hallucinated it from whatever function was handy. Blech. – geometrian Feb 20 '17 at 07:55
  • 4
    @imallett, at least the linker was kind enough to draw attention to it with a warning and to explain what fallback action it was taking! You're right that this might be better as an error rather than just a warning, though. – Toby Speight Feb 20 '17 at 11:59
  • Why would you use no main? – Pieter B Feb 20 '17 at 12:12
  • 4
    @PieterB - Not overly relevant to a discussion about unices, but the entry point for Windows programs isn't necessarily `main`, but `WinMain` or `wWinMain`. – StoryTeller - Unslander Monica Feb 20 '17 at 12:19
  • @StoryTeller actually in both Windows and Linux you can set arbitrary entry point: for Linux's `ld` it would be `-e` option, for Windows' MSVC linker it'd be `/ENTRY` option. – Ruslan Sep 27 '17 at 09:27

2 Answers

132

Let's have a look at the generated assembly of your program:

.LC0:
        .string "Hello World..."
.LC1:
        .string "Successfully run without main..."
nomain:
        push    rbp
        mov     rbp, rsp
        mov     edi, OFFSET FLAT:.LC0
        call    puts
        mov     edi, OFFSET FLAT:.LC1
        call    puts
        nop
        pop     rbp
        ret

Note the ret instruction. Your program's entry point is determined to be nomain, and all is fine with that. But once the function returns, ret pops whatever is on top of the stack and jumps to it as if it were a return address... and nothing ever called nomain, so no return address was put there. The jump lands on an invalid address and a segmentation fault follows.
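For contrast, here is a minimal sketch of what a normally called function has available. It assumes GCC or Clang, since `__builtin_return_address` is a compiler extension, and the function name is made up for illustration. A caller leaves a return address behind for the callee to return to; nothing does that for an entry point.

```c
#include <stddef.h>

/* Hypothetical helper for illustration: reports whether this call was
 * given a return address. noinline keeps it a real call, so there is a
 * real caller to return to. */
__attribute__((noinline))
static int called_with_return_address(void)
{
    /* GCC/Clang extension: the address this function will return to. */
    void *ret = __builtin_return_address(0);
    return ret != NULL;
}
```

Called from ordinary code, this returns 1. An entry point such as nomain has no caller, so there is no comparable address for its ret to use.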

A quick solution would be to call exit() at the end of your program (and assuming C11 we might as well mark the function as _Noreturn):

#include <stdio.h>
#include <stdlib.h>

_Noreturn void nomain(void)
{
    printf("Hello World...\n");
    printf("Successfully run without main...\n");
    exit(0);
}

In fact, now your function behaves pretty much like a regular main function, since after returning from main, the exit function is called with main's return value.
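As a rough sketch of that last point, the snippet below imitates in a child process what the startup code does once main returns. It assumes a POSIX system (fork and waitpid), and `fake_main` and `run_like_startup` are made-up names for illustration, not part of any real startup code.

```c
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical stand-in for a program's main(); returns status 7. */
static int fake_main(void)
{
    return 7;
}

/* Runs entry() in a child process the way startup code treats main():
 * the return value is handed straight to exit(). Returns the exit
 * status seen by the parent, or -1 on failure. */
static int run_like_startup(int (*entry)(void))
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0)
        exit(entry());          /* child: what happens after main returns */

    int status;
    if (waitpid(pid, &status, 0) < 0 || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}
```

Calling `run_like_startup(fake_main)` yields 7: the value returned by the entry function becomes the process exit status, just as it would for main.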

Toby Speight
StoryTeller - Unslander Monica
  • 6
    I think there are some architecture/OS combinations where you can just "return" out of a program; MS-DOS .COM executables? Anyway, we're deep into implementation-specific behaviour. – pjc50 Feb 20 '17 at 10:08
  • 4
    @pjc50 - We are indeed. Although the path in the OP suggested a Unix variant. That coupled with the popularity of certain architectures and instruction sets was the only reason I felt comfortable to present generated assembly in the answer. – StoryTeller - Unslander Monica Feb 20 '17 at 10:14
  • 1
    Just an observation. `-nostartfiles` can also render the C library unusable. Without the C startup code having run, subsequent calls to C library functions may fail unexpectedly. On Linux, if you were to compile with `-nostartfiles` and `-static`, you may discover the program will fault. There are C libraries like MUSL that don't require up-front initialization and are designed to work in this environment. – Michael Petch Jan 03 '18 at 09:35

22

In C, when a function is called, the stack is typically populated in this order (the exact layout depends on the architecture and calling convention):

  1. The arguments,
  2. Return address,
  3. Local variables, --> top of the stack

main() is normally the start point. Without it, the linker falls back to whatever code comes first in the executable as the entry point, which in this case is the function containing the printf calls.

Now the program is sort of truncated, without a return address or __end__, and in fact it assumes that whatever is on the stack at that (__end__) location is the return address. Unfortunately it's not, and hence it crashes.

Milind Deore
  • 4
    Is the order of stack data defined by the C standard? I thought it was up to the system architecture – Délisson Junio Feb 19 '17 at 18:10
  • 1
    That's why I mentioned ELF (Executable and Linkable Format); it is generated by cross-compiling for a specific ARCH type on the required OS. – Milind Deore Feb 19 '17 at 18:25
  • 1
    To be picky, you can use the ELF format even on systems with no stack. One example of such a system is [Freescale RS08](https://en.wikipedia.org/wiki/Freescale_RS08) with the Codewarrior compiler, which generates ELF linker files. – Lundin Oct 11 '17 at 12:47