0

I was thinking about implementing my own exit() function, for educational purposes only. I know you can manipulate addresses if the OS lets you (for example, the OS won't let you manipulate address 0; doing so would cause a crash).
So I thought: why not send 0 to the address that return 0 returns to?

int main(){
// code...
return 0;
}

The return 0 returns a 'success' to the OS, right? But which address is it? How do I get it? And is the actual exit() from the C standard library implemented this way?

Natan Streppel
Davlog
  • Your question doesn't really make a lot of sense. You should try clarifying. – Tony The Lion Aug 06 '13 at 17:44
  • 4
    I think you have some misconceptions about these "addresses the OS lets you manipulate" – Borgleader Aug 06 '13 at 17:44
  • `return 0;` returns `0` from the `main` function. The C++ standard doesn't specify what address to jump to when returning from functions, but it's typically stored on the call stack or in a register. Some crt1 implementations use `exit(main(argc, argv, envp))`, so if you configure the linker correctly you could override the `exit` function. –  Aug 06 '13 at 17:44
  • 3
    The argument to `return` is not an address. – Fred Larson Aug 06 '13 at 17:44
  • 2
    There's a lot of stuff going on in between when you run your program and when `main` gets called, and then when `main` exits and when the OS gets the return status. I think this is perhaps a larger problem than you think it is. – JoshG79 Aug 06 '13 at 17:44
  • @JoshG79 you are so right. +`0` means whatever you want it to mean. Finally, the answer to your question is, as far as I can understand it, OS and compiler dependent. – ixe013 Aug 06 '13 at 18:11

8 Answers

4

When you return 0, you do not return to an address; you are returning the value 0. When a process returns the value 0, it is considered to have terminated normally. You can return a non-zero value (up to 255 on POSIX systems) that the calling process may interpret as a message.

Let us look at this with an example command, grep foobar fubar. It will return 0 (success) if the pattern foobar occurs in the file fubar. It will return 1 if there is no foobar in the file fubar. It will return 2 when there is no file named fubar. The return value can be interpreted by the script that runs this command, to determine success or the reason for failure.
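
A minimal sketch of how a calling process might read such a status, assuming a POSIX system. The helper name exit_status_of is hypothetical, and the command is run through /bin/sh purely for illustration:

```cpp
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical helper: runs `command` through /bin/sh and returns the exit
// status the parent observes, or -1 if the child could not be started or
// did not terminate normally.
int exit_status_of(const char* command) {
    pid_t pid = fork();
    if (pid < 0) return -1;  // fork failed
    if (pid == 0) {
        // Child: replace this process image with the shell running the command.
        execl("/bin/sh", "sh", "-c", command, static_cast<char*>(nullptr));
        _exit(127);  // only reached if execl itself failed
    }
    // Parent: wait for the child and decode the status word.
    int status = 0;
    if (waitpid(pid, &status, 0) < 0) return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Calling exit_status_of("grep foobar fubar") from a program's main would hand back the 0, 1 or 2 described above, which the caller is then free to interpret however it likes.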

unxnut
  • Well somehow the OS gets the value 0 from the program, right? Where does this value go to? – Davlog Aug 06 '13 at 17:47
  • 5
    @Davlog There is system/implementation specific code that gets executed before and after main is called. – Captain Obvlious Aug 06 '13 at 17:50
  • It is not an address. The OS sets up the program/thread in a process; when the program closes, the last thing on the program stack is the return value, and the OS gets the return value from that. If you had a specific address for this, you would have a lot of problems. – Lefsler Aug 06 '13 at 17:52
  • 1
    You do return to an address. Not the address `0`, of course, but an address on the stack, that was pushed when the start-up code called `main`. – James Kanze Aug 06 '13 at 17:55
  • @James Kanze I expressed myself the wrong way. I was trying to say that it's not one specific address (it is not as if all programs return their values using, for example, the address 0x00011). Each program has its own address to do so. – Lefsler Aug 06 '13 at 17:57
  • @demonofnight so if I just change the value of that address it will still run? – Davlog Aug 06 '13 at 17:58
  • 2
    @Davlog there is no "address" you are returning a value. – Borgleader Aug 06 '13 at 18:00
  • @Borgleader Demonofnight said "Each program have their own address to do so." And every address holds a value, right? – Davlog Aug 06 '13 at 18:02
  • @Davlog, you must forget the "address" idea. In C/C++ a bunch of stuff happens when you start a program, so let C/C++ handle that for you. If you want to take a better look at that, dump your C/C++ code to assembler and look at the _start section (if I recall correctly). – Lefsler Aug 06 '13 at 18:04
  • @Davlog The return uses a specific register, registers have pointers... so yes, it is a pointer. BUT C/C++ handles that; you have no way of knowing what the pointer is. – Lefsler Aug 06 '13 at 18:06
4

The exit code is (eventually) stored into the Process Control Block so that the OS can report the result value to other processes.

See http://www.cs.auckland.ac.nz/compsci340s2c/lectures/lecture06.pdf

However, the return statement isn't what does this. Your runtime library actually calls main more-or-less like a normal function, gets the return value (on Intel, a return value of type int would be stored in the EAX register), and then requests that the kernel write it to the PCB. exit() also invokes the kernel to write this member of the PCB.

Ben Voigt
  • `exit` and returning from `main` do a lot more than that. (And of course, they don't write anything directly to the PCB, supposing that the OS has one; they call an OS specific primitive with the return code which takes care of all of this.) – James Kanze Aug 06 '13 at 18:03
  • @James: Clarified that the actual write is done by the kernel, at the request of the process. – Ben Voigt Aug 06 '13 at 18:04
  • Yes. The critical part is that at some point, `exit` will invoke a kernel level primitive to tell it to terminate the process; this kernel level primitive will update the PCB (or whatever the kernel coders decided to call it), with the return code (which is passed as an argument to this kernel primitive), but also with information which will prevent the process from being scheduled; this primitive will also reclaim all of the resources of the process, and possibly record some accounting information. – James Kanze Aug 06 '13 at 18:15
  • @James: I guess all the resources *except* the PCB, which exists as a zombie process until `wait` reaps it. Closing handles is a very important part of `exit` of course. – Ben Voigt Aug 06 '13 at 18:24
  • 1
    Pretty much. `exit` flushes and closes the high level file structures, but if anything is forgotten, the system will close the low level file descriptors as part of cleaning up after the process. (I implemented a C runtime, many, many years ago, and `exit` is anything but trivial.) – James Kanze Aug 06 '13 at 18:29
2

The return 0; in main works like a return anywhere; it returns to the place from which it was called. When you start a program, the system does not start it at main, but at some start-up address which does a lot of initialization, and then does something like:

exit( main(/*...*/) );

In other words, exit does not simulate a return from main; returning from main calls exit. And exit then does a lot of shutting down, before calling some system specific function which tells the system to stop the process (_exit under Unix).
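
As a rough sketch of that relationship, assuming a POSIX system: user_main and fake_start below are hypothetical stand-ins for main and for the start-up code, and the whole thing is run in a child process so that the status can be observed from outside. Real start-up code is implementation specific and does far more than this.

```cpp
#include <cstdio>
#include <cstdlib>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical stand-in for the program's main().
int user_main() { return 3; }

// Hypothetical stand-in for the start-up code. A real run-time would first
// initialise itself (stdio, static objects, ...) before reaching this call.
[[noreturn]] void fake_start() {
    std::exit(user_main());  // returning from "main" ends up calling exit
}

// Runs fake_start() in a child process and returns the status the parent
// observes, or -1 on failure.
int status_seen_by_parent() {
    std::fflush(nullptr);  // avoid duplicating buffered output in the child
    pid_t pid = fork();
    if (pid < 0) return -1;
    if (pid == 0) fake_start();  // child: never returns
    int status = 0;
    if (waitpid(pid, &status, 0) < 0) return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Here the value returned by user_main is what status_seen_by_parent() reports, even though user_main itself never talks to the system.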

You cannot implement exit yourself, because you have no way of finding the information it needs: the list of functions registered with atexit which need to be called, the list of destructors of objects with static lifetime, etc.

James Kanze
  • «You cannot implement exit yourself» — FALSE! If somebody else did it then you can certainly do it yourself, too. After all, you can even intercept call to `exit()` and supply your own implementation. –  Aug 06 '13 at 18:05
  • @VladLazarenko You cannot implement `exit` yourself, for the reasons I enumerated. `exit` does a lot more than just return to the system; it must clean up. And the lists of things it has to do are _not_ global symbols which you can access, and do not have externally defined structures so that you know what to do about them. `exit` works with the rest of the C run time library, and to implement `exit`, you need the sources of the C run time library. – James Kanze Aug 06 '13 at 18:11
  • I think it's important to point out that `atexit` is not related to the standard library but is implemented in the C++ ABI library of choice; in other words, it's a symbol that is not guaranteed to be there, and it's platform specific. – user2485710 Aug 06 '13 at 18:16
  • Hell I can! Why not? Who do you think implemented an `exit()` function that you call, the Gods? As for the return statement in main(), see §5.1.2.2.3 Program Termination. Since C99, the return value in that case is 0. –  Aug 06 '13 at 18:19
  • @user2485710 `atexit()` is defined in §7.22.4.2 of the C standard, and by reference in §18.5 of the C++ standard. It is _not_ platform specific. (How it is implemented is, of course.) – James Kanze Aug 06 '13 at 18:24
  • @VladLazarenko "Who do you think implemented an exit() function that you call": the same people who implemented the C run time. The function is integrated into the C run time, and needs to access data structures and variables which are defined there. If you have the sources of the C runtime, and know how to link them (because you can't link them like a normal program), then you can implement `exit`. Otherwise, you cannot. – James Kanze Aug 06 '13 at 18:27
  • @JamesKanze on my Linux desktop it is implemented in my libsupc++ library and not in libstdc++; it's about the ABI, not about the standard library. – user2485710 Aug 06 '13 at 18:29
  • So at the end of the day it turns out that I can. That's the whole point, why say «You cannot implement exit yourself». If that was true, we'd have no exit() function :) –  Aug 06 '13 at 18:30
  • @user2485710 It's specified in the C and the C++ standards as part of the standard library. Under "normal" Unix, it's implemented in `libc.so`, but that's irrelevant. It's defined by the language. – James Kanze Aug 06 '13 at 18:31
  • @VladLazarenko I said "_YOU_ cannot implement `exit` yourself", because _you_ can't. The people implementing the C run time, of course, are in a different context, where they can. – James Kanze Aug 06 '13 at 18:32
  • How do you know I do not implement a C run time? Maybe at nights when the moon is full I cannot sleep and keep crafting my own C runtime on top of my own «Big Mess of Wires» CPU? –  Aug 06 '13 at 18:39
  • @Vlad: I think the point James is trying to make is "You cannot have an application-specific `exit` implementation while still using the rest of the library, `exit` and a lot of the other runtime support must be implemented together." – Ben Voigt Aug 06 '13 at 19:37
  • @VladLazarenko Because if you did, you wouldn't ask such a question. – James Kanze Aug 07 '13 at 08:03
  • @BenVoigt Exactly. Functions like these are part of a whole, and cannot be implemented separately. In a similar manner, things like `std::type_info` cannot be implemented independently of the compiler. Unlike something like `std::string`, which is (or can be) fairly independent of the rest. – James Kanze Aug 07 '13 at 08:05
2

I think the main confusion here is the notion that main is the first and last thing that happens in a C++ program. Whilst it is [1] the first part of YOUR program, there is usually some code in the application that sets up a few things, parses command line arguments, opens and initializes standard I/O (cin, cout, etc.) and does other such things, all of which happens BEFORE main is called. And main is essentially just another function, called by the C++ runtime code that does that "fix things up before main" work.

So, when main returns, it goes back to the code that called it, which then cleans up the things that need cleaning up (closing standard I/O channels, and many other such things), before actually finishing up by calling some OS function to "terminate this process". Part of this "terminate this process" functionality is (in most OSes) a way to signal "success or failure" to the OS, so that some other process monitoring the application can determine "if all is well or not". This is where, eventually, the 0 (or 1 if you use return 1; in main) ends up.

[1] If there are static objects with constructors that are part of the user's code, then these constructors will run before any code in main [or at least, before any code in main that belongs to the user's application] is executed.
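
A small sketch of footnote [1], assuming a typical implementation that runs static constructors before main. The names event_log, Tracer and global_tracer are made up for the illustration:

```cpp
#include <string>
#include <vector>

// Function-local static, so the log is guaranteed to exist the first time
// any other static object touches it.
std::vector<std::string>& event_log() {
    static std::vector<std::string> log;
    return log;
}

// Hypothetical type whose constructor records that it ran.
struct Tracer {
    Tracer() { event_log().push_back("static constructor"); }
};

// Constructed by the start-up code, before the first statement of main.
Tracer global_tracer;
```

If main begins by inspecting event_log(), it should already find the "static constructor" entry there, recorded before any user code in main ran.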

Mats Petersson
1

Your confusion is because of not understanding what return does. Take this function for example:

int add(int x, int y)
{
   return (x + y);
}

The return in the above function and the return statement at the end of your main function are exactly the same; from a language standpoint they mean the same thing: return an integer to the caller. What the caller makes of this value is another matter entirely, and depends on the caller's intention in calling said function. Say I can call add(7, 9); to add two GPA grades while another programmer might call it to find the sum of all the money in a couple of bank accounts.

Now main is treated as a special function since it is the entry point of your program: the operating system's loader hands control to start-up code, which calls main to begin your program. After your program completes, whatever main returns might mean anything based on the OS's semantics. This value has nothing to do with any memory address.

Aside: According to the standard, in C++ (and C99 onwards) the return 0; statement at the end of main can be omitted to mean a successful termination of the program.

legends2k
  • That depends on which version of C, doesn't it? – Ben Voigt Aug 06 '13 at 18:01
  • @BenVoigt: You mean the omission of `return 0;` at the end of main? – legends2k Aug 06 '13 at 18:05
  • You can safely omit `main`'s `return 0;` in C as well. In C99. –  Aug 06 '13 at 18:05
  • 1
    The standard requires that an implementation recognize `0` as success, as well as `EXIT_SUCCESS`. On systems such as the old DEC OS's, where odd numbers signaled success, the C run time had to remap `0` to an odd number. – James Kanze Aug 06 '13 at 18:06
  • 1
    @legends2k: Yes, that's the only reference to C in your answer and it is what I was commenting about. I found this rule "reaching the `}` that terminates the `main` function returns a value of 0." – Ben Voigt Aug 06 '13 at 18:06
  • @VladLazarenko Where does it say that in the C standard? I can't find it. – James Kanze Aug 06 '13 at 18:08
  • @JamesKanze: Thanks, removed `EXIT_SUCCESS` detail from the answer. Saw the same information [here](http://stackoverflow.com/questions/1188335/why-default-return-value-of-main-is-0-and-not-exit-success) too – legends2k Aug 06 '13 at 18:08
  • @BenVoigt: I didn't know from C99 onwards one can omit `return 0;`. I saw it written [here](http://stackoverflow.com/a/8868139/183120). – legends2k Aug 06 '13 at 18:15
  • @legends2k That's not a confirmation. It just says that someone else thinks so too (although the person in question has a tendency to be right about such things). The confirmation is in §5.1.2.2.3 of C11. (The statement may have been present in earlier versions as well.) – James Kanze Aug 06 '13 at 18:20
  • @JamesKanze: Alright, I confirmed it now from [the publicly available C11 draft standard (N1570)](http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1570.pdf) :) And yes, the same section states it. – legends2k Aug 06 '13 at 18:22
0

If my understanding is correct, a SIGCHLD signal carrying the return value is sent to the main shell on exit... This should happen when the PCB is destroyed by the kernel...

But if you want to hook in certain functionality while exiting from the code, you can register a handler with atexit(), as per the POSIX implementation.
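
A minimal sketch of registering such handlers, assuming a POSIX system. The helper atexit_order and the two handler names are hypothetical; the handlers run in a child process and report back through a pipe so that the order in which exit() calls them is visible to the parent:

```cpp
#include <cstdio>
#include <cstdlib>
#include <string>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

namespace {
int g_report_fd = -1;  // write end of the pipe, used by the handlers

// Sends one character to the parent process.
void report(char c) {
    ssize_t n = write(g_report_fd, &c, 1);
    (void)n;  // nothing useful can be done about a failure while exiting
}

void first_registered()  { report('A'); }
void second_registered() { report('B'); }
}  // namespace

// Hypothetical helper: returns the handler letters in the order exit()
// called them in the child, or an empty string on failure.
std::string atexit_order() {
    int fds[2];
    if (pipe(fds) != 0) return "";
    std::fflush(nullptr);  // avoid duplicating buffered output in the child
    pid_t pid = fork();
    if (pid < 0) { close(fds[0]); close(fds[1]); return ""; }
    if (pid == 0) {
        close(fds[0]);
        g_report_fd = fds[1];
        std::atexit(first_registered);
        std::atexit(second_registered);
        std::exit(0);  // runs the registered handlers, then terminates
    }
    close(fds[1]);
    std::string order;
    char c;
    while (read(fds[0], &c, 1) == 1) order += c;  // read until the child exits
    close(fds[0]);
    waitpid(pid, nullptr, 0);
    return order;
}
```

The result is "BA": handlers registered with atexit are called in the reverse order of their registration.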

I don't think you can modify how the return value propagates at user level, since control of the program passes to the PC in another process (to which you don't have access).

Anerudhan Gopal
0

If your C++ ABI library implements the symbol __cxa_atexit, you can use atexit.

AFAIK the language doesn't really offer other safe ways to do something that is user-defined when a program stops the execution.

user2485710
0

When you have a function, it has a return type. It can be int, void, or something else. If the return type is not void, then the function has to return a value. In our case, the return type of main is int, and the value returned is usually a return code. The convention is that if it is 0, then there was no error, while other values are error codes.

Lajos Arpad