1

As a beginner in embedded C programming, I am curious why every program (at least in my experience) starts execution with the main() function. It is as if the linker recognizes main() and puts the address of that "special" function at the address that the reset vector points to.

Russ Schultz
Radoslaw Krasimirow
  • No - the crt initialization needs to run before main(). Just one point - in many environments, main() cannot be called if the stack pointer is not set up first. – Martin James Oct 09 '15 at 19:34
  • Are you talking embedded C or C++? The C++ language has different initialization rules than the C language. Please adjust your tags as appropriate. – Thomas Matthews Oct 09 '15 at 20:36
  • The discussions in this link have somewhat related information - http://stackoverflow.com/questions/3379190/avoiding-the-main-entry-point-in-a-c-program – Karthik Balaguru Oct 10 '15 at 09:14

6 Answers

5

Usually a linker script creates a special section which is mapped to the reset vector and includes a jump/goto instruction to the C startup code, which, in turn, calls the main().

PineForestRanch
  • I'm using code that is mapped to SRAM, not the reset vector. I download the code to SRAM using JTAG and execute the code. No reset vectors involved. – Thomas Matthews Oct 09 '15 at 20:38
  • @ThomasMatthews - Isn't JTAG normally used for debugging or project boards? If the embedded program is stand alone, then it's going to boot from some form of local memory in the device, not through the JTAG, so the reset sequence will be involved. – rcgldr Oct 09 '15 at 20:42
  • @ThomasMatthews. JTAG is a little different. You're taking control of the processor and forcing the State to a known point. Usually this point is emulating as close to the reset point as possible. In cases of processors with bootloaders, debugger will many times create the state just after leaving the bootloader. – Russ Schultz Oct 09 '15 at 20:44
  • Is the program that I am downloading into memory standalone or not? I believe it is; it is not a hosted environment. – Thomas Matthews Oct 09 '15 at 20:44
  • With CPU support you can move the vector table into your code space, and this may be part of Thomas's download script. After that there is no further need for JTAG involvement. Barring cosmic rays flipping bits in the SRAM, you can reboot and go until the cows come home. – user4581301 Oct 09 '15 at 21:01
  • When you debug through JTAG it tries to emulate what the program would do if the debugger wasn't present, so it would go to the reset vector and start there. --- Of course, JTAG gives you a capability to upload your code anywhere and execute it, on some CPUs you can even run the code without uploading. But that's not how the C compiler expects it to run. – PineForestRanch Oct 09 '15 at 21:03
  • @ThomasMatthews "Stand-alone" usually means that the program can run without a debugger attached to it. That is, all code executes from flash not RAM. Not to be confused with the similar sounding "freestanding", which is the C and C++ term for bare metal embedded systems. – Lundin Oct 12 '15 at 07:37
5

C defines different specifications for code that will run in a "hosted" environment and code that will run in a "freestanding" environment. Most programmers will go their whole careers without ever having to deal with a freestanding environment, but most of the exceptions are among those who work with embedded programming, kernel programming, boot loaders, and other software that runs on bare metal.

In a hosted environment, C specifies that program execution starts with a call to main(). That does not preclude preliminary setup performed by the system before that call, but that's outside the scope of the specification. The C compiler and / or linker is responsible for arranging for that to happen; details are implementation dependent.

In a freestanding implementation, on the other hand, the program entry point is determined in a manner chosen by the implementation. There might not be a main() function, and if there is one then its signature does not need to match those permitted to programs run in hosted environments.

John Bollinger
2

It is not the linker; it is the processor that decides. On power-up the instruction pointer is set to a predefined memory address, usually the reset interrupt vector. The linker's part, done at build time, is to place a branch instruction to the startup code at that address.

Eugene Sh.
  • Not always the case. I can download a program using JTAG into memory and execute it. No predefined addresses here. No reset interrupt vectors. – Thomas Matthews Oct 09 '15 at 20:36
  • @ThomasMatthews In this case the PC is set "manually" using the JTAG facility to the start address of the code. And for the code running this way the program usually has to be linked differently than the release version – Eugene Sh. Oct 09 '15 at 20:41
2

The linker links a module for processor and runtime environment initialisation. That module is entered from the reset vector. In the gcc toolchain, the module is normally called crt0.o and is built from the source crt0.s (assembly code). Your toolchain may vary, but some sort of start-up code will be linked, and the source should be available for customisation.

The start-up code will typically perform hardware initialisation such as configuring the PLL for the desired clock speed, and initialising a memory controller if external memory is used. The C runtime initialisation requires the setting of the stack pointer, and the initialisation of global static data, and possibly runtime library initialisation - heap and stdio initialisation for example. For C++ it also invokes the constructors for any global static objects. Finally main() is called.
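
The C runtime part of that sequence can be sketched in plain C. This is only a host-side illustration, not any toolchain's actual crt0: the struct startup_regions type, the run_startup function, the demo buffers and demo_entry are all hypothetical names, and ordinary arrays stand in for flash and RAM. On real hardware the region bounds would normally come from linker-script symbols and the stack pointer would already have been set up.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical description of the regions the start-up code works on.
 * On a real target these bounds usually come from the linker script. */
struct startup_regions {
    const unsigned char *data_load;  /* initial values of .data, kept in flash */
    unsigned char *data_start;       /* run-time home of .data in RAM */
    size_t data_size;
    unsigned char *bss_start;        /* zero-initialised variables */
    size_t bss_size;
};

/* Demo stand-ins for flash and RAM (assumptions for this sketch only). */
const unsigned char flash_image[4] = {1, 2, 3, 4};
unsigned char ram_data[4];
unsigned char ram_bss[4] = {0xAA, 0xAA, 0xAA, 0xAA};  /* "garbage" after power-up */

/* Plays the role of main(): it relies on .data and .bss being ready. */
int demo_entry(void)
{
    return ram_data[0] + ram_data[3] + ram_bss[0];
}

/* Sketch of the C runtime initialisation, followed by the call to the entry function. */
int run_startup(const struct startup_regions *r, int (*entry)(void))
{
    assert(r != NULL && entry != NULL);
    memcpy(r->data_start, r->data_load, r->data_size);  /* initialise .data */
    memset(r->bss_start, 0, r->bss_size);               /* clear .bss */
    return entry();                                     /* finally, "main" */
}
```

Passing the demo regions and demo_entry to run_startup copies the four initial values into ram_data, zeroes ram_bss, and only then runs the entry function, which is the ordering the paragraph above describes.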

Note that it is not the linker specifically that knows about main(); that is simply an unresolved link in the runtime start-up module. If your program did not have a main(), it would fail to link.

You could of course modify the start-up code to use a different symbol other than main(), but main() is defined by the language standard as the entry point.

Some application frameworks or environments may appear to not have a main(); for example in the RTOS VxWorks, applications start at usrAppInit(), but in fact that is simply because main() is defined in the VxWorks library.

The linker locates the start-up code according to either directives in the assembly source, or within the linker script; toolchains may differ.

On ARM Cortex-M devices, the initial stack pointer is defined in the vector table and loaded automatically; as a consequence, these devices can run C code directly from reset (albeit in a somewhat limited environment), which allows much of the runtime environment initialisation to be written in C rather than assembler.
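
The Cortex-M arrangement can be modelled on a host machine as follows. This is a simulation under stated assumptions, not device code: struct vector_table_head, simulate_reset, sim_sp and sim_stack are hypothetical names, and Reset_Handler is used only because it is a conventional name for the reset entry. The model covers just the first two vector-table entries (initial stack pointer, then reset handler address).

```c
#include <assert.h>
#include <stddef.h>

typedef void (*handler_fn)(void);

/* First two entries of a Cortex-M style vector table (modelled, not real). */
struct vector_table_head {
    unsigned char *initial_sp;  /* entry 0: value the hardware loads into SP */
    handler_fn reset;           /* entry 1: address execution starts from */
};

unsigned char sim_stack[256];   /* stands in for the RAM used as stack */
unsigned char *sim_sp;          /* simulated stack pointer register */
int reset_ran;                  /* set by the handler so the call is observable */

/* A plain C function can serve as the reset entry because SP is already valid. */
void Reset_Handler(void)
{
    reset_ran = 1;  /* a real handler would initialise .data/.bss, then call main() */
}

/* The stack grows downwards, so the initial SP is the end of the stack area. */
const struct vector_table_head vectors = {
    sim_stack + sizeof sim_stack,
    Reset_Handler
};

/* Models what the hardware does at reset: load SP, then branch to the handler. */
void simulate_reset(const struct vector_table_head *vt)
{
    assert(vt != NULL && vt->reset != NULL);
    sim_sp = vt->initial_sp;
    vt->reset();
}
```

On a real device the table would be placed at the address the core expects (typically through a section attribute and the linker script), and no function like simulate_reset exists; the hardware performs those two steps itself.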

Clifford
1

Each processor and tool chain is different. Generally, though, they're set up so that the entry point to the run time library (often _start) is reached from the reset vector. The run time library prepares the processor state, clears .bss memory, initializes .data memory, maybe sets up the heap, makes a few callouts to allow customization of the startup, then calls all global constructors (if C++), before finally jumping to main().
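
The "callouts" and constructor calls are commonly implemented as a walk over a table of function pointers that the linker collects. A minimal sketch of that idea, with the table written by hand: init_table, run_init_table, init_clock and init_logger are hypothetical names for illustration, not symbols from any particular run time library.

```c
#include <assert.h>
#include <stddef.h>

typedef void (*init_fn)(void);

int init_log[4];     /* records the order in which the hooks ran */
size_t init_count;

/* Two hypothetical start-up hooks. */
void init_clock(void)  { init_log[init_count++] = 1; }
void init_logger(void) { init_log[init_count++] = 2; }

/* Stand-in for the table a linker would assemble from every object file. */
const init_fn init_table[] = { init_clock, init_logger };

/* Calls every entry in [begin, end) in order, as start-up code would before main(). */
void run_init_table(const init_fn *begin, const init_fn *end)
{
    assert(begin != NULL && begin <= end);
    for (const init_fn *p = begin; p != end; ++p) {
        (*p)();
    }
}
```

Calling run_init_table(init_table, init_table + 2) runs init_clock and then init_logger; start-up code would do this after .data and .bss are ready and before jumping to main().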

It's a mix of hardware requirements, tool chain assumptions, run time library, and system code. You can trim a lot of it out, because the only real requirement for C is that you have a stack. The rest is library code you may or may not use.

Russ Schultz
0

To meet the standard, or at least the expectations of programmers, before main() runs you need .bss cleared, compile-time initialized variables set up (globals with an = something, for example), a C library, and other fun things. So you have a chicken-and-egg problem: how can C code that relies on those assumptions also be the code that fulfils them? It can't. There is other code, commonly assembly but possibly C written with the knowledge that the assumptions do not yet hold, sometimes called bootstrap code. It doesn't matter whether this is an embedded system or an application running on an operating system; there is some glue between the first instructions in that "program" and main(). If you disassemble something GNU tools created, you can see this execution path between a label named _start and main. Other toolchains may name their entry point differently.

In a microcontroller, or any situation where you might be bare metal (the BIOS on a PC, the startup code that launches the RTOS/OS), the bare minimum, if you don't care about some of the requirements/assumptions of C, is loading the stack pointer and branching to main. Zeroing out .bss and copying .data from flash to its proper home in RAM are the next two things you need to get closer to the C language requirements, and you will find those are all the steps you get in some embedded systems.

Probably other processors do this too, but the ARM Cortex-M hardware can load the stack pointer and branch to an address by itself (reset always branches to an address or runs code from some known address). Further, the interrupt system saves state for you, so you don't need to wrap asm around interrupt service routines written in C (or use some compiler-specific declaration that does the same thing), and the interrupt vector table can hold the addresses of C functions directly. That covers the next question you would have needed to ask anyway: 1) reset to C code, 2) interrupts to C code. A nice feature of that product line.

Use the toolchain's disassembler and examine the code from the entry point to main(). Some toolchains, certainly in the past, would make assumptions when they saw main() specifically and add extra code, so sometimes you see some other C function name used as the first C function to avoid the toolchain linking in other stuff.

Clifford hit the nail on the head, though: the linker is simply looking for unresolved symbols, one being main and, with a GNU toolchain, another being _start. It links in stuff it already knows about, or that you have provided on the command line, until all the labels are resolved.

old_timer