When you are typing commands, you are using a program called a command-line shell. This is a program that reads from input, analyzes the text it receives, and executes the commands in the text.
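As a rough illustration of the "analyzes the text" step, here is a minimal sketch that splits a command line into words on whitespace. The name `split_words` is made up for this example, and real shells do far more (quoting, variable expansion, globbing, pipes, redirection), so treat it as a simplification rather than how any particular shell works.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: split a command line into whitespace-separated words.
// A real shell also handles quotes, escapes, expansions, and operators.
std::vector<std::string> split_words(const std::string& line) {
    std::istringstream in(line);
    std::vector<std::string> words;
    std::string word;
    while (in >> word) {  // operator>> skips runs of spaces and tabs
        words.push_back(word);
    }
    return words;
}
```

For the input "g++ abc.cpp", this yields the two words "g++" and "abc.cpp"; the first is treated as the command name and the rest as its arguments.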
For g++ abc.cpp, the shell looks up g++ (by searching the directories listed in the PATH environment variable) and finds it is the name of an executable file, either directly or because it is a link to an actual file. It then executes that file. This is a fairly complicated process that includes creating a child process that loads the executable file into memory and then executes it. (Note: Some executable files are shell scripts rather than binary executables containing machine instructions. Shell scripts are executed by loading the shell program and telling it to execute the script.)
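The lookup step can be sketched roughly as follows. This is a simplified illustration, not what any specific shell does: `find_in_path` is a hypothetical name, it assumes the command contains no slash, and it ignores builtins, aliases, functions, and the lookup caches that real shells keep.

```cpp
#include <cstdlib>
#include <sstream>
#include <string>
#include <sys/stat.h>
#include <unistd.h>

// Hypothetical helper: return the full path of the first executable regular
// file called `name` found in the PATH directories, or "" if there is none.
std::string find_in_path(const std::string& name) {
    const char* path = std::getenv("PATH");
    if (path == nullptr) return "";

    std::istringstream dirs(path);
    std::string dir;
    while (std::getline(dirs, dir, ':')) {
        if (dir.empty()) dir = ".";  // an empty entry means the current directory
        const std::string candidate = dir + "/" + name;
        struct stat info;
        // stat() follows symbolic links, so a link to an executable also matches.
        if (stat(candidate.c_str(), &info) == 0 && S_ISREG(info.st_mode) &&
            access(candidate.c_str(), X_OK) == 0) {
            return candidate;
        }
    }
    return "";
}
```

Calling `find_in_path("g++")` on a machine with GCC installed would typically return something like a path under /usr/bin, but the exact result depends on the system and on PATH.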
The g++ program then analyzes the arguments it was given. In the case of g++ abc.cpp, it will open “abc.cpp” and compile it. g++ is a program like any program you can write: It reads files, performs computations, and writes files.
g++ may be a single executable that does many things, but it likely performs much of its job by creating additional subprocesses to execute other programs. There may be a separate program to do the actual compiling of the code and another to link the code into an executable. (There can also be separate programs for preprocessing and optimization, as well as for compiling the code into an intermediate language, then for generating assembly language, then for assembling the assembly language, but these may also be integrated into one program.)
There are many interactions with the system in this process. Opening files, and reading from and writing to them, requires system calls. Your question seems largely to ask about executing programs.
Roughly, on Unix and similar systems, the steps involved in one program causing another to be executed are:
- The program calls fork. This is a system call that creates a duplicate of the process. When it is done, there are two copies of the same process. One is called the parent and one is called the child. The system tells each process whether it is the parent or the child by the return value of the fork call: zero in the child, and the child's process ID in the parent.
- The program examines the fork return value to see whether it is the parent or the child. If it is the child, it calls a routine in the exec family to execute another program.
- The exec call opens the file containing the program to be executed and reads it into memory. This involves interpreting the contents of the executable file, because the executable file is not just raw data. It contains a variety of structures that describe different things to be put into memory when preparing to run the program.
- Much of the work of the exec call can be done in ordinary ways: Opening a file, reading its contents, analyzing its contents, and arranging things in memory. Additionally, executable files may use shared libraries, and loading those will require opening more files and loading them. However, the exec call will be assisted to some degree by system calls that change memory mappings for the process and perform other tasks.
- Ultimately, when the program to be executed is sufficiently loaded into memory, the software that is loading it will transfer control to its start address, and then the process is running the new program.
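The steps above can be sketched as a small function. This is a minimal illustration under some assumptions, not how any particular shell or compiler driver is written: `run_program` is a hypothetical name, I chose `execvp` from the exec family because it searches PATH, and the `waitpid` call (the parent waiting for the child to finish) is an addition that the list above does not cover.

```cpp
#include <string>
#include <vector>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical helper: run a program and return its exit status, or -1 if it
// could not be started or did not exit normally. args[0] is the program name.
int run_program(const std::vector<std::string>& args) {
    if (args.empty()) return -1;

    // execvp expects a null-terminated array of char*; build it before forking.
    std::vector<char*> argv;
    for (const std::string& arg : args) {
        argv.push_back(const_cast<char*>(arg.c_str()));
    }
    argv.push_back(nullptr);

    const pid_t pid = fork();
    if (pid < 0) return -1;  // fork failed; no child was created

    if (pid == 0) {
        // Child: replace this process image with the requested program.
        execvp(argv[0], argv.data());
        _exit(127);  // reached only if execvp failed
    }

    // Parent: fork returned the child's process ID; wait for it to finish.
    int status = 0;
    if (waitpid(pid, &status, 0) < 0) return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

With this, something like `run_program({"g++", "abc.cpp"})` would start the compiler as a child process and report its exit status. The value 127 for a failed exec follows the convention shells use for "command not found"; it is a choice made for this sketch.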
(I have probably given short shrift to the exec and loading processes, and possibly other issues touched on above.)