Function calls are cheap in C++ compared to most other languages for one main reason: C++ is built around function inlining, whereas Java (for example) makes instance methods virtual by default.
In C++, most of the time you call a function, the compiler does not actually generate a call instruction. Small functions and function templates in particular are very likely to be inlined, in which case the call overhead is simply zero.
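As a rough illustration of the kind of code that tends to get inlined, here is a minimal sketch. The names square, clamp_to and sum_of_squares are made up for this example, and whether a given call is actually inlined depends on the compiler and optimization level, so check the generated assembly if it matters.

```cpp
#include <cassert>

// Small function with a visible body: a typical inlining candidate.
// Note that the inline keyword is about linkage, not a command to inline;
// the optimizer makes the final decision.
inline int square(int x) { return x * x; }

// Function template: its definition has to be visible at the point of
// instantiation, so the compiler always has the body available to inline.
template <typename T>
T clamp_to(T value, T lo, T hi) {
    return value < lo ? lo : (value > hi ? hi : value);
}

int sum_of_squares(int n) {
    int total = 0;
    for (int i = 1; i <= n; ++i) {
        // With optimizations on, this usually becomes a multiply and an
        // add inside the loop, with no call instruction at all.
        total += square(i);
    }
    return total;
}

// Hypothetical self-check, only here to show the functions in use.
void check_inlining_example() {
    assert(sum_of_squares(3) == 14);   // 1 + 4 + 9
    assert(clamp_to(15, 0, 10) == 10);
}
```

Compiling this with optimizations enabled and looking at the assembly for sum_of_squares is an easy way to see whether the call to square survived.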
Even when the function is not inlined, the compiler can make assumptions about what the function does. For example, the Windows x64 calling convention treats RAX, RCX, RDX, R8-R11 and XMM0-XMM5 as volatile (caller-saved): a caller that keeps live values in these registers must generate code at the call site to save and restore them. But if the compiler can see the called function and prove that it does not use some of these registers, that code can be omitted. This optimization is much harder when calling a virtual function, because the callee is not known at compile time.
Sometimes inlining is not possible. Common reasons include the function body not being available at compile time, or the function being too large. In that case the compiler generates a direct call instruction. Because the call target is fixed, the CPU can prefetch the instructions at the target quite well. Direct calls are fast, but there is still some overhead: the caller may need to save some registers on the stack, adjust the stack pointer, and so on.
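To see a direct call in a single file, one option is to forbid inlining explicitly. The sketch below assumes a compiler that supports __declspec(noinline) (MSVC) or __attribute__((noinline)) (GCC, Clang); the macro EXAMPLE_NOINLINE and the function names are invented for this example. In real code the same situation arises naturally when the callee lives in another translation unit and link-time optimization is off.

```cpp
#include <cassert>
#include <cstddef>

// Compiler-specific way to ask for "do not inline this function".
// On other compilers the macro expands to nothing and the call may be inlined.
#if defined(_MSC_VER)
#  define EXAMPLE_NOINLINE __declspec(noinline)
#elif defined(__GNUC__) || defined(__clang__)
#  define EXAMPLE_NOINLINE __attribute__((noinline))
#else
#  define EXAMPLE_NOINLINE
#endif

// Not inlined: each use below should compile to a direct call whose
// target address is fixed at compile or link time.
EXAMPLE_NOINLINE long long sum_array(const int* data, std::size_t count) {
    long long total = 0;
    for (std::size_t i = 0; i < count; ++i) {
        total += data[i];
    }
    return total;
}

// Two direct calls. Around each one the caller has to keep any live
// values out of the registers the callee is allowed to overwrite.
long long sum_two_arrays(const int* a, std::size_t na,
                         const int* b, std::size_t nb) {
    return sum_array(a, na) + sum_array(b, nb);
}

// Hypothetical self-check, only here to show the functions in use.
void check_direct_call_example() {
    const int a[] = {1, 2, 3};
    const int b[] = {10, 20};
    assert(sum_two_arrays(a, 3, b, 2) == 36);
}
```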
Finally, when calling a Java method or a C++ function declared with the virtual keyword, the CPU executes an indirect call instruction (a virtual call). The difference from a direct call is that the target is not fixed, but instead stored in memory. The target function may change during the runtime of the program, which means that the CPU cannot always prefetch the instructions at the target location. Modern CPUs and JIT compilers have various tricks up their sleeves to predict the location of the target function, but it is still generally not as fast as a direct call.
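The difference between the two kinds of call can be sketched as follows. The types Shape, Square and Rect and the functions area_via_base and area_direct are hypothetical, and the comments describe what compilers typically do (most implementations store the targets in a per-class table of function pointers), not what the language guarantees.

```cpp
#include <cassert>

struct Shape {
    virtual ~Shape() = default;
    virtual int area() const = 0;   // target is chosen at run time
};

// final: no class can derive from Square, so a call through a Square
// reference has exactly one possible target.
struct Square final : Shape {
    explicit Square(int s) : side(s) {}
    int area() const override { return side * side; }
    int side;
};

struct Rect : Shape {
    Rect(int w, int h) : width(w), height(h) {}
    int area() const override { return width * height; }
    int width;
    int height;
};

// The dynamic type is unknown here, so this is normally an indirect call:
// the target address is loaded from memory before the call is made.
int area_via_base(const Shape& shape) { return shape.area(); }

// The static type pins down the target, so the compiler is free to emit
// a direct call or to inline the body entirely.
int area_direct(const Square& square) { return square.area(); }

// Hypothetical self-check, only here to show the functions in use.
void check_virtual_call_example() {
    Square sq(3);
    Rect rect(2, 5);
    assert(area_via_base(sq) == 9);
    assert(area_via_base(rect) == 10);
    assert(area_direct(sq) == 9);
}
```

Both functions return the same result for a Square; only the way the call is dispatched differs.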
TL;DR: function calls in C++ are fast because C++ compilers inline aggressively and the language uses direct calls rather than virtual calls by default. Many other languages do not inline as well as C++ does and use virtual functions by default.