A function call generally has some overhead before any of the function's own code runs. The code generated for a call basically ensures that you'll find everything as you left it when you, well, return, while at the same time giving you a clean, empty environment inside the called function. In fact, this convenience is one of the most crucial services C provides, next to the standard library. (In many other respects C is a mere macro assembler. Did you ever look at a C source and the generated assembly side by side?)
In particular, a few registers usually must be saved, and parameters may have to be copied onto the call stack. The effort required depends on the processor, the compiler and the calling convention. For example, parameters and return values may be passed in registers rather than on the stack (but then the parameters must be saved anyway for each recursive call, mustn't they?).
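To make the point about saving parameters concrete, here is a minimal sketch. The function `sum_to` and its depth limit are made up for illustration. Because `n` is still needed after the recursive call returns, its value has to live somewhere that survives the call, even if it arrived in a register.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical example: the sum 1 + 2 + ... + n, written recursively. */
uint64_t sum_to(uint32_t n)
{
    /* Arbitrary bound chosen for this sketch to keep the recursion shallow. */
    assert(n <= 10000);

    if (n == 0)
        return 0;

    /* n is used after the recursive call returns, so its value must
       survive the call: in a stack slot, or in a register that is itself
       saved and restored across calls. */
    return n + sum_to(n - 1);
}
```

An optimizing compiler may well rewrite something this simple into a loop, in which case nothing needs to be saved at all. Looking at the generated assembly for your own compiler and flags is the only way to be sure.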
The overhead is relatively large if the function is small; that's why inlining can be powerful. Inlining recursive function calls is similar to loop unrolling. I don't know whether current compilers do that on a regular basis (they might). But it's risky to rely on the compiler, so I would avoid recursive implementations of trivial functions, like computing the factorial, if speed is important.
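As a rough sketch of the factorial case, here are both variants side by side. The function names and the `MAX_FACTORIAL_ARG` guard are my own choices for illustration. The recursive version makes one call per step unless the compiler optimizes it away, while the loop makes none.

```c
#include <assert.h>
#include <stdint.h>

/* 20! is the largest factorial that fits in a uint64_t. */
#define MAX_FACTORIAL_ARG 20u

/* Recursive version: each step is a function call, with whatever
   call overhead the compiler does not manage to remove. */
uint64_t factorial_recursive(unsigned n)
{
    assert(n <= MAX_FACTORIAL_ARG);
    return n <= 1 ? 1 : n * factorial_recursive(n - 1);
}

/* Iterative version: the same computation as a plain loop, no calls. */
uint64_t factorial_iterative(unsigned n)
{
    assert(n <= MAX_FACTORIAL_ARG);
    uint64_t result = 1;
    for (unsigned i = 2; i <= n; ++i)
        result *= i;
    return result;
}
```

Both return the same values. Whether there is a measurable speed difference depends on the compiler and optimization level, so measure before drawing conclusions.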