Note that I have already read the questions
Why not mark everything inline?
and
Can a recursive function be inline?
yet I still feel there is an unresolved edge case of interest here. Assume a language has the following:
1. pure first-class functions whose parameters are treated as constants (no write-back)
2. anonymous inline functions (lambda abstraction)
3. callbacks, continuations, coroutines, and fibers as patterns
4. tail-call optimisation
5. macro expansion
then why bother allocating activation records at runtime when you could perform an inline expansion of all (mutually) recursive function definitions at compile time? This would seem to reduce the calling overhead to zero. It would also open up opportunities for parallel 'simplification' of the expressions abstracted behind each function, using standard computer-algebra techniques in which variables can remain unknowns during symbolic reduction - including evaluating multiple branches of a conditional speculatively and then throwing away all but the valid result (something that the von Neumann bottleneck of a stack-based approach makes difficult).
I appreciate that it would not be an optimisation to expand in place every recursive function invocation, since the resulting code bloat would prevent the CPU's cache from being used effectively. However, I am only interested in using this for the features mentioned in 3., which I feel is pragmatic as those patterns are of limited depth.