Simpler than writing a compiler and VM is to registerize and trampoline your interpreter. Since you have an interpreter and not a compiler (I assume), you only need a couple straightforward transformations to get proper support for tail calls.
You'll have to first write everything in continuation-passing style, which may be weird to think about and do in C/C++. Dan Friedman's ParentheC tutorial steps you through transforming a high-level, recursive program into a form that is machine-translatable to C.
In the end, you'll essentially implement a simple VM where instead of using regular function calls to do eval, applyProc, etc., you pass arguments by setting global variables and then do a goto
to the next argument (or use a top-level loop and program counter)...
return applyProc(rator, rand)
becomes
reg_rator = rator
reg_rand = rand
reg_pc = applyProc
return
That is, all of your functions that normally call each other recursively are reduced to a pseudo-assembly in which they are just blocks of code that don't recur. An top-level loop controls the program:
for(;;) {
switch(reg_pc) {
case EVAL:
eval();
break;
case APPLY_PROC:
applyProc();
break;
...
}
}
Edit: I went through the same process for my hobby Scheme interpreter, written in JavaScript. I took advantage of a lot of anonymous procedures, but it might help as a concrete reference. Look at FoxScheme's commit history starting from 2011-03-13 (30707a0432563ce1632a) up through 2011-03-15 (5dd3b521dac582507086).
Edit^2: Non-tail recursion will still consume memory, even if it's not in the stack.