I've implemented a little bytecode interpreter using computed goto (see here if not familiar).
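For anyone who hasn't seen the technique, here is a stripped-down sketch of the kind of interpreter I mean. The opcode names, the dispatch table, and the run function are made up for illustration, and it relies on the GNU C "labels as values" extension, so it assumes GCC or Clang.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical opcodes, chosen only for this sketch. */
enum { OP_INC, OP_DEC, OP_HALT, OP_COUNT };

/* Fetch the next opcode and jump straight to its handler.
 * The assert guards against opcodes outside the dispatch table. */
#define DISPATCH()                       \
    do {                                 \
        assert(code[pc] < OP_COUNT);     \
        goto *dispatch[code[pc++]];      \
    } while (0)

/* Runs a bytecode program using computed goto and returns the final value. */
int run(const unsigned char *code, int val)
{
    /* Table of handler addresses, indexed by opcode (GNU C extension). */
    static void *const dispatch[OP_COUNT] = { &&op_inc, &&op_dec, &&op_halt };
    size_t pc = 0;

    DISPATCH();

op_inc:
    val++;
    DISPATCH();
op_dec:
    val--;
    DISPATCH();
op_halt:
    return val;
}
```

Calling run on the program { OP_INC, OP_INC, OP_DEC, OP_HALT } with a starting value of 10 returns 11. Each handler ends with its own indirect jump, and that jump is what I'm hoping to eliminate.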
It seems it might be possible to do simple JITting by copying the machine code between labels, thereby eliminating the dispatch jumps. For example, say I've got the following in my interpreter:
op_inc: val++; DISPATCH();
I would change this to:
op_inc: val++;
op_inc_end:
When JITting, I would append the bytes between the two labels to my output buffer:
memcpy(jit_code+offset, &&op_inc, &&op_inc_end - &&op_inc);
(jit_code is marked executable using mmap)
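Roughly, I'd set up the buffer as sketched here. The name make_executable_copy is made up for illustration, and this assumes a POSIX system where anonymous mappings can be switched from writable to executable with mprotect. Some hardened systems forbid that.

```c
#define _DEFAULT_SOURCE /* for MAP_ANONYMOUS on glibc in strict modes */
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>

/* Hypothetical helper: copies `len` bytes of machine code into a fresh
 * anonymous mapping, then makes that mapping read+execute.
 * Returns NULL on failure; the caller releases it with munmap(buf, len). */
void *make_executable_copy(const void *src, size_t len)
{
    if (src == NULL || len == 0)
        return NULL;

    /* Start out writable but not executable. */
    void *buf = mmap(NULL, len, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (buf == MAP_FAILED)
        return NULL;

    memcpy(buf, src, len);

    /* Drop write permission and add execute permission. */
    if (mprotect(buf, len, PROT_READ | PROT_EXEC) != 0) {
        munmap(buf, len);
        return NULL;
    }
    return buf;
}
```

Mapping the region writable first and only then flipping it to executable avoids having memory that is writable and executable at the same time.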
Finally, I would use computed goto to jump to the beginning of the copied machine code:
goto *(void*)jit_code;
Will this work? Is there something missing in my mental model of machine code that would thwart this idea?
Let's assume that code and data share the same address space, and that everything is compiled as position-independent code (PIC).
Update
Looking at the example in the linked article, after removing DISPATCH, we have:
do_inc:
    val++;
do_dec:
    val--;
do_mul2:
    val *= 2;
do_div2:
    val /= 2;
do_add7:
    val += 7;
do_neg:
    val = -val;
do_halt:
    return val;
The generated code for do_inc (no optimization) is simply:
Ltmp0: ## Block address taken
## %bb.1:
movl -20(%rbp), %eax
addl $1, %eax
movl %eax, -20(%rbp)
(followed directly by do_dec). It looks like this little snippet could be clipped out.