I'm trying to measure the performance impact of replacing key sections of a program's C code with assembly. The resulting execution time is about 40% of the original (roughly a 2.5x speedup) when run on gem5's TimingSimpleCPU in SE mode.
So I'm wondering whether a speedup of this size is to be expected, or whether it could be caused by the choice of CPU model or by some other factor, such as the compiler's optimization level.
I know the question is a bit broad, but I just need a general idea of where to start looking for an explanation.