Assembly long jump performance

Question

In assembly, I have a jump table with about 2000 entries (labels), and each label has about 20-30 lines of assembly instructions. So yes, it's a big switch with a big body of source code. For example:

.TABLE:
     DD .case0
     DD .case1
     DD .case2
     DD .case3
     DD .case4
     ...
     ...
     ...
     DD .case2000

Each case has code like this (about 20-30 instructions per case):

.case0:
    push   ...
    mov    ...
    push   ...
    mov    ...
    push   ...
    mov    ...
    ...

    jmp    [4 * eax + .TABLE]  ;; may go to e.g. 'case1000' (eax holds the next case index)

.case1:
    push   ...
    mov    ...
    push   ...
    mov    ...
    push   ...
    mov    ...
    ...

    jmp    [4 * eax + .TABLE]  ;; may go to e.g. 'case1000' (eax holds the next case index)

.case2:
    push   ...
    mov    ...
    push   ...
    mov    ...
    push   ...
    mov    ...
    ...

    jmp    [4 * eax + .TABLE]  ;; may go to e.g. 'case1000' (eax holds the next case index)
...
...
...

I may need to jump to each label about 10 times, and at one moment I may be at case2 and the next at case1020 (a long jump).

Now my question about these jumps: if I jump from case2 to case1020, is there any performance problem, or is this long jump exactly the same as, for example, jumping from case0 to case100?

My opinion: jmp just changes the instruction pointer, so the distance doesn't matter (no difference in performance compared with a jump of less than 127 bytes), even if we jump over 150 KB of instructions. True?

Also, I know we have short, near, and far jumps, but I think a short jump is only about the instruction's encoded size (a displacement of up to 127 bytes), and there is no difference in performance between jumping 127 bytes and jumping 100000000 bytes.

Is that true?
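
To make the short/near distinction concrete, here is a minimal sketch of the three jump forms involved, assuming NASM syntax and 32-bit mode. The labels `demo`, `.close`, `.distant`, and `.table` are hypothetical and exist only for illustration. It shows encoding sizes only and says nothing about latency:

```nasm
bits 32
section .text
demo:
    jmp short .close            ; EB rel8: 2 bytes, target within -128..+127 bytes
.close:
    jmp near .distant           ; E9 rel32: 5 bytes, any target in the 32-bit address space
    times 200 nop               ; padding so .distant is out of rel8 range
.distant:
    jmp [4 * eax + .table]      ; FF 24 85 disp32: 7 bytes, target address read from memory
.table:
    dd .close                   ; entry 0
    dd .distant                 ; entry 1
```

The indirect form used in the question has the same 7-byte encoding no matter how far away the target is, so the short/near choice only applies to direct jumps.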

The performance (or more precisely, latency) of a jump instruction mostly depends on how well the place the jump goes to as well as the jump table are cached. Usually, there is no need to worry about this. — fuz, Nov 26 '19 at 15:23
In this case, the cases are not related to each other: it may jump to case0, and from case0 jump to case2000! So we can't say anything about where the next case is located. — ELHASKSERVERS, Nov 26 '19 at 15:56
You may find a performance benefit (or handicap) for keeping `.TABLE` in a register instead of using an immediate. As this appears to always be a dynamic branch, the earlier you can compute the effective address relative to the indirect branch, the better. — Erik Eidt, Nov 26 '19 at 16:27
You mean something like this? (`mov ecx, .TABLE` followed by `jmp [4 * eax + ecx]`) — ELHASKSERVERS, Nov 26 '19 at 16:38
Here are some related optimization tips for this sort of code: https://stackoverflow.com/questions/46321531/x86-prefetching-optimizations-computed-goto-threaded-code/46323255#46323255 — Ross Ridge, Nov 26 '19 at 17:57
I started to write an answer, but most of what I would write is in the answer linked by @Ross. One thing missing there is the effect of the TLB. Jumps within a page will hit the TLB, of course, whereas jumps to another page may miss, introducing additional latency. — prl, Nov 26 '19 at 21:17
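
The register-based variant raised in the comments could look roughly like the sketch below, assuming NASM syntax, 32-bit mode, and that `ecx` can be reserved for the table base across every case body (an assumption, since the case bodies are not shown). The labels `dispatch`, `.case0`, `.case1`, and `.table` are hypothetical:

```nasm
bits 32
section .text
dispatch:
    mov  ecx, .table            ; B9 imm32: load the table base once (5 bytes)
    ; ... a case body would run here and leave the next case index in eax ...
    jmp  [ecx + 4 * eax]        ; FF 24 81: 3 bytes, no disp32 in the instruction
.case0:
    ret
.case1:
    ret
.table:
    dd .case0                   ; entry 0
    dd .case1                   ; entry 1
```

This shrinks each dispatch from 7 bytes to 3, at the cost of tying up a register. Whether that helps or hurts would have to be measured, as the comment itself says.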
