I have a question about whether to put data (an address table or other data) in the .text section under its function, or in the .data section. For example, I have a function like this:

extern int i0();
extern int i1();
extern int i2();
extern int i3();
extern int i4();
extern int i5();

void fff(int x) {
    switch (x) {
    case 0:
        i0();
        break;
    case 1:
        i1();
        break;
    case 2:
        i2();
        break;
    case 3:
        i3();
        break;
    case 4:
        i4();
        break;
    case 5:
        i5();
        break;
    }
}

Here is my code in assembly:

fff:
        cmp     edi, 5
        ja      .L10
        mov     edi, edi
        xor     eax, eax
        jmp     [QWORD PTR .L4[0+rdi*8]]
.L4:
        .quad   .L9
        .quad   .L8
        .quad   .L7
        .quad   .L6
        .quad   .L5
        .quad   .L3
.L5:
        jmp     i4
.L3:
        jmp     i5
.L9:
        jmp     i0
.L8:
        jmp     i1
.L7:
        jmp     i2
.L6:
        jmp     i3
.L10:
        ret

Here .L4 holds the jump addresses. Where should I put this .L4 table: under the fff function, or in the .data section? What about static data? For example, if a function uses two QWORDs, must I put them next to that function or in the data section, and why? I know that either placement works, but I want to know which one is better.

Peter Cordes
HelloMachine

2 Answers

The .data section is usually writable, and you would not want your jump table to be accidentally or maliciously overwritten. So .data isn't the best place for it.

.text would be fine; it is normally read-only. It doesn't really matter whether it's near the function or not. Many systems have a .rodata section which is read-only and not executable, which would be even better; it would help catch bugs or attacks which accidentally or deliberately try to execute the bytes of the jump table.
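As a rough C-level illustration of the writable versus read-only distinction (a sketch, not the asker's code): `handler_fn`, both table names, and the helper functions are hypothetical, and `i0`/`i1` are local stand-ins for the question's external functions. Actual section names and placement depend on the toolchain and build flags.

```c
typedef int (*handler_fn)(void);

/* Hypothetical stand-ins for two of the question's external functions. */
static int i0(void) { return 0; }
static int i1(void) { return 1; }

/* Non-const table: usually emitted into a writable section such as .data. */
static handler_fn writable_table[] = { i0, i1 };

/* Const pointers: usually emitted into a read-only section such as
   .rodata (or .data.rel.ro in position-independent builds). */
static handler_fn const readonly_table[] = { i0, i1 };

/* Simulates a stray write: it compiles and silently redirects entry 0. */
void corrupt_writable_table(void) {
    writable_table[0] = i1;
    /* readonly_table[0] = i1;   would be rejected at compile time */
}

/* Callers must pass idx 0 or 1; no bounds check in this sketch. */
int call_writable(unsigned idx) { return writable_table[idx](); }
int call_readonly(unsigned idx) { return readonly_table[idx](); }
```

After `corrupt_writable_table()` runs, `call_writable(0)` dispatches to `i1`, while the const table cannot be modified through ordinary C code at all.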

Nate Eldredge
  • So there is no reason (for example, large cache lines (in fact I don't know about cache lines, I have just heard of them) or ...) not to put a data table in the .text section, near the function, right? Because I always thought that the .text section must not be huge ... – HelloMachine Oct 29 '21 at 18:49
  • Most (all?) x86 CPUs have separate L1 instruction and data caches, so being near the function isn't relevant for L1 cache locality. Conceivably it could help with L2/L3, but that is large enough that if your function is called with any frequency, the jump table should stay hot anyway. – Nate Eldredge Oct 29 '21 at 18:52
  • @HelloMachine You don't want to put writable data next to functions as that can cause caching issues. Read-only data is fine, but you get better cache utilisation if you put all data next to each other (e.g. into a dedicated `.rodata` section) instead of mixing data and code. – fuz Oct 30 '21 at 09:06
  • @HelloMachine: [Why do Compilers put data inside .text(code) section of the PE and ELF files and how does the CPU distinguish between data and code?](https://stackoverflow.com/q/55607052) debunks that false premise; only obfuscators mix code and data. With separate L1d / L1i caches, and split iTLB/dTLB, it wastes space / coverage in caches. In the cold case it might get your data into L2 along with code fetch, speeding up the eventual L1d miss, but that sacrifices a whole page of dTLB coverage for those few bytes. – Peter Cordes Oct 30 '21 at 09:54
  • @HelloMachine: If you're optimizing, instead of `.L5: jmp i4`, just put `i4` as the jmp table entry instead of `.L5`, so you're directly tailcalling. – Peter Cordes Oct 30 '21 at 09:54
  • @HelloMachine: If you *were* going to mix data and code, you'd still want to put the data after the `ret` in this function so it's out-of-line. The default target address prediction for an indirect jmp (when no dynamic prediction can be found in the BTB) is `+0`, the next instruction, so you want to make that the most likely case, not data! (Of course in your case, all the targets should be other functions, not jmp instructions to other functions, so you could put a `ud2` there to stop speculative exec along that useless path, or just the `ret`.) – Peter Cordes Oct 30 '21 at 10:00
  • What if the data and code are in the same cache line? When the data is loaded, presumably the line is already in the L1 icache. Does the line need to be duplicated in L1d (and if so, is it transferred directly from the L1 icache or from the shared L2?), or can the data be read from the icache directly? – Noah Nov 01 '21 at 22:41

Yes, you can put the table of pointers (.L4:) in the .text section (if it won't be modified at run time), but I don't see a reason for the double indirection through a set of jumps to the external functions i0..i5. You can branch with an indirect near jump that takes the destination address from a table of pointers to those external functions. The linker fills in the external addresses. Example in NASM/Intel syntax:

|                            |     global fff
|                            |     extern i0,i1,i2,i3,i4,i5
|00000000:4883FF05           |fff: cmp rdi, 5
|00000004:773A               |     ja  .L10
|00000006:FF24FD[10000000]   |     jmp [.L4+8*rdi]
|0000000D:0F1F00             |     align 8  ; For better performance.
|00000010:[0000000000000000] |.L4: dq i0
|00000018:[0000000000000000] |     dq i1
|00000020:[0000000000000000] |     dq i2
|00000028:[0000000000000000] |     dq i3
|00000030:[0000000000000000] |     dq i4
|00000038:[0000000000000000] |     dq i5
|00000040:C3                 |.L10:ret
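For comparison, the same single-indirection dispatch can be sketched in C. This is only an illustration under assumptions: the `i0`..`i5` bodies, `last_called`, `handler_fn`, and `table` are hypothetical names introduced here, with local stand-ins replacing the external functions so the sketch is self-contained.

```c
/* Hypothetical stand-ins for the external i0..i5; each records its index. */
static int last_called = -1;
static int i0(void) { last_called = 0; return 0; }
static int i1(void) { last_called = 1; return 1; }
static int i2(void) { last_called = 2; return 2; }
static int i3(void) { last_called = 3; return 3; }
static int i4(void) { last_called = 4; return 4; }
static int i5(void) { last_called = 5; return 5; }

typedef int (*handler_fn)(void);

/* The table holds the final destinations directly, not local labels
   that jump onward, so dispatch needs only one indirect branch. */
static handler_fn const table[6] = { i0, i1, i2, i3, i4, i5 };

void fff(int x) {
    /* One unsigned comparison rejects both x < 0 and x > 5,
       playing the role of the cmp/ja pair in the listing. */
    if ((unsigned)x > 5u)
        return;
    table[x]();  /* a compiler may emit this as an indirect tail jump */
}
```

Calling `fff(4)` invokes `i4`, while `fff(-1)` and `fff(6)` return without calling anything.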
vitsoft
  • Thank you ...... – HelloMachine Oct 29 '21 at 18:59
  • Your jump table solution doesn't work, because `jmp [.L4+rdi+4*rdi]` is an indirect jump through memory, so it treats the jmp instructions as the destination addresses. Even if it did work, it still has two jmp instructions. Instead, use a jump table like in the question, but put the actual destination in the table instead of a local label that jumps to the destination. In other words, change `jmp` in the jump table to `.quad`. – prl Oct 30 '21 at 03:00
  • @prl Oops, that's true, thank you, fixed. – vitsoft Oct 30 '21 at 07:52