
I've been working on an extra credit problem I came up with, but I'm having trouble figuring out how to access a 2D int array correctly in x86 assembly.

I know a "2d array" is really a 1d array in C with offsets based on the [i][j] indexes, but I can't figure out a correct way of accessing them.

So basically if I have the base pointer of the array stored in eax, what would be the correct way to offset the register address to access the array at say, [3][2] ? An equation or assembly instructions would do.

too honest for this site
Riker
  • Wait, are you looking for `array[row + col*max_row]`? If so, what is wrong with just using `array[row][col]`? See [here](http://stackoverflow.com/questions/2565039/how-are-multi-dimensional-arrays-formatted-in-memory). – Fantastic Mr Fox Dec 14 '15 at 19:26
  • I'm looking for how to correctly access them in memory, given that they're 16-bit integers. – Riker Dec 14 '15 at 19:29
  • @Ben: Because he wishes to code it in assembly? – Ctx Dec 14 '15 at 19:29
  • so there would be an offset based on their bit size as well since I'm trying to get to them through some command in x86 assembly – Riker Dec 14 '15 at 19:31
  • "I know a "2d array" is really a 1d array in C" You know wrong. A 2D array in C is a 2D array. The same for a 5D array, etc. – too honest for this site Dec 14 '15 at 19:54
  • No, Olaf, a 2D array in C is really a 1D array. If I have arr[3][4], it creates an array of {{0000},{0000},{0000}}, which is the equivalent of a 1D array {0000 0000 0000} – Riker Dec 14 '15 at 20:09

1 Answer

If I have the base pointer of the array stored in eax, what would be the correct way to access the array at [3][2] ?

First, you need to know whether it's a C multidimensional array, or whether it's an array of pointers (to arrays).

int foo (int cols, int multidimensional[][cols])
{ // C99 variable-length array parameter; not valid C++
    return multidimensional[3][2]; // load from rsi + (3*cols + 2)*sizeof(int)
}
// or
int bar (int *pointers_to_rows[]) {
    return pointers_to_rows[3][2];  // load the pointer at rdi + 3*sizeof(int*), then load from that pointer + 2*sizeof(int)
}

gcc accepts this with -std=c99 -Wall -pedantic, so I think it's valid C99, not a GNU extension. When this was written you couldn't try it on godbolt, because C++ doesn't have C99's variable-dimension array types and godbolt only had C++ compilers, not C.
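To make the difference between the two layouts concrete, here is a minimal sketch. The names `grid`, `row_ptrs`, `ROWS`, `COLS`, `init_layouts`, `read_contiguous` and `read_via_pointers` are made up for illustration, and the dimensions are fixed at compile time to keep the example short.

```c
#include <assert.h>
#include <stddef.h>

enum { ROWS = 4, COLS = 5 };   /* hypothetical dimensions */

/* Contiguous C multidimensional array: one block of ROWS*COLS ints. */
static int grid[ROWS][COLS];

/* Array of pointers: each entry points at a row, which could live anywhere. */
static int *row_ptrs[ROWS];

/* Fill grid so that grid[r][c] == 10*r + c, and point row_ptrs at its rows. */
void init_layouts(void)
{
    for (size_t r = 0; r < ROWS; r++) {
        for (size_t c = 0; c < COLS; c++)
            grid[r][c] = (int)(10 * r + c);
        row_ptrs[r] = grid[r];
    }
}

/* One load: the address is computed from the base and the row length. */
int read_contiguous(size_t r, size_t c)
{
    assert(r < ROWS && c < COLS);
    return grid[r][c];
}

/* Two loads: fetch the row pointer, then fetch the element through it. */
int read_via_pointers(size_t r, size_t c)
{
    assert(r < ROWS && c < COLS);
    return row_ptrs[r][c];
}
```

Both functions return the same value for the same indexes here, but only the first one needs the row length to compute the address; the second needs an extra memory access instead.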


I guess you mean proper multidimensional arrays, since you're talking about them actually being 1D arrays. This is incorrect as far as C's type rules are concerned, but correct in terms of how they're actually implemented and stored.

Anyway, array[row][col] is syntactic sugar for array[row*max_col + col] on the flat underlying storage, so the element's byte offset from the base is (row*max_col + col) * sizeof(element). (@Ben's comment may be talking about Fortran, not C: C stores arrays in row-major order.) max_col isn't stored in memory anywhere; it exists only as part of the array's type. (C is statically typed, with no reflection, so this info is only present in the binary's debug symbols.) That's why my example function takes it as a parameter. Your question is unanswerable as asked, because you're asking how to do it without the array dimension(s).
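As a rough sketch of that equation for the 16-bit elements mentioned in the comments, the helpers below compute the flat index and the byte offset. The function names are hypothetical, and the column count has to be supplied by the caller because it isn't recoverable from the base pointer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Flat element index of [row][col] in a row-major array with max_col columns. */
size_t flat_index(size_t row, size_t col, size_t max_col)
{
    assert(col < max_col);
    return row * max_col + col;
}

/* Byte offset from the base address: the flat index scaled by the element size. */
size_t byte_offset(size_t row, size_t col, size_t max_col, size_t elem_size)
{
    return flat_index(row, col, max_col) * elem_size;
}

/* Read [row][col] from row-major storage of 16-bit ints, given only its base pointer. */
int16_t load_i16(const int16_t *base, size_t row, size_t col, size_t max_col)
{
    return base[flat_index(row, col, max_col)];
}
```

For example, assuming 4 columns of 16-bit elements, [3][2] is flat index 3*4 + 2 = 14, which is byte offset 28. With the base pointer in eax, that would correspond to a 16-bit load from [eax + 28].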

gcc --std=gnu99 /tmp/foo.c -O3 -masm=intel -S -o- compiles that C to:

;; comments added manually.  -fverbose-asm isn't *this* helpful :P
foo:  ; (int cols, int multidimensional[][cols])
        movsx   rdi, edi              ; sign-extend cols to 64 bits: the ABI doesn't require clearing upper bits when passing values that don't fill registers
        lea     rax, [rdi+rdi*2]      ; rax = cols*3
        mov     eax, DWORD PTR [rsi+8+rax*4]  ; return the int at byte offset cols*3*sizeof(int) + 2*sizeof(int) from rsi
        ret

bar:  ; (int *pointers_to_rows[])
        mov     rax, QWORD PTR [rdi+24]  ; 24 = 3 * sizeof(pointer)
        mov     eax, DWORD PTR [rax+8]   ; 8  = 2*sizeof(int)
        ret
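To check the address arithmetic in that compiler output, here is a rough C transliteration that does the same byte-level steps by hand. `foo_by_hand` and `bar_by_hand` are hypothetical names, and the sketch assumes the x86-64 sizes baked into the asm: 4-byte int and 8-byte pointers.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Mirrors the compiled foo: cols arrives in edi, the array base in rsi. */
int foo_by_hand(int cols, const void *base)
{
    assert(sizeof(int) == 4);            /* the asm scales by 4 */
    int64_t rdi = cols;                  /* movsx rdi, edi */
    int64_t rax = rdi + rdi * 2;         /* lea rax, [rdi+rdi*2] */
    int result;
    /* mov eax, DWORD PTR [rsi+8+rax*4] */
    memcpy(&result, (const char *)base + 8 + rax * 4, sizeof result);
    return result;
}

/* Mirrors the compiled bar: the array of row pointers arrives in rdi. */
int bar_by_hand(int **pointers_to_rows)
{
    assert(sizeof(int) == 4 && sizeof(int *) == 8);
    int *row;
    /* mov rax, QWORD PTR [rdi+24]: fetch the pointer to row 3 */
    memcpy(&row, (const char *)pointers_to_rows + 24, sizeof row);
    int result;
    /* mov eax, DWORD PTR [rax+8]: fetch element 2 of that row */
    memcpy(&result, (const char *)row + 8, sizeof result);
    return result;
}
```

With 5 columns, foo's load lands at byte 8 + 15*4 = 68 from the base, which is element 17 = 3*5 + 2 of the flat storage, as expected for [3][2].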

As usual, the easiest way to see how something is done is to see what a compiler does.

Peter Cordes