
Given the data:

.section data
data_set:
.long 2,5,33,54,2,76,4,37,43,223,98,70,255

How do I push the start address of the data (and not the value stored at that address) onto the stack?

I tried this:

pushl data_set

which eventually (after trying to access the data in this address) resulted in a segfault.

talz

1 Answer


In AT&T syntax, to use an address as an immediate operand to an instruction, use $label.

You want pushl $data_set, which assembles to the push imm32 form of the instruction, just like push $123 would.

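pushl data_set would instead push a dword of data loaded from the address data_set, i.e. push m32. With the data in the question that pushes the value 2, and later treating 2 as a pointer is what leads to the segfault.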
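To make the difference concrete, here is a minimal sketch that pushes the address and then reads the first element through it. It assumes 32-bit x86 Linux, a standalone program with a `_start` entry point, and the `int $0x80` exit system call; none of that is from the question, which only shows the data.

```asm
# Assumed build: as --32 push_addr.s -o push_addr.o && ld -m elf_i386 push_addr.o -o push_addr
    .section .data
data_set:
    .long 2,5,33,54,2,76,4,37,43,223,98,70,255

    .section .text
    .globl _start
_start:
    pushl $data_set        # push imm32: the address of data_set goes on the stack
    popl  %esi             # %esi now holds that address
    movl  (%esi), %ebx     # load the first element (2) through the pointer

    movl  $1, %eax         # sys_exit in the 32-bit Linux syscall ABI
    int   $0x80            # exit status is %ebx, i.e. 2
```

Running it and then checking `echo $?` should show 2. Replacing the first instruction with `pushl data_set` would put the value 2 in `%esi` instead, and the load through it would fault.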


Conceptually, AT&T syntax treats everything as a potential address. So label is an address, and so is 0x123. So add 0x123, %eax loads from the address 0x123, and add label, %eax loads from the address label. When you want a number as an immediate operand, you use add $0x123, %eax. Using a symbol address as an immediate operand works identically.

Either way, the address is encoded into the instruction; it's just a question of whether it appears as an immediate or as a displacement in an addressing mode. This is why you use add $(foo - bar), %eax to add the distance between two symbols, instead of add $foo-$bar, %eax (which would look for a symbol called $bar, i.e. the $ becomes part of the symbol name). The $ applies to the whole operand, not to symbol / label names. Related: more about assemble-time math in GAS
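As a sketch of the whole-operand `$` in use, the following adds the distance between two labels as an assemble-time constant. The layout (four dwords between `foo` and `bar`, both in the same section) is an assumption made for illustration, as are the 32-bit Linux target and the `_start` / `int $0x80` scaffolding.

```asm
    .section .data
foo:
    .long 1, 2, 3, 4
bar:                           # hypothetical label placed right after foo's data

    .section .text
    .globl _start
_start:
    movl  $0, %ebx
    addl  $(bar - foo), %ebx   # one $ for the whole expression: immediate 16

    movl  $1, %eax             # sys_exit in the 32-bit Linux syscall ABI
    int   $0x80                # exit status is %ebx, i.e. 16
```

Because both labels live in the same section, the assembler can resolve the difference itself, and the instruction ends up with a plain constant rather than a relocation.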

In other contexts, e.g. as an operand to .long or .quad, there's no immediate vs. memory operand issue, so you just write dataptr: .long data_set to emit 4 bytes of data holding the address of data_set, like you'd get from C static int *dataptr = data_set;
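A short sketch of that data-directive case, again assuming a 32-bit Linux standalone program (the shortened array and the `_start` / `int $0x80` scaffolding are illustrative assumptions): `dataptr` stores the address, and the code loads it from memory without any `$`.

```asm
    .section .data
data_set:
    .long 2, 5, 33
dataptr:
    .long data_set             # 4 bytes holding the address of data_set

    .section .text
    .globl _start
_start:
    movl  dataptr, %esi        # no $: load the stored pointer from memory
    movl  4(%esi), %ebx        # second element of data_set, i.e. 5

    movl  $1, %eax             # sys_exit in the 32-bit Linux syscall ABI
    int   $0x80                # exit status is %ebx, i.e. 5
```

Writing `movl $dataptr, %esi` here would instead give the address of the pointer variable itself, which is one level of indirection too few.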


You could have checked on the syntax by looking at C compiler output for

void ext(int*);
static int data[] = {1,2,3};
void foo() {
    ext(data);
}

to get the C compiler to emit code passing a symbol address to a function. I put this on the Godbolt compiler explorer, where it does indeed use pushl $data.

Peter Cordes
    Technically `label` is always an address. The `$` just selects an instruction encoding with an immediate. E.g. if you wanted an immediate for the difference of two labels `foo` and `bar`, you'd do `$foo-bar` not `$foo-$bar` (which also happens to assemble without error, but references the symbols `foo` and `$bar`). – Jester Jan 14 '18 at 13:31
    @Jester: Right thanks, rewrote my answer to get the concepts correct. `$` applies to the whole operand, not the label name. I'd figured this out not long ago, but apparently still hadn't grokked it. – Peter Cordes Jan 14 '18 at 16:05