
I've been trying to understand how my assembly code is loaded into RAM, so I ran this code on an x86_64 CPU:

section .text
global main
main:
    push 50
    push 64
    push 0x10
    mov eax, 60
    syscall

After looking at how the first 4 lines are loaded in RAM using gdb, I got this result:

0x401000 <main>:    0x406a326a  0x3cb8106a

Since my CPU is little-endian, it's natural for the lines to be flipped (hence why line 2 comes before line 1), so from that, push would be 0x6a and mov would be 0xb8 (not sure).

My question is, why are they bundled up into 32-bit words? I thought the output would be something like:

0x3cb8106a 0x406a326a

as that would make sense on a 64-bit little-endian machine.

Peter Cordes
  • 1
    You asked GDB to display memory contents in (little-endian) 32-bit chunks. That is all. `push imm8` is a 2-byte instruction; `mov eax, imm32` is a 5-byte instruction. x86 machine code uses variable-length instructions from 1 to 15 bytes with no alignment or bundling – Peter Cordes Mar 11 '20 at 05:03
  • Ohh I see, I thought x outputs the values as they are in RAM. How do I get gdb to print in 64-bit chunks? –  Mar 11 '20 at 05:05
  • 1
    It does do that, with your choice of output format. The default is 32-bit chunks which GDB calls "words". Intel manuals call that size "double word" or dword, but it has nothing to do with how the CPU would decode. Use `help x`. Or use `disas /r` to disassemble and show machine code. (You probably also want `set disassembly-flavor intel` - GNU `.intel_syntax noprefix` is different from NASM, but closer than AT&T syntax.) – Peter Cordes Mar 11 '20 at 05:06
  • I see, let me check! Thank you I was starting to get very confused because of this! –  Mar 11 '20 at 05:06
  • Another thing: does the "word" option default to 32 bits regardless of the CPU word size? Because it's doing that; to get 64 bits I have to use g –  Mar 11 '20 at 05:08
  • 2
    Yes, GDB has its own names for widths. Outside of GDB, most 64-bit ISAs (like MIPS64, or AArch64) define a "word" as 32 bits, and 64 bits as a double-word. But x86-64 evolved out of a 16-bit ISA, not 32, so on x86-64 a "word" = 16 bit, a dword = 32-bit, and a qword = 64-bit. Again, that's unrelated to GDB's size options. But your idea that a "64-bit ISA" should mean word = 64-bit is totally wrong even outside of GDB not caring about the ISA's sizes. See [What's the size of a QWORD on a 64-bit machine?](https://stackoverflow.com/q/55430725) – Peter Cordes Mar 11 '20 at 05:11
  • Oh also, GDB has `x /i` to dump memory as instructions. – Peter Cordes Mar 11 '20 at 05:30
  • Yeah, I saw it, but I wanted to see how the raw bits are saved! Thanks again –  Mar 11 '20 at 05:32
  • 1
    Yeah, a disassembly that includes machine-code hexdump shows you that most clearly. The hardware only cares about cache line (64-byte) boundaries as far as fetching instructions. Or maybe some effect from 16-byte aligned fetch blocks. x86 is not a "word-oriented" ISA; at the other end of the spectrum MIPS for example is very strongly word-oriented, with fixed-width 32-bit aligned instruction words that (originally) barely needed decoding and could just be used directly as internal control signals. related: [Instruction Lengths](https://stackoverflow.com/q/4567903) – Peter Cordes Mar 11 '20 at 05:36
  • I see, so my whole idea of how the CPU fetches code is wrong! But with this kind of fetching, does the program counter only go through the cache, and when it hits a boundary, for example, the CPU fetches the next 64 bytes? –  Mar 11 '20 at 05:47
  • 2
    For that level of detail, read [Agner Fog's microarch pdf](https://www.agner.org/optimize/) - it's very good, and aimed at software people that want to know how hardware runs their code. Also https://www.realworldtech.com/sandy-bridge/ explains fetch from L1i cache to feed pre-decode and decode stages on Sandybridge-family CPUs. But *logically* all of this is giving the illusion of fetching / decoding / executing single instructions one at a time. Out-of-order exec always preserves that illusion for a single thread. – Peter Cordes Mar 11 '20 at 05:50
  • Thank you, I'll check those out for sure, this is honestly very fascinating! –  Mar 11 '20 at 06:15
  • Came across [How does CPU perform operation that manipulate data that's less than a word size](https://stackoverflow.com/q/56436206) while looking for something else; the answer I wrote there a year ago is basically the same as what I said in comments yesterday. :P – Peter Cordes Mar 11 '20 at 17:50
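
To make the point from the comments concrete, here is a minimal Python sketch of how the same bytes in memory look under different chunk sizes. It is not how GDB is implemented; `dump`, `split_instructions`, and the `LENGTHS` table are hypothetical helpers made up for illustration. The first 8 bytes are read off the GDB output in the question; the remaining 5 are an assumption, filled in from the usual encodings of `mov eax, 60` (`b8 3c 00 00 00`) and `syscall` (`0f 05`).

```python
# Bytes of the question's code as they sit in memory, lowest address first.
# First 8 bytes: taken from the GDB dump. Last 5: assumed from the usual
# encodings of the rest of `mov eax, 60` and of `syscall`.
CODE = bytes.fromhex("6a32 6a40 6a10 b83c000000 0f05")


def dump(data, size):
    """Show data as little-endian integers of `size` bytes each,
    roughly what x/bx (1), x/wx (4) and x/gx (8) display."""
    usable = len(data) - len(data) % size  # skip a trailing partial chunk
    return [
        f"0x{int.from_bytes(data[i:i + size], 'little'):0{size * 2}x}"
        for i in range(0, usable, size)
    ]


# Lengths for only the three opcode bytes that occur here. This is NOT a
# real x86 decoder: 0x0f is an escape byte and is treated as `0f 05` only.
LENGTHS = {0x6A: 2, 0xB8: 5, 0x0F: 2}  # push imm8, mov eax imm32, syscall


def split_instructions(data):
    """Cut the byte stream at instruction boundaries using LENGTHS."""
    out, i = [], 0
    while i < len(data):
        n = LENGTHS[data[i]]
        out.append(data[i:i + n].hex(" "))
        i += n
    return out


print(dump(CODE[:8], 1))  # byte view: memory order, no flipping
print(dump(CODE[:8], 4))  # ['0x406a326a', '0x3cb8106a'], as in the question
print(dump(CODE[:8], 8))  # ['0x3cb8106a406a326a']
print(split_instructions(CODE))
```

The byte view shows that nothing is reordered in memory: `6a 32` (`push 50`) comes first. The "flipping" only appears when several bytes are read as one little-endian integer, and the instruction boundaries (2, 2, 2, 5, 2 bytes) don't line up with any of the chunk sizes.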
