
I know `.int` and `.long` occupy 4 bytes, and that I should use `movl`. But what are the extra values in the upper half of the register, like 0x2, 0x3, 0x4, 0x2c? Is it dirty data?

.section .data
iary:
    .int 1, 2
lary:
    .long 3, 4

.section .text
.globl  _start

_start:
    movq $0, %rdi
    movq iary(, %rdi, 4), %rcx   # $rcx = 0x200000001

    movq $1, %rdi
    movq iary(, %rdi, 4), %rcx   # $rcx = 0x300000002

    #===============================
    
    movq $0, %rdi 
    movq lary(, %rdi, 4), %rcx   # $rcx = 0x400000003

    movq $1, %rdi
    movq lary(, %rdi, 4), %rcx   # $rcx = 0x2c00000004

    movq $60, %rax
    syscall
Peter Cordes
OnlyWick
  • That's not weird, that's just loading two dwords into a qword register. `.int` and `.long` are both 4-byte integers in GAS for x86, regardless of the C type width in the x86-64 SysV ABI. The `0x2c` at the bottom of the high dword is garbage from whatever's next after `.data`. – Peter Cordes Jun 27 '22 at 07:30
  • @PeterCordes Okay, I got it. It was the `0x2c` that confused me! – OnlyWick Jun 27 '22 at 07:36
  • Look at memory with a debugger to see what garbage is there if you run off the end of the array. (And maybe use `readelf -a` and/or a hexdump on your executable to see how the file is laid out, if you're curious how it got there. But really, you should expect random garbage anywhere off the end of a section, until you get to the end of a page and segfault.) – Peter Cordes Jun 27 '22 at 07:38
  • @PeterCordes About the file layout: should I refer to `Object Files` in the `x86-64 SysV ABI`? I've heard of `ELF` and `COFF`, but I don't know which one amd64 supports, or whether it supports both... Maybe I need to google it. – OnlyWick Jun 27 '22 at 07:54
  • The ABI doc doesn't nail down the linker's choices for what order to put things in. I meant look at the actual file. It's not necessary to know a huge amount about object files to write assembly and understand compiler-generated assembly while tuning programs for performance. This is *only* something you'd want to look into if you're curious about exactly where that byte is coming from. – Peter Cordes Jun 27 '22 at 07:58
  • @PeterCordes Thanks, I did what you said: the `0x2c` lies beyond the data section, and the value comes from `__bss_start`. – OnlyWick Jun 27 '22 at 08:52
  • That's weird, your program doesn't have a BSS section. Did you link it with libc instead of making a static executable with just that asm (`ld foo.o -o foo`)? But yes, if there is a BSS, it normally starts right after `.data`. (It's part of the same ELF segment, with "MemSiz" greater than "FileSiz" in the ELF program headers if you look at `readelf -a`.) And if you linked libc, it will have some global variables which I think end up in the main executable's BSS, like `FILE *stdout`, which get written by libc startup code that runs before your `_start` from dynamic linker hooks. – Peter Cordes Jun 27 '22 at 08:55
  • @PeterCordes I just used `ld foo.o -o foo`. I think I may need to post another question. This is my first time using `readelf`, so I don't know what the specific fields mean. – OnlyWick Jun 27 '22 at 11:31
  • Oh, hmm, maybe `ld` puts a `__bss_start` label at the end of `.data` even if there isn't actually a BSS. And the data there comes from whatever's next in the file, pulled in as part of this private read/write mapping of the file. (And it's not zeroed by the kernel because it's *not* a BSS.) Yeah, there's a `__bss_start` symbol even if there's no `.data` section either. – Peter Cordes Jun 27 '22 at 12:10
  • @PeterCordes I don't understand what you said; I haven't studied operating systems yet. I'd better learn assembly first. Thanks for your reply! LoL – OnlyWick Jun 27 '22 at 12:20
  • @PeterCordes I used `hexdump -C` to examine the ELF file, and I saw many sections divided by `*`. Does that match the ELF file layout? – OnlyWick Jun 28 '22 at 03:07
  • `hexdump` uses `*` to indicate repeats (of zeros). It's normal to have zero padding for alignment of sections. See [Minimal executable size now 10x larger after linking than 2 years ago, for tiny programs?](https://stackoverflow.com/q/65037919) – Peter Cordes Jun 28 '22 at 03:08
  • @PeterCordes Thanks. My second question: `hexdump -C` shows me many sections. Do these sections correspond to the `ELF` file layout, like `ELF Header, Program header, Section header...`? – OnlyWick Jun 28 '22 at 03:24
  • `hexdump` doesn't know about ELF files specifically, it's literally just the raw bytes. Use `readelf -a` to parse the ELF headers and find interesting file offsets to look at, e.g. your `.data` section. I don't know the details of what ELF file layout looks like in terms of raw bytes; it's never been relevant. I know that code which reads off the end of a section will sometimes see some non-zero bytes from the file mapping that includes the section you wanted, but I've rarely been curious enough about what exact thing the linker chose to put next, or where the section headers are. – Peter Cordes Jun 28 '22 at 03:32
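
To make the first comment concrete, here is a minimal sketch of the same program using 4-byte loads, assuming the same static, non-PIE build as in the thread (`ld foo.o -o foo`). `movl` reads exactly one dword, so it never picks up the neighbouring element or the bytes after `.data`, and writing `%ecx` zeroes the upper half of `%rcx`. The `movslq` line and the use of the loaded value as the exit status are additions for illustration only; they are not part of the original question.

```asm
.section .data
iary:
    .int 1, 2
lary:
    .long 3, 4

.section .text
.globl  _start

_start:
    movq $1, %rdi
    movl iary(, %rdi, 4), %ecx     # 4-byte load of iary[1]: %rcx = 0x2 (upper 32 bits zeroed)

    movq $1, %rdi
    movl lary(, %rdi, 4), %ecx     # 4-byte load of lary[1]: %rcx = 0x4, nothing past .data is read

    # If the element is signed and a full 64-bit value is needed,
    # sign-extend the dword instead of doing an 8-byte load.
    movslq lary(, %rdi, 4), %rcx   # still reads only 4 bytes from memory

    movl %ecx, %edi                # illustration only: pass lary[1] as the exit status
    movq $60, %rax                 # exit system call number on x86-64 Linux
    syscall
```

Stepping through this in a debugger should show `%rcx` holding only the single element after each load, in contrast to the two-dword values in the question.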
