I wrote a simple program to find out whether we can read and print memory in the code segment:

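#include <stdio.h>

int main(void) {
  /* Interpret the integer 0x4 as a pointer. */
  int *code_ptr = (int *)0x4;
  /* Print the pointer value as an integer, in hex. */
  printf("code_ptr = %lx\n", (unsigned long)code_ptr);
  /* Read through the pointer; this is where the program crashes. */
  printf("*code_ptr = %x\n", (unsigned)*code_ptr);
  return 0;
}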

My system is x86_64 running Ubuntu 19.04 (Disco Dingo), and the program failed with the following output:

code_ptr = 4
Segmentation fault (core dumped)

IIUC, on Linux the code segment and the data segment share the same base address. If that's true, this program reads memory in the code segment, and I expected no crash, since 0x4 should fall within the data segment (which starts at address 0). It should also pass the paging check, since the memory mapped for the code segment is read-only and we only read it here.

So did I miss anything, or are there other mechanisms that prevent us from reading from %ds:0x4?

Changda Li
  • Why do you assume that address 0x00000004 belongs to your process? Most probably, it won't. – mcleod_ideafix Jun 08 '20 at 02:38
  • I think this 0x4 is a logical address, not a physical address. On a 64-bit system, the logical addresses from 0 to 2^64 - 1 all belong to the current process. This logical address is first translated to a linear address via `%ds:0x4`, which I think is 0x4. Then, by consulting the page table, it is finally mapped to a physical address. – Changda Li Jun 08 '20 at 02:41
  • *"what prevents us from reading the memory from code segment?"* -- Nothing, assuming you know where it is. Your program is getting a seg fault because the virtual memory at 0x4 is simply not mapped to a valid physical address. Same thing, i.e. a seg fault, would happen with an invalid data section address. Try obtaining a linker map to get an idea where the various sections are. See https://stackoverflow.com/questions/38961649/gcc-how-to-create-a-mapfile-of-the-object-file – sawdust Jun 08 '20 at 03:40
  • Why not try the GNU debugger **gdb** on your program to see what memory locations the instructions occupy? If **gdb** can *disassemble* the machine instructions, then that memory must be readable. – sawdust Jun 08 '20 at 06:33
  • The first page of virtual memory addresses from 0 is usually left unmapped in order to catch null pointer dereferencing errors. – Ian Abbott Jun 08 '20 at 11:09

1 Answer

I think your key misunderstanding is that you're assuming the 8086 hardware feature called "the data segment" is the same as the executable image subdivision also called "the data segment." Xenix may have used that hardware feature that way, but no modern x86 Unix does. On a modern Unix, %ds:0 always points to linear address zero, not to the beginning of the executable's data segment. (And similarly %cs:0 points to linear address zero, not to the executable's text segment.)

All of an executable's segments will be loaded into linear address space somewhere well above linear address 0, and on current-generation OSes the load addresses will be randomized on each run.
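
To check which addresses are actually mapped, one option on Linux is to consult `/proc/self/maps`. The following is a minimal sketch, assuming a Linux system with `/proc` mounted; `lookup_mapping` is a hypothetical helper written for this illustration, not a library function.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: looks up addr in /proc/self/maps (Linux-specific).
 * Returns 1 and copies the permission string (e.g. "r-xp") into perms
 * if some mapping contains addr, 0 if no mapping does, -1 on I/O error. */
int lookup_mapping(uintptr_t addr, char perms[5])
{
    FILE *maps = fopen("/proc/self/maps", "r");
    if (maps == NULL)
        return -1;

    char line[4096 + 128];   /* room for a long path name on the line */
    int found = 0;
    while (fgets(line, sizeof line, maps) != NULL) {
        uintptr_t lo, hi;
        char p[5];
        /* Each line begins with "start-end perms", addresses in hex. */
        if (sscanf(line, "%" SCNxPTR "-%" SCNxPTR " %4s", &lo, &hi, p) == 3
            && addr >= lo && addr < hi) {
            memcpy(perms, p, sizeof p);
            found = 1;
            break;
        }
    }
    fclose(maps);
    return found;
}
```

On a typical Linux configuration, `lookup_mapping(0x4, perms)` would be expected to return 0, because nothing is mapped in the first page, whereas passing the address of a function such as `main` (converted to `uintptr_t`) should return 1 with an executable permission string.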

There's no standard way to get a pointer to the beginning of the executable's code or data segment. On GNU systems you can use dl_iterate_phdr, and other OSes may have similar functionality under a different name.
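
As a rough sketch of the `dl_iterate_phdr` approach, assuming glibc on Linux: the code below walks the program headers of every loaded object and reports the loadable segment that contains a given address. The names `segment_info`, `match_segment`, and `segment_containing` are made up for this example; only `dl_iterate_phdr`, `struct dl_phdr_info`, and the ELF constants come from `<link.h>`.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE          /* glibc only declares dl_iterate_phdr with this */
#endif
#include <link.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical result type for this sketch: one loadable ELF segment. */
struct segment_info {
    uintptr_t start;   /* run-time address of the segment's first byte */
    size_t    size;    /* size of the segment in memory (p_memsz) */
    unsigned  flags;   /* PF_R / PF_W / PF_X permission bits */
    int       found;   /* set to 1 once a matching segment is seen */
    uintptr_t target;  /* input: the address we are looking for */
};

/* Called by dl_iterate_phdr once per loaded object (program or library). */
static int match_segment(struct dl_phdr_info *info, size_t size, void *data)
{
    struct segment_info *seg = data;
    (void)size;

    for (int i = 0; i < info->dlpi_phnum; i++) {
        const ElfW(Phdr) *ph = &info->dlpi_phdr[i];
        if (ph->p_type != PT_LOAD)
            continue;
        /* dlpi_addr is the load bias; p_vaddr is relative to it. */
        uintptr_t start = (uintptr_t)(info->dlpi_addr + ph->p_vaddr);
        if (seg->target >= start && seg->target - start < ph->p_memsz) {
            seg->start = start;
            seg->size  = ph->p_memsz;
            seg->flags = ph->p_flags;
            seg->found = 1;
            return 1;          /* a nonzero return stops the iteration */
        }
    }
    return 0;                  /* keep iterating over the next object */
}

/* Finds the loadable segment that contains addr, if any. */
struct segment_info segment_containing(uintptr_t addr)
{
    struct segment_info seg = { 0 };
    seg.target = addr;
    dl_iterate_phdr(match_segment, &seg);
    return seg;
}
```

Passing the address of any function in the program, for example `(uintptr_t)&main`, should yield a segment with `PF_X` set; its bytes can then be read through an `unsigned char *` as long as `PF_R` is also set. Converting a function pointer to `uintptr_t` is not guaranteed by ISO C, but it works on the usual Linux toolchains.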

zwol
  • What about printing the address of `main()`? – Yvain Jun 08 '20 at 03:14
  • @Yvain The address of `main` will be inside the executable's text segment, but probably not at the beginning of it. – zwol Jun 08 '20 at 03:28
  • @zwol *"... inside the executable's text segment"* -- Don't you mean the text *section*? – sawdust Jun 09 '20 at 00:22
  • @sawdust No, I mean segment. ELF has both; to first order, sections are used in unlinked object files, segments are used in executables and shared objects. – zwol Jun 09 '20 at 00:52
  • Thanks @zwol! Yeah, my main misunderstanding was assuming that the program starts at linear address 0. I've tried it and confirmed that the program starts at a different linear address on each execution. – Changda Li Jun 09 '20 at 01:20