1

Using the objdump command, I found that address 0x02a8 holds the start of the path /lib64/ld-linux-x86-64.so.2, and that this path ends with a 0x00 byte, as C strings do.

So I tried to write a simple C program that will print this line (I used a sample from the book "RE for beginners" by Denis Yurichev - page 24):

#include <stdio.h>

int main(){
    printf(0x02a8);
    return 0;
}

But I was disappointed to get a segmentation fault instead of the expected /lib64/ld-linux-x86-64.so.2 output.

I find it strange to call printf in such a "quick" way, without format specifiers or at least a pointer cast, so I tried to make the code more natural:

#include <stdio.h>

int main(){
    char *p = (char*)0x02a8;
    printf(p);
    printf("\n");
    return 0;
}

And after running this I still got a segmentation fault.

I don't believe this is happening because of restricted memory areas, because in the book it all works on the first try. I am not sure; maybe there is something more that the book does not mention.

So I need a clear explanation of why the program segfaults every time I run it.

I'm using the latest fully-upgraded Kali Linux release.

Peter Cordes
Deno
  • You cannot use printf without a format string, how would the system know what you're trying to print ? – dspr Oct 29 '20 at 19:54
  • `char *p = (char*)0x02a8;` 0x02a8 is a virtual address in your process, if it exists at all. It has nothing to do with "memory cells". – stark Oct 29 '20 at 19:54
  • @dspr it should actually use `puts` instead of `printf` but ... yeah. – Antti Haapala -- Слава Україні Oct 29 '20 at 19:54
  • @dspr oh yes, you can - no problem - just pass a char pointer (to a zero terminated char array). Perfectly valid. – Support Ukraine Oct 29 '20 at 19:56
  • The "address" you found out, is it an offset into the executable file, or is it really something that's supposed to be loaded into memory at that address? Please show us what evidence you have that this string should be at this specific address. How do you even know that the string will actually be loaded into memory? – Some programmer dude Oct 29 '20 at 20:02
  • @4386427 Ok but some people argue it's unsafe and, therefore, not recommended, e.g. : https://stackoverflow.com/questions/31290850/why-is-printf-with-a-single-argument-without-conversion-specifiers-deprecated – dspr Oct 29 '20 at 20:11
  • @dspr unsafe is different from "can't be done". And... if the string doesn't come from "the outer world", it's not even unsafe. – Support Ukraine Oct 29 '20 at 20:14
  • @dspr you can https://godbolt.org/z/Gffbfo – 0___________ Oct 29 '20 at 20:34
  • @P__J__ @4386427 : You're right, I should have said "you should not" instead "you cannot". One could find `puts` clearer to do that kind of things because the "f" of printf means format (I guess)... but it's questionable. Anyway thanks for the clarification ! – dspr Oct 29 '20 at 21:37

3 Answers

5

Disappointing to see that your "RE for beginners" book does not go into the basics first, and spits out this nonsense. Nonetheless, what you are doing is obviously wrong, let me explain why.

Normally on Linux, GCC produces ELF executables that are position independent. This is done for security purposes. When the program is run, the operating system is able to place it anywhere in memory (at any address), and the program will work just fine. This technique is called Address Space Layout Randomization, and is a feature of the operating system that nowadays is enabled by default.

Normally, an ELF program would have a "base address", and would be loaded exactly at that address in order to work. However, in case of a position independent ELF, the "base address" is set to 0x0, and the operating system and the interpreter decide where to put the program at runtime.

When using objdump on a position independent executable, every address that you see is not a real address, but rather, an offset from the base of the program (that will only be known at runtime). Therefore it is only possible to know the position of such a string (or any other variable) at runtime.

If you want the above to work, you will have to compile an ELF that is not position independent. You can do so like this:

gcc -no-pie -fno-pie prog.c -o prog
Marco Bonelli
  • After compiling that way, `objdump` shows a brand-new address, `0x4002a8`, with the same interpreter path, which looks like the real counterpart of what `0x02a8` referred to. But both of my programs still segfault. Is it still a fake address that only looks real? – Deno Nov 01 '20 at 07:52
  • Whoops, fixed it myself. All I had to do was compile the "print path" program the same way as the program I passed to `objdump`: with `-no-pie`. – Deno Nov 01 '20 at 08:34
1

It no longer works like that. The 64-bit Linux executables that you're likely using are position-independent, and they're loaded into memory at an arbitrary address. In that case the ELF file does not contain any fixed base address.

While you could make a position-dependent executable as instructed by Marco Bonelli, that is not how arbitrary executables work on modern 64-bit Linux systems. It is more worthwhile to learn to do this with position-independent ones, although that is a bit trickier.

This worked for me to print ELF, i.e. the ELF header magic, and the interpreter string. It is a dirty hack in that it probably only works for a small executable anyway.

#include <stdio.h>
#include <stdlib.h>
#include <inttypes.h>

int main(){
    // convert main to uintptr_t
    uintptr_t main_addr = (uintptr_t)main;

    // clear bottom 12 bits so that it points to the beginning of page
    main_addr &= ~0xFFFLLU;

    // subtract one page so that we're in the elf headers...
    main_addr -= 0x1000;

    // elf magic
    puts((char *)main_addr);

    // interpreter string, offset from hexdump!
    puts((char *)main_addr + 0x318);
}

There is another trick to find the beginning of the ELF executable in memory: the so-called auxiliary vector and getauxval:

The getauxval() function retrieves values from the auxiliary vector, a mechanism that the kernel's ELF binary loader uses to pass certain information to user space when a program is executed.

The location of the ELF program headers in memory will be

#include <sys/auxv.h>
char *program_headers = (char*)getauxval(AT_PHDR);

The actual ELF header is 64 bytes long, and the program headers start at byte 64 so if you subtract 64 from this you will get a pointer to the magic string again, therefore our code can be simplified to

#include <stdio.h>
#include <inttypes.h>
#include <sys/auxv.h>


int main(){
    char *elf_header = (char *)getauxval(AT_PHDR) - 0x40;
    puts(elf_header + 0x318); // or whatever the offset was in your executable
}

And finally, an executable that figures out the interpreter position from the ELF headers alone, provided that you've got a 64-bit ELF, magic numbers from Wikipedia...

#include <stdio.h>
#include <inttypes.h>
#include <sys/auxv.h>


int main() {
    // get pointer to the first program header
    char *ph = (char *)getauxval(AT_PHDR);

    // elf header at this position
    char *elfh = ph - 0x40;

    // segment type 0x3 is the interpreter;
    // program header item length 0x38 in 64-bit executables
    while (*(uint32_t *)ph != 3) ph += 0x38;

    // the segment's file offset (p_offset) is a 64-bit field at 0x8
    // from the beginning of the program header entry
    uint64_t offset = *(uint64_t *)(ph + 0x8);

    // print the interpreter path...
    puts(elfh + offset);
}
-1

I guess it segfaults because of the way you use printf: you don't use the format parameter the way it was designed to be used.

When you use the printf function to print data, its first argument is a format string that controls how the output is displayed: int printf(const char *fmt, ...), where the "..." stands for the data you want to display according to the format string.

So if you want to print a string, format it as text:

  printf("%s\n", pointer_to_beginning_of_string);

If this does not work (and it probably will not), it is because you are trying to read memory that you are not supposed to access.

Try adding the extra flags "-Werror -Wextra -Wall -pedantic" to your compiler invocation and show us the errors, please.

fratardi