
This is an example C program, compiled with gcc on macOS.
What does the hex value on the right of the disassembly mean?

This is the code:

#include <stdio.h>
#include <string.h>

int main(int argc, char *argv[])
{
    char key[] = "key-123";
    if(argc == 2){
        printf("Checking key ... %s\n", argv[1]);
        if (strcmp(argv[1], key) == 0) {
            printf("Access Granted");
        } else {
            printf("Denied");
        }
    }
    else{
        printf("insert the key");
    }
    return 0;
}
Peter Cordes
    It's better to not post images; people can't copy-paste from them. Please include the assembly snippet as text in your question instead. You can use the edit button under your question. – Nate Eldredge Jun 02 '21 at 16:28

1 Answer


It's the address used by the move to %rax. It's calculated as an offset from %rip, which is known, so the disassembler can show you the resulting value. The main subtlety is that %rip holds the address of the following instruction. So with %rip = 0x100003e7f and a displacement of 0x181, you get 0x100003e7f + 0x181 = 0x100004000, which is the value you see. This is the address of the key string, which the next 2 instructions move onto the stack.

EDIT: I was not quite sure how this was generated, so I assumed from the asm that it was loading the key. But Peter mentioned it was on macOS, so it seems to be the stack protector canary instead. That does not change the fact that the value in the comment is the address used by the move to %rax.

Guillaume
  • Any idea why GCC for MacOS (so probably actually clang) would only be loading a pointer from static storage, instead of the actual ASCII data directly? Compiling for Linux (also with optimization disabled) https://godbolt.org/z/8os435j7T - we do see it also decide to copy from `.L__const.main.key` vs. actual GCC using `mov rax, 0x33...` ASCII data as an immediate. But even with clang, we don't see an extra level of indirection through the GOT or anything to reach non-variable data, even with `-fPIC` – Peter Cordes Jun 02 '21 at 20:50
  • But anyway, note it's *loading* a pointer to the actual ASCII data it wants. `0x100004000` isn't the address of the initializer string, it's the address of a pointer to it. – Peter Cordes Jun 02 '21 at 20:51
  • No, it's not the stack canary / cookie value. GCC/clang -fstack-protector-strong would be using thread-local storage for the stack cookie, like `mov %fs:40, %rax` / `mov %rax, -8(%rbp)`. Totally separate from the initializer for `key[]`, which is what this is. I'm pretty sure MacOS uses FS as the TLS base register, like Linux (both using the x86-64 System V ABI), so https://godbolt.org/z/4f8K8afWs is a lot like what you'd see for MacOS. Notice that there's an extra deref of RAX, but the stack-protector cookie is just a value that gets loaded. – Peter Cordes Jun 02 '21 at 22:33
  • There is indeed an extra deref. That just means it's loading a pointer and dereferencing it before writing it onto the stack. I don't think that tells you anything, especially for unoptimized code. Your godbolt output looks different from what I am seeing on macOS, so I also don't think this tells you anything. The asm generated by `clang -S` on macOS is `movq ___stack_chk_guard@GOTPCREL(%rip), %rax` for the first mov instruction, so I do think this is the canary. – Guillaume Jun 03 '21 at 04:56
  • Ok, that does fully make sense of things, even though it's not what I was expecting. In the Godbolt link in my previous comment, the initializer for `key[]` is copied *after* spilling the stack args. Which makes sense: the asm for that statement isn't part of the prologue. But in the question, the load + deref + store happens before that. So that's pretty strong evidence it's not the initializer. (And introducing extra indirection for the stack cookie makes more sense than for the contents of a string literal.) – Peter Cordes Jun 03 '21 at 05:04
  • [gcc canaries : undefined reference to \_\_stack\_chk\_guard](https://stackoverflow.com/q/27290086) sheds some light on the var name: GCC used to use a `___stack_chk_guard` global variable (even on GNU/Linux) instead of TLS. I guess clang on MacOS is still using that. I was wondering if `___stack_chk_guard` was maybe `-fstack-check` (stack clash protection for huge VLAs / alloc) instead of `-fstack-protector` (buffer overflows), but it does seem it's part of the stack-protector mechanism. – Peter Cordes Jun 03 '21 at 05:07
  • With regard to your question about why macOS loads a static pointer instead of simply writing the ASCII data directly onto the stack, it seems that it's a Mach-O thing. Mach-O expects all constant strings to be in some specific subsection (__cstring). I think the idea is to keep the data and code strictly separated. But I could not find a good reference on this. – Guillaume Jun 03 '21 at 13:48
  • It's not a MacOS thing; my first Godbolt link showed that clang targeting Linux chose to copy from `.rodata` instead of using `mov $0x..., %rax` / `mov %rax, -??(%rbp)` the way GCC does. Of course if the compiler chooses to *load* constant data at all, it's going to do it from another section on any OS (because compilers [don't mix code and data on x86 for performance reasons](https://stackoverflow.com/a/55609077), as well as non-exec RO data), but there's no way an ABI can sensibly forbid `movq $0, -??(%rbp)` to zero-init 8 bytes, for example, `char key[8] = "";` just like `long key = 0;`. – Peter Cordes Jun 03 '21 at 18:23
  • Always copying for non-empty strings seems to be a clang12 missed optimization: https://godbolt.org/z/f4WPojEKb shows clang 5.0 -O3 using mov-immediate, just like it would if the variable was a `long`. (That does make larger machine code, but clang 12 copies even for shorter non-0 strings like `"key"`, where it's clearly more efficient to use one mov-immediate to memory. Or even better a `push`-immediate, but that's a separate missed optimization.) – Peter Cordes Jun 03 '21 at 18:24
  • Yes, you are right. Nothing to do with Mach-O. – Guillaume Jun 03 '21 at 18:33