While learning assembly language, I ran into some strange behavior involving the mov instruction.
The address I intend to load is replaced by a different address that holds an identical instruction.
Thanks to tricks suggested by fellow Stack Overflow users (Calling a function through its address in memory in c / c++ | Using __builtin_extract_return_addr() function to find the RSP value of ret instruction),
I was able to write a simple test (this is pseudocode) that loads and compares addresses:
typedef void function(void);
uint64_t *sp;
asm ("movq %%rsp, %0\n"
     : "=r" (sp) : );//:
uint64_t *ret_addr;
ret_addr = __builtin_extract_return_addr((void *) *((long *)sp) + 1);
if (ret_addr == 0x40ac45)
{
    printf("WHY:\n");
}
function* test_addr = (function*)0x40ac6b;
asm goto (
    "cmp %0, %1\n"
    "jne %l[L2]\n"
    : // output operands
    : "r"(ret_addr), "r"(test_addr) // input operands
    :
    : L2);
L1: int3();
L2: no_problem();
To summarize, I obtain the return address (ret_addr). If that address is 0x40ac45, the program prints "WHY:". It then compares ret_addr with test_addr (0x40ac6b). If the two addresses are not equal, execution jumps to the no_problem function; otherwise, it runs the int3 function to raise an interrupt.
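The compare-and-branch step described above can be sketched as a small self-contained function. This is only an illustration, assuming GCC or Clang; same_address and the differ label are hypothetical names that do not appear in the original code, and the plain C comparison is a fallback for targets other than x86-64.

```c
/* Hypothetical helper (not from the original code):
 * returns 1 when the two addresses are equal, 0 otherwise. */
int same_address(const void *a, const void *b)
{
#if defined(__x86_64__) && defined(__GNUC__)
    /* asm goto takes no outputs here; "cc" records that cmp writes the flags. */
    __asm__ goto (
        "cmp %0, %1\n\t"
        "jne %l[differ]"
        : /* no output operands */
        : "r"(a), "r"(b)
        : "cc"
        : differ);
    return 1;   /* fall-through: the addresses matched */
differ:
    return 0;   /* jne taken: the addresses differ */
#else
    /* Assumed fallback for other targets: plain pointer comparison. */
    return a == b;
#endif
}
```

With this shape, same_address((void *)0x40ac45, (void *)0x40ac6b) returns 0, and each branch target ends in its own return so neither path falls through into the other.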
As shown here in more detail, the bug is that although the return address (0x40ac45) differs from the test address (0x40ac6b), the program hits the trace trap, which should only happen when the two are equal.
To debug this, I added the following code to load ret_addr (which should be 0x40ac45) into an otherwise unused register (%r14):
asm volatile (
"mov %0, %%r14\n\t" : : "r"(ret_addr)
);
When I ran the program under GDB, I found that %r14 held 0x40ac6b instead of the intended 0x40ac45, as you can see here.
All of the previous sanity checks indicated that the return address is 0x40ac45, yet for some reason the mov instruction loads 0x40ac6b instead.
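One thing that may be worth ruling out is that the debug statement writes %r14 without listing it as a clobber, so the compiler is free to keep some other live value in that register. Whether that explains the value seen in GDB is not established here. A minimal sketch, assuming GCC or Clang on x86-64; roundtrip_r14 is a hypothetical helper name, and the non-x86 branch is only a fallback so the function compiles elsewhere.

```c
#include <stdint.h>

/* Hypothetical helper: moves value into %r14 and reads it back inside
 * the same asm statement, so nothing else can touch the register in between. */
uint64_t roundtrip_r14(uint64_t value)
{
#if defined(__x86_64__) && defined(__GNUC__)
    uint64_t seen;
    __asm__ volatile (
        "mov %1, %%r14\n\t"   /* load the input into r14 */
        "mov %%r14, %0"       /* copy r14 back out */
        : "=r"(seen)
        : "r"(value)
        : "r14");             /* tell the compiler r14 is overwritten */
    return seen;
#else
    /* Assumed fallback for other targets: no register involved. */
    return value;
#endif
}
```

Because r14 is named in the clobber list, the compiler will not place either operand in that register, and reading it back inside the same statement shows what the mov actually wrote.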
I then checked whether these two addresses have anything in common, and the disassembly shows the following:
000000000040ac30 <close_stdout>:
----------------
40ac45: 85 c0 test %eax,%eax
----------------
40ac6b: 85 c0 test %eax,%eax
They are the same instruction, just located at different addresses.
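For reference, comparing two addresses and comparing the bytes stored at them are separate operations. The sketch below is only an illustration: sample_code is a made-up buffer standing in for the two sites (it is not the real close_stdout code), and same_bytes and same_site are hypothetical helper names.

```c
#include <string.h>

/* Made-up buffer: the encoding of "test %eax,%eax" (85 c0) at offsets 0 and 4,
 * separated by two nop bytes (90). */
const unsigned char sample_code[] = { 0x85, 0xc0, 0x90, 0x90, 0x85, 0xc0 };

/* Returns 1 when the n bytes at a and b are identical. */
int same_bytes(const unsigned char *a, const unsigned char *b, size_t n)
{
    return memcmp(a, b, n) == 0;
}

/* Returns 1 when a and b are the same location. */
int same_site(const unsigned char *a, const unsigned char *b)
{
    return a == b;
}
```

Here same_bytes(sample_code, sample_code + 4, 2) is 1 while same_site(sample_code, sample_code + 4) is 0. A cmp on two registers holding addresses corresponds to same_site: it never reads the instruction bytes.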
This finally leads me to the following questions:
- What could be causing this? Is it because these two addresses hold equivalent instructions?
- Is the mov instruction appropriate for this kind of use? I tried the lea instruction ("lea (%%r14), %0"), but unfortunately that did not work, as the r14 register did not receive the address value.
- What other sanity checks can I do to verify that I'm loading the correct address value? My code seems to work for all of the other instructions; only this particular one is giving me trouble.
- Is there any way to "force" loading the absolute address without hard-coding the address value (e.g., function* ret_addr = (function*)0x40ac45;)?
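For the sanity-check question, one possible cross-check is to ask the compiler for the return address instead of reading %rsp by hand. A minimal sketch, assuming GCC or Clang; caller_return_address is a hypothetical helper name, and noinline is there so the function keeps its own call frame.

```c
#include <stdint.h>

/* Hypothetical helper: returns the address its caller will resume at. */
__attribute__((noinline))
uintptr_t caller_return_address(void)
{
    /* Level 0 means the return address of this function itself. */
    void *raw = __builtin_return_address(0);
    /* Undo any target-specific encoding of the raw return address. */
    return (uintptr_t)__builtin_extract_return_addr(raw);
}
```

Printing this value next to the one derived from %rsp (for example with printf and %p) would show whether the two methods agree before any inline asm is involved.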
I apologize for the lengthy explanation (I tried to keep it as short as possible), and thank you for any suggestions.
Kind regards,