I use asmjit in my C++ code and have defined the following function:
// parse asm_str to byte code, return the length of byte code
int assemble(bool isx64, unsigned long long addr, const char* asm_str, char buffer[MAX_INSTRUCTION_LENGTH])
{
    // for test, I modified param's value
    isx64 = true;
    addr = 0x6a9ec0;
    asm_str = "call 0x00007FFF1CF8CEE0";
    auto arch = isx64 ? Arch::kX64 : Arch::kX86;
    // Initialize Environment with the requested architecture.
    Environment environment;
    environment.setArch(arch);
    // Initialize CodeHolder.
    CodeHolder code;
    Error err = code.init(environment, addr);
    if (err) {
        dbg_print_err("code.init failed, reason:%s", DebugUtils::errorAsString(err));
        return 0;
    }
    x86::Assembler a(&code);
    err = AsmParser(&a).parse(asm_str, strlen(asm_str));
    if (err) {
        dbg_print_err("AsmParser(&a).parse failed, asm_str=\"%s\" addr=0x%llx reason:%s", asm_str, addr, DebugUtils::errorAsString(err));
        return 0;
    }
    else {
        CodeBuffer& buf = code.sectionById(0)->buffer();
        memcpy(buffer, buf.data(), buf.size());
        print_byte_hex(buffer, buf.size());
        return (int)buf.size();
    }
}
When I run this function, the resulting buffer is 40 E8 00 00 00 00 and no error is reported.
However, I know that this instruction cannot be encoded as a direct call at address 0x6a9ec0, because the target is more than 2 GiB away and does not fit in a 32-bit relative displacement.
So, how can I determine in code whether such an instruction was assembled successfully, or detect that the emitted byte code is wrong?
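For reference, here is a minimal sketch of the range check I have in mind. It does not use asmjit at all; `fitsRel32` is a hypothetical helper name, and the default length of 5 assumes a plain `E8 rel32` call without prefixes:

```cpp
#include <cstdint>
#include <limits>

// Returns true if a relative call/jump of `instrLen` bytes placed at `addr`
// can reach `target` with a signed 32-bit displacement.
// The displacement is measured from the address of the NEXT instruction.
bool fitsRel32(std::uint64_t addr, std::uint64_t target, std::uint64_t instrLen = 5) {
    // Unsigned subtraction wraps around; reinterpret the result as signed.
    std::int64_t disp = static_cast<std::int64_t>(target - (addr + instrLen));
    return disp >= std::numeric_limits<std::int32_t>::min() &&
           disp <= std::numeric_limits<std::int32_t>::max();
}
```

With the values from my test, `fitsRel32(0x6a9ec0, 0x00007FFF1CF8CEE0)` is false, which is the situation I would like to detect.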
I have a few more questions of this kind.
Following @Petr's suggestion, I generated the following bytecode:
FF 15 02 00 00 00 CC CC E0 CE F8 1C FF 7F 00 00
The result of disassembling this bytecode at this address using X64dbg is as follows:
00006a9ec0| FF15 02000000 | call qword ptr ds:[0x6a9ec8]
00006a9ec6| cc | int 3
00006a9ec7| cc | int 3
00006a9ec8| E0CEF81CFF7F0000 | dq 7FFF1CF8CEE0
However, this still does not solve the problem: when the call returns, execution continues at address 0x6a9ec6, so the CC CC bytes there are executed as instructions, which clearly does not match the logic of the program.
After searching for relevant information, I found that emitting the bytes directly gives the correct logic. The method is to replace the call instruction with the following instructions:
7FFB694C0960 | FF15 04000000 | call qword ptr ds:[7FFB694C096a]
7FFB694C0966 | EB 0a | jmp 7FFB694C0972
7FFB694C0968 | 48 a1 0102030405060708 | mov rax, storage address for calls
7FFB694C0972 | 90 | nop
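A compact variant of the same idea can be sketched in plain C++. This is my own illustration, not asmjit output: `encodeAbsoluteCall` is a hypothetical helper, and it assumes a shorter 16-byte layout in which the jump skips only the 8-byte address slot:

```cpp
#include <array>
#include <cstdint>

// Builds a position-independent absolute call (16 bytes):
//   FF 15 02 00 00 00   call qword ptr [rip+2]  ; reads the 8-byte slot below
//   EB 08               jmp  +8                 ; skips the slot after the call returns
//   <8 bytes>           target address, little-endian
std::array<std::uint8_t, 16> encodeAbsoluteCall(std::uint64_t target) {
    std::array<std::uint8_t, 16> out = {0xFF, 0x15, 0x02, 0x00, 0x00, 0x00, 0xEB, 0x08};
    for (int i = 0; i < 8; ++i) {
        // Store the target least significant byte first.
        out[8 + i] = static_cast<std::uint8_t>(target >> (8 * i));
    }
    return out;
}
```

Because the call reads its target through rip, the sequence works at any address, and the return address points at the `jmp` rather than at data or padding.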
As a beginner, I still have many similar problems to solve. They all occur after moving a 64-bit instruction to another address, for example:
lea rax, ds:[rip+0x9DCAA]
mov rax,ds:[rip+0x100]
Near jump within a function
Far jump
Loop instruction
...
This has made learning assembly quite difficult for me. My current solution is to replace each of these instructions with an equivalent sequence that does not change the original logic.
For example:
0x00007fff1f618d5f cmp byte ptr ds:[rip+0x1637C6], 0x0
will be replaced with:
push rax
mov rax,0x7fff1f77c52c
cmp byte ptr ds:[rax], 0x0
pop rax
Note: here rip+0x1637C6 = 0x7fff1f77c52c, where rip is the address of the next instruction.
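The address arithmetic behind this replacement can be sketched as follows. Both helper names are hypothetical, and the instruction length of 7 bytes is my assumption for this `cmp` (encoded as `80 3D disp32 imm8`). If the target is still within a signed 32-bit range of the new location, the displacement could simply be recomputed instead of rewriting the instruction:

```cpp
#include <cstdint>
#include <limits>

// Absolute address referenced by a RIP-relative operand:
// rip is the address of the next instruction, i.e. instrAddr + instrLen.
std::uint64_t ripTarget(std::uint64_t instrAddr, std::uint64_t instrLen, std::int32_t disp) {
    return instrAddr + instrLen + static_cast<std::uint64_t>(static_cast<std::int64_t>(disp));
}

// Computes the disp32 needed to reach `target` once the instruction is moved
// to `newAddr`. Returns false if the target is out of the signed 32-bit range.
bool rebaseDisp(std::uint64_t target, std::uint64_t newAddr, std::uint64_t instrLen,
                std::int32_t& newDisp) {
    std::int64_t d = static_cast<std::int64_t>(target - (newAddr + instrLen));
    if (d < std::numeric_limits<std::int32_t>::min() ||
        d > std::numeric_limits<std::int32_t>::max()) {
        return false;
    }
    newDisp = static_cast<std::int32_t>(d);
    return true;
}
```

For the example above, `ripTarget(0x00007fff1f618d5f, 7, 0x1637C6)` gives 0x7fff1f77c52c. Moving the instruction to an address such as 0x6a9ec0 makes `rebaseDisp` return false, which is the case where a replacement sequence like the push/mov/cmp/pop one is needed.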
I don't know whether doing this has any side effects. Is there a better solution using the powerful AsmJit?