I am using Pin to analyze the instructions of a C program. I compiled the program with GCC on Ubuntu and pass the resulting executable to my Pintool. The Pintool registers an instruction instrumentation routine, which inserts a call to an analysis routine before every instruction. This is my Pintool, in C++:
#include "pin.H"
#include <cstdint>
#include <cstdio>   // fprintf, printf, fopen
#include <fstream>
#include <string>
using namespace std;
UINT64 icount = 0;
KNOB<string> KnobOutputFile(KNOB_MODE_WRITEONCE, "pintool", "o", "test.out","A pin tool");
FILE * trace;
//====================================================================
// Analysis Routines
//====================================================================
VOID dump(VOID *ip, UINT32 size) {
    // An x86 instruction is at most 15 bytes long.
    UINT8 opcodeBytes[15];
    UINT32 fetched = PIN_SafeCopy(&opcodeBytes[0], ip, size);
    if (fetched != size) {
        fprintf(trace, "*** error fetching instruction at address 0x%lx",(unsigned long)ip);
        return;
    }
    fprintf(trace, "\n");
    fprintf(trace, "\n%u\n", size); // instruction size in bytes
    for (UINT32 i = 0; i < size; i++)
        fprintf(trace, " %02x", opcodeBytes[i]); // print every byte of the instruction
    fflush(trace);
}
//====================================================================
// Instrumentation Routines
//====================================================================
VOID Instruction(INS ins, void *v) {
    // Call dump() before each instruction, passing its address and size.
    INS_InsertCall(ins, IPOINT_BEFORE, (AFUNPTR)dump,
                   IARG_INST_PTR, IARG_UINT32, INS_Size(ins), IARG_END);
}
VOID Fini(INT32 code, VOID *v) {
    // Note: icount is never incremented in this listing, so this prints 0.
    printf("count = %ld\n",(long)icount);
}
INT32 Usage(VOID) {
    PIN_ERROR("This Pintool failed\n"
              + KNOB_BASE::StringKnobSummary() + "\n");
    return -1;
}
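int main(int argc, char *argv[])
{
    if (PIN_Init(argc, argv)) return Usage();
    // Open the trace file named by the -o knob (defaults to test.out).
    trace = fopen(KnobOutputFile.Value().c_str(), "w");
    if (trace == NULL) {
        fprintf(stderr, "cannot open trace file\n");
        return 1;
    }
    PIN_InitSymbols();
    // ExceptionHandler is not defined in this listing.
    PIN_AddInternalExceptionHandler(ExceptionHandler,NULL);
    INS_AddInstrumentFunction(Instruction, 0);
    PIN_AddFiniFunction(Fini, 0);
    // Never returns
    PIN_StartProgram();
    return 0;
}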
When I check the output trace, I see output like this:
3
48 89 e7
5
e8 78 0d 00 00
1
55
In each pair of lines, the first line is the size of the instruction in bytes and the second line lists the bytes of that instruction in hexadecimal.
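One way to sanity-check records like these is to split each one into its fields by hand. The sketch below is a standalone helper, not part of Pin; the names `Record` and `split_record` are made up for illustration. It assumes 64-bit mode and no legacy prefixes, and it only understands a REX prefix, the ModRM byte after opcode 0x89, and the 32-bit displacement after opcode 0xe8.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical summary of one trace record (illustration only, not a Pin type).
struct Record {
    bool rex = false;          // first byte is a REX prefix (0x40-0x4f)
    bool rex_w = false;        // REX.W bit set: 64-bit operand size
    std::uint8_t opcode = 0;   // first opcode byte after the optional REX prefix
    bool has_modrm = false;    // ModRM fields below are valid
    std::uint8_t mod = 0, reg = 0, rm = 0;
    bool has_rel32 = false;    // rel32 below is valid
    std::int32_t rel32 = 0;    // little-endian displacement of a relative call
};

// Splits the bytes of one instruction, assuming 64-bit mode and no legacy prefixes.
Record split_record(const std::uint8_t *bytes, std::size_t size) {
    Record r;
    std::size_t i = 0;
    if (size == 0) return r;
    if ((bytes[0] & 0xf0) == 0x40) {           // REX prefixes occupy 0x40-0x4f
        r.rex = true;
        r.rex_w = (bytes[0] & 0x08) != 0;
        i = 1;
    }
    if (i >= size) return r;
    r.opcode = bytes[i++];
    if (r.opcode == 0x89 && i < size) {        // mov r/m, r: a ModRM byte follows
        r.has_modrm = true;
        r.mod = bytes[i] >> 6;
        r.reg = (bytes[i] >> 3) & 7;
        r.rm = bytes[i] & 7;
    } else if (r.opcode == 0xe8 && size - i == 4) {  // call rel32
        std::uint32_t u = static_cast<std::uint32_t>(bytes[i])
                        | (static_cast<std::uint32_t>(bytes[i + 1]) << 8)
                        | (static_cast<std::uint32_t>(bytes[i + 2]) << 16)
                        | (static_cast<std::uint32_t>(bytes[i + 3]) << 24);
        r.has_rel32 = true;
        r.rel32 = static_cast<std::int32_t>(u);
    }
    return r;
}
```

For `48 89 e7` this reports a REX prefix with W=1, opcode 0x89, and ModRM fields mod=3, reg=4, rm=7, which under these assumptions reads as `mov rdi, rsp`. For `e8 78 0d 00 00` it reports a relative call with displacement 0xd78.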
I found this forum thread: https://groups.yahoo.com/neo/groups/pinheads/conversations/topics/4405#
It says that the output on Linux is inconsistent because a 32-bit disassembler is being applied to 64-bit instructions. My output looks like the Linux output shown there, whereas the Windows output there shows the x86_64 opcodes I expect.
Any idea how I can get the correct opcodes? If I am doing the disassembly wrong, how can I correct it? I am on a 64-bit PC, so I don't know whether I am somehow doing 32-bit disassembly.
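On the 32-bit versus 64-bit concern, one thing that can be checked directly from the raw bytes is the leading byte. Bytes 0x40 to 0x4f are REX prefixes in 64-bit mode, but one-byte `inc`/`dec` register instructions in 32-bit mode, so the same byte stream disassembles differently depending on the mode the disassembler assumes. The sketch below is a standalone illustration; `Mode` and `classify_first_byte` are hypothetical names, not Pin API, and the 32-bit case assumes the default 32-bit operand size.

```cpp
#include <cstdint>
#include <string>

// Hypothetical decoding mode selector (illustration only).
enum class Mode { Bits32, Bits64 };

// Describes how a leading byte in the 0x40-0x4f range is read in each mode.
std::string classify_first_byte(std::uint8_t b, Mode mode) {
    if (b >= 0x40 && b <= 0x4f) {
        if (mode == Mode::Bits64) {
            // In 64-bit mode these bytes are REX prefixes; bit 3 is REX.W.
            return (b & 0x08) ? "REX prefix (W=1)" : "REX prefix (W=0)";
        }
        // In 32-bit mode 0x40-0x47 are inc r32 and 0x48-0x4f are dec r32.
        return b <= 0x47 ? "inc r32" : "dec r32";
    }
    return "not in the 0x40-0x4f range";
}
```

For example, `classify_first_byte(0x48, Mode::Bits64)` yields `"REX prefix (W=1)"`, while the same byte with `Mode::Bits32` yields `"dec r32"`. A trace whose records begin with bytes such as 0x48 is therefore at least consistent with 64-bit code.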