
I am using Intel Pin to analyze the assembly instructions of a C program. I have a simple C program that prints "Hello World", which I compiled into an executable. Here is the disassembly of main from gdb:

Dump of assembler code for function main:
   0x0000000000400526 <+0>:     push   %rbp
   0x0000000000400527 <+1>:     mov    %rsp,%rbp
=> 0x000000000040052a <+4>:     mov    $0x4005c4,%edi
   0x000000000040052f <+9>:     mov    $0x0,%eax
   0x0000000000400534 <+14>:    callq  0x400400 <printf@plt>
   0x0000000000400539 <+19>:    mov    $0x0,%eax
   0x000000000040053e <+24>:    pop    %rbp
   0x000000000040053f <+25>:    retq   
End of assembler dump.

I ran a Pintool on the executable to trace its instructions and print the instruction count. I want to trace only the instructions that come from my C program, ideally get their machine opcodes, and run some analysis on them. This is the C++ Pintool I am using to count instructions:

#include "pin.H"
#include <iostream>
#include <stdio.h>

UINT64 icount = 0;
using namespace std;

//====================================================================
// Analysis Routines
//====================================================================

void docount(THREADID tid) {
    icount++;
}

//====================================================================
// Instrumentation Routines
//====================================================================

VOID Instruction(INS ins, void *v) {
    INS_InsertCall(ins, IPOINT_BEFORE, (AFUNPTR)docount, IARG_THREAD_ID, IARG_END);
}

VOID Fini(INT32 code, VOID *v) {
    printf("count = %ld\n",(long)icount);
}

INT32 Usage() {
    PIN_ERROR("This Pintool failed\n"
              + KNOB_BASE::StringKnobSummary() + "\n");
    return -1;
}

int main(int argc, char *argv[]) {
    if (PIN_Init(argc, argv)) return Usage();

    PIN_InitSymbols();
    INS_AddInstrumentFunction(Instruction, 0);
    PIN_AddFiniFunction(Fini, 0);
    PIN_StartProgram();

    return 0;
}

When I run my hello world program under this tool, I get icount = 81563. I understand that Pin adds its own instructions for analysis, but I don't understand how the count gets so high when my C program has no more than 10 instructions. Also, is there a way to tell the assembly instructions that come from my code apart from the ones generated by Pin? I can't find any way to differentiate between the two. Please help!

Rohit Poduri
  • I am not familiar with PIN but presumably it's also counting the instructions in the C library. – Jester Oct 06 '17 at 00:34
  • Try making a static executable that just makes an `exit` system call directly. (e.g. take out the loop from my [mov elimination microbenchmark](https://stackoverflow.com/questions/44169342/can-x86s-mov-really-be-free-why-cant-i-reproduce-this-at-all/44193770#44193770)). – Peter Cordes Oct 06 '17 at 00:51
  • @PeterCordes Whatever the contents of my C program, icount is always above 80k. I don't know if there is a way to differentiate between the machine instructions and the ones generated by PIN. – Rohit Poduri Oct 06 '17 at 01:45
  • The code for `printf` isn't generated by PIN, it's instructions that your program runs on its own. So is the CRT start and exit code. So try PIN on a program that just exits right away without the CRT and without calling any library functions. – Peter Cordes Oct 06 '17 at 01:48
  • @PeterCordes, I still get a similar number as before. I suspect it is due to some libraries of Pin – Rohit Poduri Oct 06 '17 at 06:13

1 Answer

You're not measuring what you think you're measuring. See my answer to "What instructions 'instCount' Pin tool counts?" for details.

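Pin does not count its own instructions. Every instruction in the count is one that your process executes on its own: the large number comes from the code that runs before and after main() (the dynamic loader and the C runtime startup and exit code) and from the call to printf() itself.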
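
If the goal is to count only the instructions that belong to `main`, one option is to filter by instruction address. The sketch below models just that filtering step in plain C++, without the Pin API. `AddressRange` and `countInRange` are hypothetical names introduced for illustration, and the range is taken from the gdb dump in the question (`main` starts at 0x400526 and its last instruction, the one-byte `retq`, is at 0x40053f). In a real Pintool I would expect the address to come from `INS_Address(ins)` and the "is this my program" test from something like `IMG_FindByAddress` plus `IMG_IsMainExecutable`, but treat that wiring as an assumption, since it is not shown here.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical helper type: a half-open address range [low, high).
struct AddressRange {
    std::uint64_t low;
    std::uint64_t high;

    AddressRange(std::uint64_t lo, std::uint64_t hi) : low(lo), high(hi) {
        assert(lo <= hi);  // an inverted range would never match anything
    }

    // True if addr lies inside [low, high).
    bool contains(std::uint64_t addr) const {
        return addr >= low && addr < high;
    }
};

// Hypothetical helper: count how many executed instruction addresses
// in `trace` fall inside `range`. Addresses outside the range (PLT stubs,
// libc, the dynamic loader, CRT code) are skipped.
std::uint64_t countInRange(const std::vector<std::uint64_t>& trace,
                           const AddressRange& range) {
    std::uint64_t n = 0;
    for (std::uint64_t addr : trace) {
        if (range.contains(addr)) {
            ++n;
        }
    }
    return n;
}
```

With the range from the dump, `AddressRange(0x400526, 0x400540)` accepts 0x40052a and rejects the `printf@plt` stub at 0x400400, so only the instructions of `main` itself would be counted.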

nitzanms
  • I [suggested](https://stackoverflow.com/questions/46596570/tracking-native-instructions-in-intel-pin#comment80146865_46596570) the OP should try making a static executable that just makes a `sys_exit` system call from `_start`. The OP says they got a "similar number of instructions", but your answer indicates they probably did it wrong. – Peter Cordes Oct 08 '17 at 20:35
  • I certainly suspect that this is the case. I haven't done it for a while but I remember that making such an executable is a little tricky. – nitzanms Oct 09 '17 at 11:03
  • Not if you do it all the time for microbenchmarking anyway :P See https://stackoverflow.com/questions/44169342/can-x86s-mov-really-be-free-why-cant-i-reproduce-this-at-all/44193770#44193770 (and take out the loop, leaving just `_start: xor edi,edi` / `mov eax, 231` / `syscall`.) See https://stackoverflow.com/questions/36861903/assembling-32-bit-binaries-on-a-64-bit-system-gnu-toolchain/36901649 for how to assemble + link NASM source into a static executable (no libc). – Peter Cordes Oct 09 '17 at 11:13