Add your own instructions using Pin

Question

Is it possible to add your own code to the code generated by Intel Pin?

I have been wondering about this for a while. I created a simple tool:

#include <fstream>
#include <iostream>
#include <sstream>
#include <string>
#include "pin.H"

// Additional library calls go here

/*********************/

// Output file object
std::ofstream OutFile;

//static uint64_t counter = 0;

uint32_t lock = 0;
uint32_t unlock = 1;
std::string rtin = "";
// Make this lock if you want to print from _start
uint32_t key = unlock;

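// Analysis routine: print the address and disassembly of an instruction.
// disassins points to the string allocated in Instruction() and passed via IARG_PTR.
void printmaindisas(uint64_t addr, const std::string *disassins)
{
    // key != 0 means printing is disabled (we are outside main)
    if (key)
        return;
    // Skip high addresses (e.g. shared-library code)
    if (addr > 0x700000000000)
        return;
    std::stringstream tempstream;
    tempstream << std::hex << addr;
    std::cout << tempstream.str() << "\t" << *disassins << std::endl;
}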

// Called when main returns: disable printing
void mutex_lock()
{
    key = !lock;
    std::cout << "out\n";
}

// Called when main is entered: enable printing
void mutex_unlock()
{
    key = lock;
    std::cout << "in\n";
}

void Instruction(INS ins, VOID *v)
{
    // Insert a call to printmaindisas before every instruction, passing the
    // instruction address and a heap-allocated copy of its disassembly
    INS_InsertCall(ins, IPOINT_BEFORE, (AFUNPTR)printmaindisas,
                   IARG_ADDRINT, INS_Address(ins),
                   IARG_PTR, new std::string(INS_Disassemble(ins)),
                   IARG_END);
}

void Routine(RTN rtn, VOID *V)
{
    if (RTN_Name(rtn) == "main")
    {
        //std::cout<<"Loading: "<<RTN_Name(rtn) << endl;
        RTN_Open(rtn);
        RTN_InsertCall(rtn, IPOINT_BEFORE, (AFUNPTR)mutex_unlock, IARG_END);
        RTN_InsertCall(rtn, IPOINT_AFTER, (AFUNPTR)mutex_lock, IARG_END);
        RTN_Close(rtn);
    }
}

KNOB<std::string> KnobOutputFile(KNOB_MODE_WRITEONCE, "pintool", "o", "mytool.out", "specify output file name");
/*
VOID Fini(INT32 code, VOID *v)
{
    // Write to a file since cout and cerr maybe closed by the application
    OutFile.setf(ios::showbase);
    OutFile << "Count " << count << endl;
    OutFile.close();
}
*/

int32_t Usage()
{
  std::cerr << "This is my custom tool" << std::endl;
  std::cerr << std::endl << KNOB_BASE::StringKnobSummary() << std::endl;
  return -1;
}

int main(int argc, char * argv[])
{
  // It must be called for image instrumentation
  // Initialize the symbol table
  PIN_InitSymbols();

  // Initialize pin
  if (PIN_Init(argc, argv)) return Usage();
  // Open the output file to write
  OutFile.open(KnobOutputFile.Value().c_str());

  // Set instruction format as intel
    // Not needed because my machine is intel
  //PIN_SetSyntaxIntel();

  RTN_AddInstrumentFunction(Routine, 0);
  //IMG_AddInstrumentFunction(Image, 0);

  // Add instruction-level instrumentation
  INS_AddInstrumentFunction(Instruction, 0);

  //PIN_AddFiniFunction(Fini, 0);

  // Start the program here
  PIN_StartProgram();

  return 0;

}

If I run the tool on the following C program (which does literally nothing):

int main(void)
{}

it gives me this output:

in
400496  push rbp
400497  mov rbp, rsp
40049a  mov eax, 0x0
40049f  pop rbp
out

And with the following code:

#include <stdio.h>
int main(void)
{
  printf("%s\n", "Hello");
}

it prints:

in
4004e6  push rbp
4004e7  mov rbp, rsp
4004ea  mov edi, 0x400580
4004ef  call 0x4003f0
4003f0  jmp qword ptr [rip+0x200c22]
4003f6  push 0x0
4003fb  jmp 0x4003e0
4003e0  push qword ptr [rip+0x200c22]
4003e6  jmp qword ptr [rip+0x200c24]
Hello
4004f4  mov eax, 0x0
4004f9  pop rbp
out

So, my question is, is it possible to add:

4004ea  mov edi, 0x400580
4004ef  call 0x4003f0
4003f0  jmp qword ptr [rip+0x200c22]
4003f6  push 0x0
4003fb  jmp 0x4003e0
4003e0  push qword ptr [rip+0x200c22]
4003e6  jmp qword ptr [rip+0x200c24]

instructions to my first program (the one without the print call), using Pin in an instrumentation or analysis routine, so that it imitates my second program by adding those instructions dynamically? I don't want to call printf directly; I want to imitate its behavior. In the future, I would like to imitate a sanitizer or Intel MPX using Pin, if I could add such check instructions dynamically in some way.

I looked at the Pin documentation. It has an instruction modification API, but that can only be used to add direct/indirect branches or to delete instructions; it cannot add new ones.

Pin analysis routines are instructions added by you to the code. You could insert assembly routines, but I'm guessing what you really want to do is perform instructions in application context instead of tool context? - nitzanms, Mar 06 '19 at 21:31
OK. First, thanks for your reply. GCC provides a lot of security measures (say AddressSanitizer, MPX); some are hardware supported and some are software only. These checks can be added during compilation if you have the source code. Let's say I don't have the source code, only the binary. My plan is to add those security measures dynamically using Pin, during the execution of the binary, so I am not sure whether that would be in application or tool context. - R4444, Mar 06 '19 at 21:54
You need application context. However, the tools you mention require compile-time information that is not available in the compiled result. - nitzanms, Mar 09 '19 at 07:42
Correct. I was thinking of approximating that information (so the implementation is not necessarily foolproof), something like: if the individual array bounds are not known, assume the whole stack as their bounds, etc. Could you give me any suggestions on how this information can be added? I would appreciate any API functions or methods that would help me in this quest. P.S.: I tried to use asm() to add inline assembly before, but I don't think that can be used. - R4444, Mar 09 '19 at 15:38

Hadi Brais · Accepted Answer · 2019-03-28T04:56:07.460

An analysis routine (or replacement routine) is really just code inserted into the application being profiled. But it appears to me that you want to modify one or more registers of the application context. By default, when an analysis routine executes, the Pin runtime saves the application context on entrance to the analysis routine and then later restores it when the routine returns. This basically allows the analysis routine to execute without any unintended changes to the application. However, Pin provides three ways to modify the application context in an analysis or replacement routine:

1. Pass the IARG_RETURN_REGS argument to the routine. The value returned from the routine is stored into the specified register of the application context. This enables you to change any single register whose size does not exceed the size of ADDRINT, which is the return value type of the routine. This is not supported in Probe mode or with the Buffering API¹. However, it is the most efficient way to change a single register.
2. Pass an IARG_REG_REFERENCE argument for each register you want to modify in the routine. For each such argument, you need to add a parameter of type PIN_REGISTER* to the declaration of the routine. This is not supported in Probe mode or with the Buffering API, but it is the most efficient way to change a couple of registers and supports all registers.
3. Pass the IARG_CONTEXT argument to the routine. You need to add a parameter of type CONTEXT* to the declaration of the routine. Use the context manipulation API to change one or more registers of the application context. For example, you can change the RIP register of the application context using PIN_SetContextReg(ctxt, REG_INST_PTR, NewRipValue). In order for the context changes to take effect, PIN_ExecuteAt must be called, which resumes the execution of the application at the potentially changed RIP with the specified context. This is not supported with the Buffering API and there are restrictions in Probe mode.

For example, if you want to execute mov edi, 0x400580 in the application context, you can simply store the value 0x400580 in the EDI register of the application context in your analysis routine:

r->dword[0] = 0x400580;
r->dword[1] = 0x0;      // See: https://stackoverflow.com/questions/11177137/why-do-x86-64-instructions-on-32-bit-registers-zero-the-upper-part-of-the-full-6

where r is of type PIN_REGISTER*. Or alternatively:

PIN_SetContextReg(ctxt, REG_EDI, 0x400580); // https://stackoverflow.com/questions/38782709/what-is-the-default-type-of-integral-literals-represented-in-hex-or-octal-in-c

Later when application execution resumes, RDI will contain 0x400580.
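The zero-extension detail can be checked without Pin. The following is a minimal sketch in plain C++, not Pin code: FakeRegister, Value64, EmulateMovImm32, and WriteLowLaneOnly are hypothetical names, and the struct only imitates the two dword lanes of a PIN_REGISTER that holds RDI. It shows why the upper lane has to be cleared to match what mov edi, imm32 does on x86-64.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the two 32-bit lanes of a PIN_REGISTER holding a
// 64-bit general-purpose register (lane 0 is the low half, as on x86-64).
struct FakeRegister {
    std::uint32_t dword[2];
};

// Combine both lanes into the 64-bit value the application would observe.
std::uint64_t Value64(const FakeRegister& r) {
    return (static_cast<std::uint64_t>(r.dword[1]) << 32) | r.dword[0];
}

// Emulate `mov edi, imm32`: write the low lane and clear the high lane,
// because a 32-bit register write zero-extends into the full 64-bit register.
void EmulateMovImm32(FakeRegister* r, std::uint32_t imm) {
    assert(r != nullptr);
    r->dword[0] = imm;
    r->dword[1] = 0;
}

// Write only the low lane; any stale upper 32 bits are left untouched.
void WriteLowLaneOnly(FakeRegister* r, std::uint32_t imm) {
    assert(r != nullptr);
    r->dword[0] = imm;
}
```

Starting from a register whose upper half holds stale bits, EmulateMovImm32 leaves the 64-bit value equal to the immediate, whereas WriteLowLaneOnly keeps the old upper 32 bits in place.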

Note that you can change any valid memory location in your analysis routine whether it belongs to the application or your Pin tool. For example, if the RAX register of the application context contains a pointer, you can directly access the memory location at that pointer just like any other pointer.
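As a rough illustration of that point, here is a sketch in plain C++ with no Pin headers. PokeThroughRegister is a hypothetical helper, and std::uintptr_t stands in for Pin's ADDRINT; in a real tool the integer would arrive as an analysis-routine argument, for instance via IARG_REG_VALUE, REG_RAX. The sketch assumes the register really holds a valid, writable, suitably aligned pointer to an int.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical analysis-routine body: raxValue is the integer value of a
// register that the application uses as a pointer to an int.
// Overwrites the pointed-to int and returns the value it held before.
int PokeThroughRegister(std::uintptr_t raxValue, int newValue) {
    assert(raxValue != 0);                      // a null pointer would fault
    int* p = reinterpret_cast<int*>(raxValue);  // treat the register value as a pointer
    int old = *p;                               // read application memory
    *p = newValue;                              // write application memory
    return old;
}
```

When the pointer might not be valid, a raw dereference like this can crash the process; Pin's PIN_SafeCopy is probably the safer choice in that case.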

Footnotes:

(1) It seems you're not using the Probe mode or the Buffering API.

I don't know PIN, but wouldn't you need to explicitly zero the upper 32 bits after `r->dword[0] = 0x400580;` sets the low dword, if you want to actually get the effect of `mov edi, imm32`? — Peter Cordes, Mar 28 '19 at 04:37
