I am using Pin and I want to get the value that a store instruction writes. I can insert a callback before the store (using IPOINT_BEFORE) and read the memory at the address that will be written, but that value is obviously stale because the store hasn't happened yet. I cannot combine IARG_MEMORYWRITE_EA with IPOINT_AFTER.

I have managed to make it work for load instructions, since the value is already in memory. The code for that is below.

void Read(THREADID tid, ADDRINT addr, ADDRINT inst){

  PIN_GetLock(&globalLock, 1);

  ADDRINT * addr_ptr = (ADDRINT*)addr;
  ADDRINT value;
  PIN_SafeCopy(&value, addr_ptr, sizeof(ADDRINT));

  fprintf(stderr,"Read: ADDR, VAL: %lx, %lu\n", addr, value);

  .
  .
  .

  PIN_ReleaseLock(&globalLock);
}

VOID instrumentTrace(TRACE trace, VOID *v)
{

  for (BBL bbl = TRACE_BblHead(trace); BBL_Valid(bbl); bbl = BBL_Next(bbl)) {
    for (INS ins = BBL_InsHead(bbl); INS_Valid(ins); ins = INS_Next(ins)) {  
      if(INS_IsMemoryRead(ins)) {
        INS_InsertCall(ins,
                       IPOINT_BEFORE,
                       (AFUNPTR)Read,
                       IARG_THREAD_ID,
                       IARG_MEMORYREAD_EA,
                       IARG_INST_PTR,
                       IARG_END);
      } else if(INS_IsMemoryWrite(ins)) {
        INS_InsertCall(ins,
                       IPOINT_BEFORE,
                       (AFUNPTR)Write,
                       IARG_THREAD_ID,      // thread id
                       IARG_MEMORYWRITE_EA, // address being accessed
                       IARG_INST_PTR,       // instruction address of write
                       IARG_END);
      }
    }
  }
}

How can I grab the value that a store instruction writes to memory?

vic
  • In multi-threaded code, the value you read from the memory location at one time isn't necessarily the same value that shows up in a register when you let the instruction actually execute. Of course, when the instruction isn't a simple `mov` load or store of a register, the load/store data never appears in an architectural register. e.g. `add [rsi], eax` stores the add result (after loading and producing it in a hidden internal temporary). – Peter Cordes Mar 07 '18 at 02:19
  • What I want to do is to maintain a virtual cache. I am already using a [Cache Simulator](https://github.com/blucia0a/MultiCacheSim) that keeps track of the tags and the coherence state of all lines. But I have to actually populate that virtual cache with the values that the instrumented program uses. For reads, I can already do that. Do you have any suggestion about how to get the value that a store instruction will write? I don't necessarily need to get it from memory after it is written, I assume. If there is a way to get the data that a store instruction will write is fine for me. – vic Mar 07 '18 at 02:24
  • IDK, I haven't used PIN at all. But are you sure you need to simulate valid data for your cache? If you just want to simulate cache hits/misses, you don't need to track the data contents at all, just the tag / MESIF states of each line. Unless you're trying to simulate [silent store optimizations](https://stackoverflow.com/questions/47417481/what-specifically-marks-an-x86-cache-line-as-dirty-any-write-or-is-an-explici) or something else that produces data-dependent cache dirtying or invalidation. – Peter Cordes Mar 07 '18 at 04:23
  • Anyway, what do you want to be able to do with this "virtual cache" you're maintaining? If you do need the data, different use-cases might or might not care about race conditions between reading the real load/store data vs. memory contents before / after. – Peter Cordes Mar 07 '18 at 04:27
  • I need the data for some crosschecking between lines at the Invalid state and the correct ones that will be brought by the coherence protocol. I tried catching the register values of the write instructions, but then again not all instructions use registers. Some of them have immediate values. – vic Mar 07 '18 at 21:31

1 Answer


I think I managed to do what I wanted. Every time the program executes a store, I save the memory address that it will write to. I then instrument every single instruction with a call to the WriteData function, which reads the data from the address that I previously saved, just as Read does for loads.

This is the code for getting the value of a load instruction.

void Read(THREADID tid, ADDRINT addr, ADDRINT inst){

  PIN_GetLock(&globalLock, 1);

  ADDRINT * addr_ptr = (ADDRINT*)addr;
  ADDRINT value;
  PIN_SafeCopy(&value, addr_ptr, sizeof(ADDRINT));

  fprintf(stderr,"Read: ADDR, VAL: %lx, %lx\n", addr, value);    
  ...          
  PIN_ReleaseLock(&globalLock);
}

This is the code for grabbing the address of a store instruction.

void Write(THREADID tid, ADDRINT addr, ADDRINT inst ){    

  PIN_GetLock(&globalLock, 1); 

  writeaddr = addr;
  writecount++;    
  ...    
  PIN_ReleaseLock(&globalLock);
}

This is the code for getting the data from the address of the previous store.

void WriteData(){ 

  PIN_GetLock(&globalLock, 1);

  //Reading from memory      
  if (writecount > 0){

    ADDRINT * addr_ptr = (ADDRINT*)writeaddr;
    ADDRINT value;
    PIN_SafeCopy(&value, addr_ptr, sizeof(ADDRINT));

    fprintf(stderr,"Write: ADDR, Value: %lx, %lx\n", writeaddr, value);  

    writecount--;
  }

  PIN_ReleaseLock(&globalLock);

}

But a minor problem remains. Below is the code of the microbenchmark that I use, followed by the output printed in the terminal.

for (i = 0; i < MAX; i++) {
  a[i] = i;
}

for (i = 0; i < MAX; i++) {
  a[i] = a[i] + 1;
  b[i] = a[i];
}

MAX is 5.

Write: ADDR, Value: 601078, 6f
Read: ADDR, VAL: 7ffd0560de10, 40051b
Write: ADDR, Value: 601080, 0
Write: ADDR, Value: 601084, 1
Write: ADDR, Value: 601088, 2
Write: ADDR, Value: 60108c, 3
Write: ADDR, Value: 601090, 4
Read: ADDR, VAL: 601080, 100000000
Write: ADDR, Value: 601080, 100000001
Write: ADDR, Value: 601060, 1
Read: ADDR, VAL: 601084, 200000001
Write: ADDR, Value: 601084, 200000002
Write: ADDR, Value: 601064, 2
Read: ADDR, VAL: 601088, 300000002
Write: ADDR, Value: 601088, 300000003
Write: ADDR, Value: 601068, 3
Read: ADDR, VAL: 60108c, 400000003
Write: ADDR, Value: 60108c, 400000004
Write: ADDR, Value: 60106c, 4
Read: ADDR, VAL: 601090, 4
Write: ADDR, Value: 601090, 5
Write: ADDR, Value: 601070, 5

From the terminal output, the first writes to a[i] happen as expected. But when the program then reads the same addresses, instead of getting 0, 1, 2 and so on, it gets 100000000, 200000001 and so on. The values are correctly incremented by 1, and when they are stored to b[i] they are correct again. So I am wondering why I see this behaviour in the data I get from reads.
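One likely explanation, offered as an assumption rather than a confirmed diagnosis: the addresses of a[i] are 4 bytes apart, so the elements look like 4-byte ints, while Read and WriteData always copy sizeof(ADDRINT) bytes, which is 8 on a 64-bit target. Each copy would then also pick up the neighbouring element in the upper 32 bits. The sketch below is plain C++ rather than Pin code; read_value and sample_array are hypothetical names, and a little-endian machine is assumed. In a Pin tool, the access width could be passed to the analysis routine with IARG_MEMORYREAD_SIZE or IARG_MEMORYWRITE_SIZE and used as the PIN_SafeCopy length.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical helper: copy `size` bytes (at most 8) starting at `addr`
// into a zero-initialised 64-bit value, the way an analysis routine
// could if it were told the real access width.
inline std::uint64_t read_value(const void* addr, std::size_t size) {
    assert(size <= sizeof(std::uint64_t));
    std::uint64_t value = 0;
    std::memcpy(&value, addr, size);
    return value;
}

// Assumed contents of the microbenchmark's array a after its first loop
// (MAX is 5). One zero element is appended so that an 8-byte read of the
// last real element stays inside the array.
inline const std::int32_t* sample_array() {
    static const std::int32_t a[6] = {0, 1, 2, 3, 4, 0};
    return a;
}
```

Under these assumptions, read_value(&sample_array()[0], 8) yields 0x100000000, the same shape as the first Read line in the output, while read_value(&sample_array()[0], 4) yields 0, the value the program itself sees.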

vic
  • Where do you call the WriteData function? When I try to call it at IPOINT_AFTER, it gives me an error indicating that it `cannot insert IPOINT_AFTER on an instruction without a fall-through path`. – Faridzs Mar 15 '18 at 14:02
  • I have another if statement that checks whether an instruction is a memory read or a memory write. That's where I call WriteData. – vic Mar 16 '18 at 20:51
  • There's another problem. There are situations in which `IPOINT_BEFORE` of many Write instructions are called before their `IPOINT_AFTER` is called and the `writecount` variable may become more than one. In such scenarios, `writeaddr` will be overwritten by the new address. – Faridzs Mar 18 '18 at 08:18
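
One possible way to avoid the overwrite described in the last comment is to keep the pending store per thread instead of in the single global writeaddr/writecount pair. The sketch below is plain C++, not Pin code: PendingStore and PendingStores are hypothetical names, tid stands for the value Pin passes for IARG_THREAD_ID, and it assumes each thread has at most one store outstanding between the "before" and "after" callbacks.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <mutex>

// Hypothetical record of one store whose value has not been read back yet.
struct PendingStore {
    std::uintptr_t addr;
    std::size_t size;
};

// Hypothetical replacement for the global writeaddr/writecount pair:
// every thread id owns its own slot, so one thread cannot overwrite
// the address saved by another.
class PendingStores {
public:
    // Intended to be called from the routine that runs before a store.
    void record(unsigned tid, std::uintptr_t addr, std::size_t size) {
        assert(size > 0);  // a store writes at least one byte
        std::lock_guard<std::mutex> guard(lock_);
        pending_[tid] = PendingStore{addr, size};
    }

    // Intended to be called from the routine that runs after the store.
    // Copies the saved entry to `out`, clears the slot and returns true,
    // or returns false if this thread has no store pending.
    bool take(unsigned tid, PendingStore& out) {
        std::lock_guard<std::mutex> guard(lock_);
        std::map<unsigned, PendingStore>::iterator it = pending_.find(tid);
        if (it == pending_.end()) return false;
        out = it->second;
        pending_.erase(it);
        return true;
    }

private:
    std::mutex lock_;
    std::map<unsigned, PendingStore> pending_;
};
```

If thread 0 records 0x601080 and thread 1 then records 0x601060, take(0, s) still returns 0x601080, and a second take(0, s) returns false because the slot was cleared.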