I want to write a function computeWriteSet that takes an arbitrary function f as an argument and (1) executes the function f and (2) returns the set of places modified or written to (addresses/pages/objects) during f's execution.
writeset computeWriteSet(function f) {
    writeset ws = createEmptyWriteset();
    // prepare to execute f
    startRecordingWrites(&ws);
    f();
    stopRecordingWrites(&ws);
    // post-process the write-set
    return ws;
}
- What options exist for implementing it?
- What are their tradeoffs (in which cases is each implementation more efficient, and what are its limitations)?
Notes
The function is specified at runtime and can do anything (i.e. it can contain any set of instructions, including loops, branching and function/system calls).
All writes from the time f is called until it returns should be recorded (this includes functions called from within f itself). For simplicity, let's assume computeWriteSet is not called from within f.
OS-specific tricks are allowed (and probably required). I'm particularly interested in Linux, ideally within userspace.
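To illustrate the kind of Linux userspace trick I have in mind, here is a minimal sketch of page-granularity tracking: write-protect a region with mprotect, catch the resulting SIGSEGV, record the faulting page, and unprotect it so the write can proceed. All names (track_page_writes, dirty_pages, demo_page_tracking, ...) are hypothetical, and the sketch assumes a single thread, a single page-aligned region obtained from mmap, and that calling mprotect from a signal handler is acceptable (it works on Linux in practice but is not on the POSIX async-signal-safe list).

```c
#define _GNU_SOURCE
#include <assert.h>
#include <signal.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <unistd.h>

#define MAX_DIRTY 1024

/* Hypothetical tracker state: one tracked region and the pages written in it. */
static char *region_base;
static size_t region_len;
static size_t page_size;
static void *dirty_pages[MAX_DIRTY];
static volatile size_t dirty_count;

/* SIGSEGV handler: record the faulting page, then make it writable again. */
static void on_segv(int sig, siginfo_t *info, void *ctx) {
    (void)sig; (void)ctx;
    char *addr = (char *)info->si_addr;
    if (addr < region_base || addr >= region_base + region_len ||
        dirty_count >= MAX_DIRTY) {
        /* Not a fault we can handle: restore the default action so the
           retried instruction crashes as a normal segfault. */
        signal(SIGSEGV, SIG_DFL);
        return;
    }
    char *page = (char *)((uintptr_t)addr & ~(uintptr_t)(page_size - 1));
    dirty_pages[dirty_count++] = page;
    /* Assumption: mprotect is safe to call here (true on Linux in practice). */
    mprotect(page, page_size, PROT_READ | PROT_WRITE);
}

/* Runs f with [base, base+len) write-protected. base must be page-aligned and
   len a multiple of the page size. Returns the number of distinct pages
   written, or (size_t)-1 on setup failure. */
static size_t track_page_writes(void *base, size_t len, void (*f)(void)) {
    struct sigaction sa = {0}, old;
    page_size = (size_t)sysconf(_SC_PAGESIZE);
    region_base = base;
    region_len = len;
    dirty_count = 0;
    sa.sa_sigaction = on_segv;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = SA_SIGINFO;
    if (sigaction(SIGSEGV, &sa, &old) != 0) return (size_t)-1;
    if (mprotect(base, len, PROT_READ) != 0) {
        sigaction(SIGSEGV, &old, NULL);
        return (size_t)-1;
    }
    f();
    mprotect(base, len, PROT_READ | PROT_WRITE);
    sigaction(SIGSEGV, &old, NULL);
    return dirty_count;
}

/* Demo: f writes twice to the first page and once to the third page. */
static volatile char *demo_buf;

static void demo_writer(void) {
    demo_buf[0] = 1;
    demo_buf[1] = 2;             /* same page: already unprotected, no fault */
    demo_buf[2 * page_size] = 3; /* third page: one more fault */
}

static size_t demo_page_tracking(void) {
    size_t ps = (size_t)sysconf(_SC_PAGESIZE);
    void *buf = mmap(NULL, 4 * ps, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    assert(buf != MAP_FAILED);
    demo_buf = buf;
    return track_page_writes(buf, 4 * ps, demo_writer);
}
```

Calling demo_page_tracking() should leave the first and third page of the buffer in dirty_pages. The cost is one fault per page touched, and this sketch does not cover globals, the stack, or memory mapped during f, which is exactly the kind of limitation I would like answers to discuss.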
Example
static int x = 0;
static int y = 0;
static int z = 0;
void a() {
    if (y) z++;
    if (x) y++;
    x = (x + 1) % 2;
}
int main() {
    computeWriteSet(a); // returns { &x } => {x,y,z} = {1, 0, 0}
    computeWriteSet(a); // returns { &x, &y } => {x,y,z} = {0, 1, 0}
    computeWriteSet(a); // returns { &x, &z } => {x,y,z} = {1, 1, 1}
    return 0;
}
Expected Output
The output should be the set of changes. This can be either the set of memory addresses:
{ <address of x>, <address of y>, …}
Or the set of pages:
{<page of x and y>, <page of z>, …}
Or the set of objects (based on interposition of allocation functions):
x = malloc(100) // returns address 0xAAA
y = malloc(200) // returns address 0xBBB
…
{ {address, size}, {0xAAA, 100}, {0xBBB, 200}, … }
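For the object-level variant, a rough sketch of what I mean: a hypothetical tracked_malloc stands in for an interposed malloc and records {address, size}; the write set is then approximated by snapshotting every recorded object before f and comparing afterwards. This is only one possible approach, and it is an approximation: a write that stores the value already present (or restores the original value) goes unnoticed, and objects allocated or freed during f are not handled.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define MAX_OBJECTS 256

typedef struct { void *address; size_t size; } object_t;

/* Hypothetical registry filled by the allocation wrapper. */
static object_t objects[MAX_OBJECTS];
static size_t object_count;

/* Hypothetical stand-in for an interposed malloc. Uses calloc so that the
   snapshot below never reads uninitialized memory. */
static void *tracked_malloc(size_t size) {
    void *p = calloc(1, size);
    if (p != NULL && size > 0 && object_count < MAX_OBJECTS) {
        objects[object_count].address = p;
        objects[object_count].size = size;
        object_count++;
    }
    return p;
}

/* Runs f and stores every registered object whose bytes changed into out
   (capacity MAX_OBJECTS). Returns the number of changed objects. */
static size_t compute_write_set_objects(void (*f)(void), object_t *out) {
    void *snapshots[MAX_OBJECTS];
    size_t before = object_count; /* objects allocated during f are ignored */
    size_t i, n = 0;

    for (i = 0; i < before; i++) {
        snapshots[i] = malloc(objects[i].size);
        assert(snapshots[i] != NULL);
        memcpy(snapshots[i], objects[i].address, objects[i].size);
    }

    f();

    for (i = 0; i < before; i++) {
        if (memcmp(snapshots[i], objects[i].address, objects[i].size) != 0)
            out[n++] = objects[i];
        free(snapshots[i]);
    }
    return n;
}

/* Demo mirroring the allocations above: only the second object is written. */
static char *obj_x, *obj_y;

static void demo_f(void) {
    obj_y[10] = 42;
}

static size_t demo_objects(object_t *out) {
    obj_x = tracked_malloc(100);
    obj_y = tracked_malloc(200);
    assert(obj_x != NULL && obj_y != NULL);
    return compute_write_set_objects(demo_f, out);
}
```

Here demo_objects should report a single entry, {obj_y, 200}. The overhead is proportional to the total size of all tracked objects rather than to the number of writes, which is a very different tradeoff from fault-based or instrumentation-based tracking.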
The return value is loosely specified on purpose -- different techniques will have different spatial resolution and different overheads.
Please note:
This is a highly uncommon programming question, hence if you think it should be closed let me know why and, ideally, how to phrase/place it so that it follows the guidelines. :-)