I'm trying to use angr to verify a function's behavior by deriving the correct input for a given result. The function modifies a buffer, in this case it simply copies the input to it, so I set up symbolic variables on the heap that I apply an equality constraint to. I then create a symbolic variable for the input to the function. What I intended to happen is angr essentially runs the function till an input is found such that when it's copied to the buffer it satisfies the constraint placed on it. However, when it runs, I only get one deadended branch, where the output has the correct value, but the input does not. I've attached the solver and the test program source below. Is this a failure of my implementation? Or is this approach to solving the problem invalid
the solver
import angr
import claripy
import angr.sim_type as T
if __name__ == '__main__':
path = './test'
target = 'test_fn'
target_buf = 'buffer'
word = b'hello\0'
program = angr.Project(path)
fn = program.loader.find_symbol(target)
buf = program.loader.find_symbol(target_buf)
fn_addr = fn.rebased_addr
buf_addr = buf.rebased_addr
cc = program.factory.cc(func_ty=T.SimTypeFunction(
args=[T.parse_type('char*')],
returnty=T.parse_type('void')
))
str_in = claripy.BVS('input', 8 * len(word))
fn_call_state = program.factory.call_state(fn_addr, angr.calling_conventions.PointerWrapper(str_in), cc=cc)
ptr_out = fn_call_state.heap.allocate(len(word))
chars = []
for i, v in enumerate(word):
ch = claripy.BVS('char_{}'.format(i), 8)
chars.append(ch)
fn_call_state.solver.add(ch == int(v))
word_sym = claripy.Concat(*chars)
fn_call_state.memory.store(buf_addr, ptr_out)
fn_call_state.memory.store(ptr_out, word_sym)
simgr = program.factory.simgr(fn_call_state, save_unsat=True)
simgr.run()
print(simgr)
for s in simgr.deadended:
print(s.solver.eval(str_in, cast_to=bytes))
print(s.solver.eval(word_sym, cast_to=bytes))
the test program
#include <string.h>
#include <stdlib.h>
#include <stdio.h>
char* buffer;
void test_fn(char* input) {
strcpy(buffer, input);
return;
}
int main(int argc, char** argv) {
buffer = (char*) malloc(100);
test_fn("hello");
}
when I run the solver, I get output like the following, though the first value, the input value, changes seemingly at random. IS there a way to apply this constraint so that the side effects of the program take precedence? I don't understand how these two values an be different. Is this an error in the way I am preparing memory?
b'\x01\x01\x01\x04\x01\x02'
b'hello\x00'