I am fuzzing a program with a huge code base. To improve performance, I am using persistent mode, which means wrapping a loop around the function or code that reads from stdin. Right now I can use gdb to enumerate all the functions the program calls, like this:

set logging on
set confirm off
rbreak ^[^@]*$
run
continue

This lists every function the program calls, but rather than reading through hundreds of lines, I think it would be easier to find the function that reads from stdin directly. How would I be able to find the function that reads from stdin?

2 Answers

How would I be able to find the function that reads from stdin?

In general, your question is equivalent to the halting problem. Consider this function:

ssize_t foo(int fd, void *buf, size_t count) { return read(fd, buf, count); }

Does this function read from stdin? It may or may not (depending on inputs), and therein lies the problem.

P.S. Your method of enumerating all the functions that are called is exceedingly inefficient. You should look into building your program with -finstrument-functions instead.
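
As a rough sketch of that approach, assuming GCC or Clang: when a program is compiled with -finstrument-functions, the compiler inserts calls to the two hook functions below at every function entry and exit. The call_depth counter and the output format are illustrative choices, not something from the original answer.

```c
#include <stdio.h>

/* Illustrative nesting counter (an assumption of this sketch). */
static int call_depth = 0;

/* Called on entry to every instrumented function. The attribute keeps the
 * hook itself from being instrumented, which would recurse forever. */
__attribute__((no_instrument_function))
void __cyg_profile_func_enter(void *this_fn, void *call_site)
{
    fprintf(stderr, "%*senter %p (called from %p)\n",
            2 * call_depth, "", this_fn, call_site);
    call_depth++;
}

/* Called on exit from every instrumented function. */
__attribute__((no_instrument_function))
void __cyg_profile_func_exit(void *this_fn, void *call_site)
{
    (void)call_site; /* unused here */
    call_depth--;
    fprintf(stderr, "%*sexit  %p\n", 2 * call_depth, "", this_fn);
}
```

Compile this file together with the program using -finstrument-functions (adding -g keeps symbols), then map the printed addresses back to names with a tool such as addr2line -f -e. For a position-independent executable the runtime addresses are offset by the load base, so building with -no-pie for this experiment may be simpler.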

Employed Russian
  • What would you suggest I do to find the part of the code that reads from stdin? I will try the more efficient method of enumerating functions, but that will still require manual filtering on my part. For another part of the binary it was easy to find, because the binary took a file as input. – lemonadeice Dec 08 '20 at 02:31

Since you're running Linux, virtually every function that reads from a stream (such as stdin) will ultimately do a read system call. (Less often, they will call readv.)

The C prototype for the read function is

ssize_t read(int fd, void *buf, size_t count);

and, like most Linux system calls, this is pretty much the prototype for the actual system call: all the integer and pointer arguments are passed in registers.

On x86_64, the first argument to a system call will be in register rdi. (See Calling conventions.) For read, that argument is the file descriptor, and file descriptor 0 is stdin.
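
To make the correspondence concrete, here is a minimal sketch that issues the read system call directly through glibc's syscall() wrapper. raw_read is a hypothetical helper name. The point is only that the file descriptor is the first system call argument, which is the value a debugger sees in rdi at syscall entry on x86_64.

```c
#define _GNU_SOURCE
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: performs the read system call without going through
 * the read() library wrapper. The arguments are passed in the same order as
 * in the C prototype, so fd is system call argument 1. */
ssize_t raw_read(int fd, void *buf, size_t count)
{
    return (ssize_t)syscall(SYS_read, fd, buf, count);
}
```

Calling raw_read(0, buf, sizeof buf) would therefore enter the kernel with 0 as the first argument, just as fread on stdin eventually does.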

So first we will tell GDB to stop the process upon entering the read system call, adding a condition to stop only when its first argument is 0:

(gdb) catch syscall read
Catchpoint 1 (syscall 'read' [0])
(gdb) condition 1 $rdi == 0
(gdb) run
Starting program: cat

Catchpoint 1 (call to syscall read), 0x00007fffff13b910 in __read_nocancel () at ../sysdeps/unix/syscall-template.S:84
84      ../sysdeps/unix/syscall-template.S: No such file or directory.

Now do a backtrace to see all the functions in the call stack:

(gdb) bt
#0  0x00007fffff13b910 in __read_nocancel () at ../sysdeps/unix/syscall-template.S:84
#1  0x00007fffff0d2a84 in __GI__IO_file_xsgetn (fp=0x7fffff3f98c0 <_IO_2_1_stdin_>, data=<optimized out>, n=4096)
    at fileops.c:1442
#2  0x00007fffff0c7ad9 in __GI__IO_fread (buf=<optimized out>, size=1, count=4096, fp=0x7fffff3f98c0 <_IO_2_1_stdin_>)
    at iofread.c:38
#3  0x00000000080007c2 in copy () at cat.c:6
#4  0x00000000080007de in main () at cat.c:12
(gdb) fr 3
#3  0x00000000080007c2 in copy () at cat.c:6
6               while ((n=fread(buf, 1, sizeof buf, stdin)) > 0)
(gdb) fr 4
#4  0x00000000080007de in main () at cat.c:12
12              copy();
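
For experimenting with the session above, a small function along these lines should produce a similar backtrace. It is a guess at the shape of cat.c, not the actual source: only the fread loop and the copy() call appear in the GDB output, and copy_stream with its FILE * parameters is a hypothetical variant so the loop can be exercised on any stream.

```c
#include <stdio.h>

/* Copy everything from `in` to `out` in 4096-byte chunks.
 * Returns the number of bytes read from `in`. */
size_t copy_stream(FILE *in, FILE *out)
{
    char buf[4096];
    size_t n, total = 0;

    /* fread on a stdio stream ends up in a read system call on the
     * stream's underlying file descriptor. */
    while ((n = fread(buf, 1, sizeof buf, in)) > 0) {
        fwrite(buf, 1, n, out);
        total += n;
    }
    return total;
}
```

Calling copy_stream(stdin, stdout) from main and running the result under GDB with the conditional catchpoint should stop in the read on file descriptor 0, with copy_stream and main further up the stack.
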
Mark Plotnick
  • I am going to try this out, and I will let you know how well this works. Thanks for the detailed response – lemonadeice Dec 08 '20 at 22:04
  • This partially worked. The program I am fuzzing requires a preloaded library for it to operate on stdin, so when I backtrace, it shows me functions from there. Is there a way to view a longer call stack that might show more initiators of this call? I think I found it; it was bt -full – lemonadeice Dec 08 '20 at 22:11
  • I found the function by using your method and my method above. In gdb, I set breakpoints on all my functions plus the conditional catchpoint on the stdin read system call, and I could successfully trace my way up. – lemonadeice Dec 09 '20 at 00:22