5

Is there a simple tool or method to turn strace output into something that can be visualised or is otherwise easier to sift through? I need to figure out where an application is going wrong, but stracing it produces massive amounts of data. Reading every system call makes it very difficult to follow what the application and its threads are doing (or trying to do) on a larger scale.

I have no budget for anything, and we are a pure Linux shop.

tMC
  • Knowing what sort of problem you are having would help. However, in general when I am doing hairy strace work I use grep a lot and if the problem is a core dump or particular system call of a class I can guess, focus there and use that to guide my trail backwards. – Seth Robertson Jun 22 '11 at 19:51
  • @seth The application appears to be trying to connect to something. It fails with a "connection failed" error, but when I sniff the network interfaces, it never tries to connect to anything outside the box. I see two TCP sessions established and closed over the loopback with no data ever transmitted. – tMC Jun 22 '11 at 19:56
  • I would `egrep 'socket|connect|send' /tmp/tr` and try to see what command failed. Depending on the exact text of the error message, I might look for DNS or port lookups failing as well. – Seth Robertson Jun 22 '11 at 20:13

2 Answers

5

If your problem is network-related, you could limit the strace output to the network-related syscalls with:

strace -e trace=network your_program
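
If the filtered trace is still long, one option is to save it to a file and keep only the network calls that failed. The following is a minimal sketch, not a definitive tool: `failed_net_calls` is a hypothetical helper name, `/tmp/tr` is just an example log path, and the list of syscalls in the pattern is an assumption you may need to extend. It relies on the fact that strace reports a failed call as `= -1` followed by the errno name.

```shell
# Print only failed network-related calls from strace output read on stdin.
# strace reports a failed call as "... = -1 ERRNO (description)".
# The syscall list below is an illustrative assumption, not an exhaustive set.
failed_net_calls() {
  grep -E '(socket|connect|bind|sendto|recvfrom|getsockopt)\(.*= -1 E[A-Z]+'
}

# Example usage (your_program and /tmp/tr are placeholders):
#   strace -f -e trace=network -o /tmp/tr your_program
#   failed_net_calls < /tmp/tr
```

A failed `connect` on the loopback interface, for instance, would show up here with its errno (such as `ECONNREFUSED`) even though nothing appears on the external interfaces.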

Cédric Julien
4

Yes. Use the -c option to count time, calls, and errors for each syscall and print a summary table, e.g.:

$ strace -c -fp $(pgrep -n php)
Process 11208 attached
^CProcess 11208 detached
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 83.78    0.112292          57      1953       152 stat
  7.80    0.010454          56       188           lstat
  7.79    0.010439          28       376           access
  0.44    0.000584           0      5342        32 recvfrom
  0.15    0.000203           0      3985           sendto
  0.04    0.000052           0     27184           gettimeofday
  0.00    0.000000           0         6           write
  0.00    0.000000           0      3888           poll
------ ----------- ----------- --------- --------- ----------------
100.00    0.134024                 42922       184 total

The summary often points to the problem (for example, a syscall with many errors or a large share of the time) without having to analyse a large amount of data.
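
The -c table shows how many calls failed but not which errors they returned. As a rough sketch, a saved trace can be grouped by syscall and errno with awk. `errno_summary` is a hypothetical helper name, and the parsing assumes a trace written without timestamp options, where each line starts with the syscall name, optionally preceded by a PID or a `[pid N]` prefix.

```shell
# Count failed calls per syscall and errno from strace output read on stdin.
# Assumes no timestamp options were used, so a line starts with the syscall
# name, optionally preceded by "PID " or "[pid PID] ".
errno_summary() {
  awk '
    / = -1 E[A-Z0-9]+/ {
      line = $0
      # Strip the PID prefixes that strace adds when following children.
      sub(/^\[pid +[0-9]+\] +/, "", line)
      sub(/^[0-9]+ +/, "", line)
      if (line ~ /^<\.\.\. /) {
        # Resumed call, printed as "<... NAME resumed> ...".
        sub(/^<\.\.\. /, "", line)
        sub(/ resumed>.*/, "", line)
        name = line
      } else {
        # Ordinary call: the name is everything before the first "(".
        name = line
        sub(/\(.*/, "", name)
      }
      # The errno name follows the 6-character marker " = -1 ".
      match($0, / = -1 E[A-Z0-9]+/)
      err = substr($0, RSTART + 6, RLENGTH - 6)
      count[name " " err]++
    }
    END { for (k in count) print count[k], k }
  ' | sort -rn
}

# Example usage (/tmp/tr is a placeholder log path):
#   strace -f -o /tmp/tr your_program
#   errno_summary < /tmp/tr
```

Each output line has the form `COUNT SYSCALL ERRNO`, most frequent first, which makes it easy to separate routine failures (such as `stat` returning `ENOENT` during path lookups) from unexpected ones.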

Another way is to filter by specific syscalls (such as recvfrom/sendto) to see the data received and sent. For example, when debugging a PHP process:

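strace -e recvfrom,sendto -fp "$(pgrep -n php)" -s 1000 2>&1 | while IFS= read -r line; do
  # %b interprets backslash escapes (such as \n and \t) in strace's string output;
  # quoting "$line" keeps the line intact instead of splitting it into words
  printf '%b\n' "$line"
done | strings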

Related: How to parse strace in shell into plain text?

kenorb