I would like to log all file accesses a process makes during its lifetime in an efficient manner.

Currently, we do this with LD_PRELOAD: we preload a shared library that intercepts the C library calls that deal with file access. The method is efficient, with little performance overhead, but it is not leak-proof.

For instance, the LD_PRELOAD shared library we have has a hook for dlopen. This hook is used to track accesses to shared libraries, but the mechanism fails to log tertiary dependencies of the shared library.

We did try strace, but its performance overhead was a non-starter for us. I was curious whether there are other mechanisms we can explore to efficiently intercept the file accesses that a process and its sub-processes make. I am open to exploring options at the kernel level, hooks into the VFS layer, or anything else.

Thoughts?

user1595789
    Use kernel-based [solutions](https://hsto.org/getpro/habr/post_images/d81/438/305/d81438305a8f6e4f899c416a9733970e.jpg), like `sysdig` or tracing tools (lttng, systemtap, ftrace, trace-cmd, [bcc+eBPF](http://www.brendangregg.com/Perf/bcc_tracing_tools.png), [bcc pdf](http://www.brendangregg.com/Slides/SCALE2017_perf_analysis_eBPF.pdf)). Many of these solutions support pid filtering, and they are OS-specific (my list is for Linux; there is dtrace for Solaris, and something for BSD?). Some details are on Brendan Gregg's site http://www.brendangregg.com (he is the author of the DTrace book) and in his presentations. – osgx Apr 05 '17 at 23:12

1 Answer

> We did try using strace but the performance overhead of using strace was a non-starter for us.

strace is slow because it uses the ancient and slow ptrace syscall to act as a kind of debugger for the application. Every syscall the application makes is converted into a signal to strace, followed by around two ptrace syscalls from strace (plus some printing and reads of the other process's memory for string/struct values) and then resumption of the target application (2 context switches). strace supports syscall filters, but the filter can't be registered with ptrace, so strace traces all syscalls and does the filtering in user space.

There are faster kernel-based solutions. Brendan Gregg (author of the DTrace book, which covers Solaris, OS X and FreeBSD) has many overviews of tracing tools (on his blog: tracing 15 minutes, BPF superpowers, 60s of linux perf, Choosing Tracer 2015 (with Magic pony), page cache stats), for example:

[Image: Brendan Gregg - Linux kernel Analysis and Tools]

You are interested in the left part of this diagram, near the VFS block: perf (the standard tool), dtrace (supported only on some Linux distributions, and it has license problems because the CDDL is incompatible with the GPL), and stap (systemtap, which works better on Red Hat-family distributions like CentOS).

There is a direct replacement for strace: the sysdig tool (it requires an additional kernel module; the source is on GitHub), which does for system calls what tcpdump does for network interface sniffing. It sniffs syscalls inside the kernel without additional context switches, signals, or poking into another process's memory with ptrace (the kernel already has all strings copied from user space), and it uses smart buffering to dump traces to the userspace tool in huge packets.

There are other universal tracing frameworks/tools like lttng (out of tree) and ftrace / trace-cmd. And bcc with eBPF is a very powerful framework supported by modern (4.9+) Linux kernels (check http://www.brendangregg.com/Slides/SCALE2017_perf_analysis_eBPF.pdf). bcc and eBPF allow you to write small (and safe) code fragments to do some data aggregation in-kernel near the tracepoint:

[Image: Brendan Gregg's list of bcc tools around Linux kernel subsystems]
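
To make the "small code fragment with in-kernel aggregation" idea concrete, here is a minimal sketch using the bcc Python bindings. It assumes bcc is installed, that the kernel exposes the `syscalls:sys_enter_openat` tracepoint, and that the script runs as root; the names `BPF_PROGRAM`, `opens` and `count_opens` are made up for this illustration. It only counts openat() calls per pid, so it is not a replacement for opensnoop, which also reports the paths.

```python
import time

# Small eBPF program (restricted C) that bcc compiles and loads into the kernel.
BPF_PROGRAM = r"""
BPF_HASH(opens, u32, u64);            // pid -> number of openat() calls seen

TRACEPOINT_PROBE(syscalls, sys_enter_openat) {
    u32 pid = bpf_get_current_pid_tgid() >> 32;
    opens.increment(pid);             // aggregate in-kernel, nothing is copied out per event
    return 0;
}
"""


def count_opens(seconds: float) -> dict:
    """Count openat() calls per pid, system-wide, for `seconds` seconds."""
    # Imported here so the module can be loaded on machines without bcc.
    from bcc import BPF

    b = BPF(text=BPF_PROGRAM)         # TRACEPOINT_PROBE functions are attached on load
    time.sleep(seconds)
    # Read the aggregated map once, after the measurement window.
    return {pid.value: count.value for pid, count in b["opens"].items()}
```

Calling `count_opens(5.0)` as root would return a dict mapping each pid to the number of openat() calls it made in that window. User space reads the map only once at the end, which is what keeps the overhead low compared with ptrace-style tracing.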

Try Brendan's tools near VFS if your Linux kernel is recent enough: opensnoop, statsnoop, syncsnoop, and probably some of the file* tools too (the tools support pid filtering with -p PID, or can work system-wide). They are partially described at http://www.brendangregg.com/dtrace.html and published on his GitHub: https://github.com/brendangregg/perf-tools (also https://github.com/iovisor/bcc#tools)

As of Linux 4.9, the Linux kernel finally has similar raw capabilities as DTrace. ...

opensnoop is a program to snoop file opens. The filename and file handle are traced along with some process details.

# opensnoop -g
  UID   PID PATH                                   FD ARGS
  100  3528 /var/ld/ld.config                      -1 cat /etc/passwd
  100  3528 /usr/lib/libc.so.1                      3 cat /etc/passwd
  100  3528 /etc/passwd                             3 cat /etc/passwd   
  100  3529 /var/ld/ld.config                      -1 cal
  100  3529 /usr/lib/libc.so.1                      3 cal
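
If you capture output like the above into a log, a few lines of Python are enough to turn it into a per-process list of files. This is only a sketch: it assumes the five-column layout shown in the sample (UID, PID, PATH, FD, ARGS), that paths contain no whitespace, and that a negative FD means the open failed. `OpenEvent`, `parse_opensnoop` and `files_by_pid` are names invented for this example; other opensnoop versions print different columns.

```python
from collections import defaultdict
from typing import Dict, List, NamedTuple


class OpenEvent(NamedTuple):
    uid: int
    pid: int
    path: str
    fd: int    # negative when the open failed
    args: str  # command line of the process, may contain spaces


def parse_opensnoop(text: str) -> List[OpenEvent]:
    """Parse opensnoop-style output with columns UID PID PATH FD ARGS."""
    events = []
    for line in text.splitlines():
        # Split into at most 5 fields so that ARGS keeps its inner spaces.
        fields = line.split(None, 4)
        # Skip blank lines, the shell prompt line and the header row.
        if len(fields) < 5 or not fields[0].isdigit():
            continue
        uid, pid, path, fd, args = fields
        events.append(OpenEvent(int(uid), int(pid), path, int(fd), args.strip()))
    return events


def files_by_pid(events: List[OpenEvent]) -> Dict[int, List[str]]:
    """Group successfully opened paths by pid, in the order they were seen."""
    result = defaultdict(list)
    for ev in events:
        if ev.fd >= 0:
            result[ev.pid].append(ev.path)
    return dict(result)


SAMPLE = """\
  UID   PID PATH                                   FD ARGS
  100  3528 /var/ld/ld.config                      -1 cat /etc/passwd
  100  3528 /usr/lib/libc.so.1                      3 cat /etc/passwd
  100  3528 /etc/passwd                             3 cat /etc/passwd
  100  3529 /var/ld/ld.config                      -1 cal
  100  3529 /usr/lib/libc.so.1                      3 cal
"""

if __name__ == "__main__":
    for pid, paths in files_by_pid(parse_opensnoop(SAMPLE)).items():
        print(pid, paths)
```

Failed opens are dropped in `files_by_pid` here; if you also want to log attempted accesses, keep the events with a negative FD.
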

rwsnoop snoops read/write events. This is measuring reads and writes at the application level - syscalls.

# rwsnoop
  UID    PID CMD          D   BYTES FILE
    0   2924 sh           R     128 /etc/profile
    0   2924 sh           R     128 /etc/profile
    0   2924 sh           R     128 /etc/profile
    0   2924 sh           R      84 /etc/profile
    0   2925 quota        R     757 /etc/nsswitch.conf
    0   2925 quota        R       0 /etc/nsswitch.conf
    0   2925 quota        R     668 /etc/passwd
osgx
  • perf with a tracepoint/kprobe may work too, and perf has some related-process tracing capabilities: it can trace events only in the started process and in all its children (it has special code to trace mmaps and forks, I guess). For already-running processes no tool will find out all the relations; you can trace one pid or the full system. Many of Gregg's tools are safe/fast enough to be used in production, and he uses them at Netflix. – osgx May 29 '17 at 05:48
  • There are also the `lsof` / `fuser` tools to list the files currently open (they do it just by iterating over the `/proc/$PID/fd` directories). – osgx May 29 '17 at 06:21
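
The `/proc/$PID/fd` approach from the last comment can be sketched in a few lines of Python. This is a Linux-only illustration, not how `lsof` is actually implemented, and `open_files` is a name made up for this example. Reading another user's process usually requires root.

```python
import os
from typing import Dict


def open_files(pid: int) -> Dict[int, str]:
    """Map each open file descriptor of `pid` to the target of its /proc symlink."""
    fd_dir = f"/proc/{pid}/fd"
    result = {}
    for name in os.listdir(fd_dir):
        try:
            # Each entry is a symlink named after the descriptor number.
            result[int(name)] = os.readlink(os.path.join(fd_dir, name))
        except OSError:
            # The descriptor was closed between listdir() and readlink().
            continue
    return result


if __name__ == "__main__":
    for fd, target in sorted(open_files(os.getpid()).items()):
        print(fd, target)
```

Note that this gives only a snapshot: a file that is opened and closed between two polls is never seen, so it cannot replace a tracer for logging every access.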