I am trying to do this: I am sending packet through iperf-an open source tool from one machine to another and I want to trace the write system or send call. Please help me to do that,if someone can guide through Ftrace framework to trace system call that would be great otherwise by any other tracing tool.

- 347,512
- 102
- 1,199
- 985

- 101
- 1
- 10
-
2Possible duplicate of [How do I trace a system call in Linux?](https://stackoverflow.com/questions/29840213/how-do-i-trace-a-system-call-in-linux) Voting to close this way because the other is more generic and has more upvotes. – Ciro Santilli OurBigBook.com Oct 12 '18 at 09:29
4 Answers
The hard part is to know exactly what to trace so you can see only the results that you want, but the tracing itself is very easy:
- First, your kernel must be configured with CONFIG_FTRACE=y
Determine what event you want to trace
cat /sys/kernel/debug/tracing/available_events
Write the event that you have chosen into set_event
echo sys_enter_write > /sys/kernel/debug/tracing/set_event
Determine what kind of trace you want to trace
cat /sys/kernel/debug/tracing/available_tracers
Write the kind of trace that you want in current_tracer file
echo function_graph > /sys/kernel/debug/tracing/current_tracer
Enable tracing:
echo 1 > /sys/kernel/debug/tracing/tracing_on
Now you can run the "iperf -c... " as you wish, and disable tracing when it is done.
echo 0 > /sys/kernel/debug/tracing/tracing_on
To see the results:
vi /sys/kernel/debug/tracing/trace

- 1,097
- 1
- 15
- 25
-
1i am getting the following error echo sys_enter_write > /sys/kernel/debug/tracing/current_tracer -bash: echo: write error: Invalid argument [root@server tracing]# echo sys_enter_write > current_tracer -bash: echo: write error: Invalid argument – user3145350 Feb 21 '14 at 18:07
-
-
-
Please see this video [Video demo](http://www.screenr.com/PzNN) i tried this on ubuntu and redhat, both worked. Maybe you could try to mount the debugfs on another directory using: mkdir /debug
mount -t debugfs debugfs /debug – Edgar Feb 23 '14 at 20:54cd /debug/tracing and execute the echo commands inside like I did on the video. -
I managed to get the Invalid argument. I will update my answer in order to better explain how to trace events. My answer as it is now induced you to get this error. – Edgar Feb 23 '14 at 21:14
-
Doing `echo function_graph > /sys/kernel/debug/tracing/current_tracer` makes it trace all functions, and generates massive output, you likely want `nop` instead here. See: https://stackoverflow.com/questions/21889563/how-to-trace-the-write-system-call-in-the-linux-kernel/52776478#52776478 – Ciro Santilli OurBigBook.com Oct 12 '18 at 09:30
strace
If you are going to just trace one process, why not just use strace
? It can even connect to running processes: How does strace connect to an already running process?
For ftrace, use echo none > /sys/kernel/debug/tracing/current_tracer
to trace just the system calls
As of Linux 4.15 at least, if you use function_graph
instead, then it shows a ton of functions in addition to the system call.
You could work around that with filtering, but it is simpler to just use nop
: How to trace just system call events with ftrace without showing any other functions in the Linux kernel?
Run with sudo
:
#!/bin/sh
set -eux
d=debug/tracing
mkdir -p debug
if ! mountpoint -q debug; then
mount -t debugfs nodev debug
fi
# Stop tracing.
echo 0 > "${d}/tracing_on"
# Clear previous traces.
echo > "${d}/trace"
# Find the tracer name.
cat "${d}/available_tracers"
# Disable tracing functions, show only system call events.
echo nop > "${d}/current_tracer"
# Find the event name with.
grep write "${d}/available_events"
# Enable tracing write.
# Both statements below seem to do the exact same thing,
# just with different interfaces.
# https://www.kernel.org/doc/html/v4.18/trace/events.html
echo sys_enter_write > "${d}/set_event"
# echo 1 > "${d}/events/syscalls/sys_enter_write/enable"
# Start tracing.
echo 1 > "${d}/tracing_on"
# Generate two write calls by two different processes.
rm -rf /tmp/a /tmp/b
printf a > /tmp/a
printf b > /tmp/b
# View the trace.
cat "${d}/trace"
# Stop tracing.
echo 0 > "${d}/tracing_on"
umount debug
and a sample output:
# tracer: nop
#
# _-----=> irqs-off
# / _----=> need-resched
# | / _---=> hardirq/softirq
# || / _--=> preempt-depth
# ||| / delay
# TASK-PID CPU# |||| TIMESTAMP FUNCTION
# | | | |||| | |
a.sh-18135 [004] .... 11152.454767: sys_write(fd: 2, buf: 5558769acc00, count: 2)
a.sh-18135 [004] .... 11152.454777: sys_write(fd: 2, buf: 555877f6a968, count: 2)
a.sh-18135 [004] .... 11152.454785: sys_write(fd: 2, buf: 555877f6a968, count: 4)
a.sh-18135 [004] .... 11152.454793: sys_write(fd: 2, buf: 555877f6a968, count: 7)
a.sh-18135 [004] .... 11152.454801: sys_write(fd: 2, buf: 555877f6a968, count: 7)
a.sh-18135 [004] .... 11152.454807: sys_write(fd: 2, buf: 7ffc5f3f2be7, count: 1)
gnome-terminal--3419 [005] .... 11152.454833: sys_write(fd: 4, buf: 7ffe4f3a61c0, count: 8)
gnome-terminal--3419 [005] .... 11152.454862: sys_write(fd: 4, buf: 7ffe4f3a61c0, count: 8)
gnome-terminal--3419 [005] .... 11152.454887: sys_write(fd: 4, buf: 7ffe4f3a61c0, count: 8)
gnome-terminal--3419 [005] .... 11152.454894: sys_write(fd: 4, buf: 7ffe4f3a61c0, count: 8)
gmain-3193 [002] .... 11152.456141: sys_write(fd: 4, buf: 7fbfe5a93c40, count: 8)
gmain-3193 [002] .... 11152.456168: sys_write(fd: 4, buf: 7fbfe5a92bd0, count: 8)
gmain-3193 [002] .... 11152.456172: sys_write(fd: 4, buf: 7fbfe5a93c40, count: 8)
a.sh-18135 [004] .... 11152.456534: sys_write(fd: 2, buf: 5558769acbd8, count: 2)
a.sh-18135 [004] .... 11152.456547: sys_write(fd: 2, buf: 555877f6a968, count: 6)
a.sh-18135 [004] .... 11152.456555: sys_write(fd: 2, buf: 555877f6a968, count: 2)
a.sh-18135 [004] .... 11152.456561: sys_write(fd: 2, buf: 7ffc5f3f2be7, count: 1)
a.sh-18135 [004] .... 11152.456578: sys_write(fd: 1, buf: 555877f6af00, count: 1)
gnome-terminal--3419 [005] .... 11152.456651: sys_write(fd: 4, buf: 7ffe4f3a61c0, count: 8)
gnome-terminal--3419 [005] .... 11152.456660: sys_write(fd: 4, buf: 7ffe4f3a61c0, count: 8)
a.sh-18135 [004] .... 11152.456674: sys_write(fd: 2, buf: 5558769acbd8, count: 2)
a.sh-18135 [004] .... 11152.456683: sys_write(fd: 2, buf: 555877f6a968, count: 6)
a.sh-18135 [004] .... 11152.456690: sys_write(fd: 2, buf: 555877f6a968, count: 2)
a.sh-18135 [004] .... 11152.456694: sys_write(fd: 2, buf: 7ffc5f3f2be7, count: 1)
a.sh-18135 [004] .... 11152.456703: sys_write(fd: 1, buf: 555877f6af00, count: 1)
a.sh-18135 [004] .... 11152.456738: sys_write(fd: 2, buf: 5558769acb88, count: 2)
a.sh-18135 [004] .... 11152.456745: sys_write(fd: 2, buf: 555877f6a968, count: 3)
a.sh-18135 [004] .... 11152.456750: sys_write(fd: 2, buf: 555877f6a968, count: 14)
a.sh-18135 [004] .... 11152.456756: sys_write(fd: 2, buf: 7ffc5f3f2be7, count: 1)
gnome-terminal--3419 [005] .... 11152.456816: sys_write(fd: 4, buf: 7ffe4f3a61c0, count: 8)
gnome-terminal--3419 [005] .... 11152.456845: sys_write(fd: 4, buf: 7ffe4f3a61c0, count: 8)
As you can see, write
is a very common system call, so it is difficult to isolate our two write calls. Things would be easier to see with a less common call, e.g. mkdir
: How do I trace a system call in Linux?
Tested on Ubuntu 18.04, Linux kernel 4.15.

- 347,512
- 102
- 1,199
- 985
-
Is there a way to change the syscall trace output format to ftrace. eg in case of sys_enter_write one of output is filename: 5af693f224 as corresponding format is filename: 0x%08lx. Is it somehow possible to change format to get this as a string. Would be good if such a change only effects specified system call trace and not others. https://stackoverflow.com/questions/55443204/change-format-of-syscall-event-trace-output-to-ftrace – Madhur Rawat Apr 17 '19 at 09:20
-
sorry @rawatm but I don't know the answer :-) Do try to grep the kernel source a bit as well ;-) – Ciro Santilli OurBigBook.com Apr 17 '19 at 10:18
To get call graph of write syscall, you can:
sudo trace-cmd record -p function_graph -g vfs_write your-prog
your-prog can be simple write program like:
#include <stdio.h>
#include <sys/file.h>
int main()
{
int fd = open("a.txt", O_WRONLY | O_CREAT);
write(fd, "abcdefg", 6);
close(fd);
}
Then trace-cmd report | grep -v trace-cmd | less

- 4,824
- 3
- 28
- 55
Apart from ftrace, you can also use perf like:
sudo perf record --call-graph fp ./fio --filename=aa --size=4G --direct=0 --rw=write --bs=1M --iodepth=512 --numjobs=1 --name=job_name
Then:
sudo perf report --call-graph
and expand vfs_write:
Children Self Command Shared Object Symbol
+ 80.17% 0.00% fio [kernel.kallsyms] [k] entry_SYSCALL_64_after_hwframe ◆
+ 80.15% 0.02% fio [kernel.kallsyms] [k] do_syscall_64 ▒
+ 78.57% 0.07% fio libpthread-2.31.so [.] __libc_pwrite64 ▒
- 78.49% 0.04% fio [kernel.kallsyms] [k] vfs_write ▒
- 78.46% vfs_write ▒
- 78.40% new_sync_write ▒
- 78.37% ext4_file_write_iter ▒
- 78.33% ext4_buffered_write_iter ▒
- 77.73% generic_perform_write ▒
- 50.15% ext4_da_write_begin ▒
- 38.72% grab_cache_page_write_begin ▒
- 38.61% pagecache_get_page ▒
- 23.52% __page_cache_alloc ▒
- 23.16% alloc_pages ▒
- 22.63% __alloc_pages ▒
- 22.00% get_page_from_freelist ▒
- 18.16% prep_new_page ▒
17.84% clear_page_rep ▒
1.80% rmqueue ▒
1.28% node_dirty_ok ▒
- 12.84% add_to_page_cache_lru ▒
- 9.77% __add_to_page_cache_locked ▒
- 1.89% __mem_cgroup_charge ▒
1.01% charge_memcg ▒
0.77% get_mem_cgroup_from_mm ▒
- 1.09% xa_get_order ▒
- 0.89% xas_load ▒
0.53% xas_descend ▒
- 2.69% lru_cache_add ▒
- 2.37% __pagevec_lru_add ▒
2.01% __lock_text_start ▒
- 1.21% xas_load ▒
0.66% xas_descend ▒
- 10.38% ext4_block_write_begin ▒
- 7.01% create_empty_buffers ▒
- 5.98% alloc_page_buffers ▒
- 5.59% alloc_buffer_head ▒
- 4.24% kmem_cache_alloc ▒
- 1.03% ___slab_alloc ▒
- new_slab ▒
- 0.87% allocate_slab ▒
- 0.57% alloc_pages ▒
- 0.55% __alloc_pages ▒
get_page_from_freelist ▒
0.84% get_obj_cgroup_from_current ▒
0.53% memset ▒
- 2.42% ext4_da_get_block_prep ▒
- 1.26% ext4_da_map_blocks.constprop.0 ▒
0.53% ext4_es_lookup_extent ▒
- 16.27% copy_page_from_iter_atomic ▒
copy_user_generic_string ▒
- 8.74% ext4_da_write_end ▒
- 8.23% generic_write_end ▒
- 7.83% block_write_end ▒
- __block_commit_write.isra.0 ▒
- 6.89% mark_buffer_dirty ▒
- 5.66% __set_page_dirty ▒
5.54% __lock_text_start ▒
1.35% __get_user_nocheck_1 ▒
+ 78.48% 0.00% fio [kernel.kallsyms] [k] __x64_sys_pwrite64 ▒
+ 78.48% 0.00% fio [kernel.kallsyms] [k] ksys_pwrite64 ▒
+ 78.40% 0.02% fio [kernel.kallsyms] [k] new_sync_write ▒
+ 78.37% 0.02% fio [kernel.kallsyms] [k] ext4_file_write_iter ▒
+ 78.33% 0.00% fio [kernel.kallsyms] [k] ext4_buffered_write_iter ▒
+ 77.73% 0.27% fio [kernel.kallsyms] [k] generic_perform_write ▒
+ 50.28% 0.45% fio [kernel.kallsyms] [k] ext4_da_write_begin ▒
+ 38.76% 0.07% fio [kernel.kallsyms] [k] grab_cache_page_write_begin ▒
+ 38.74% 0.91% fio [kernel.kallsyms] [k] pagecache_get_page ▒
+ 25.19% 0.66% fio [kernel.kallsyms] [k] __alloc_pages ▒
+ 24.44% 0.64% fio [kernel.kallsyms] [k] get_page_from_freelist

- 4,824
- 3
- 28
- 55