I want to trace the path of a UDP packet through the Linux kernel. To do this, I plan to read some documentation (I have this so far, which is for TCP) and then add printk statements to the relevant kernel functions to confirm the path, recompiling the kernel with those changes.

Is this the right way to go about it? Do you have any suggestions or references?

Phani
    W.R. Stevens, *TCP/IP Illustrated,* Volume II, gives a complete account for the BSD kernel, which I suspect would be just as informative, and no kernel meddling required. – user207421 Jan 18 '13 at 22:21

4 Answers

Specifically answering your question, to understand UDP processing for IPv4 you can use ftrace, as is done here:

At the ingress (receiving side):

 2)               |                                ip_local_deliver_finish() {
 2)   0.069 us    |                                  raw_local_deliver();
 2)               |                                  udp_rcv() {
 2)               |                                    __udp4_lib_rcv() {
 2)   0.087 us    |                                      __udp4_lib_lookup();
 2)               |                                      __skb_checksum_complete_head() {
 2)               |                                        skb_checksum() {
 2)               |                                          __skb_checksum() {
 2)               |                                            csum_partial() {
 2)   0.161 us    |                                              do_csum();
 2)   0.536 us    |                                            }
 2)               |                                            csum_partial() {
 2)   0.167 us    |                                              do_csum();
 2)   0.523 us    |                                            }
 2)               |                                            csum_partial() {
 2)   0.158 us    |                                              do_csum();
 2)   0.513 us    |                                            }
 2)               |                                            csum_partial() {
 2)   0.154 us    |                                              do_csum();
 2)   0.502 us    |                                            }
 2)               |                                            csum_partial() {
 2)   0.165 us    |                                              do_csum();
 2)   0.516 us    |                                            }
 2)               |                                            csum_partial() {
 2)   0.138 us    |                                              do_csum();
 2)   0.506 us    |                                            }
 2)   5.462 us    |                                          }
 2)   5.840 us    |                                        }
 2)   6.204 us    |                                      }

Another part of the trace is shown below:

 2)               |                              ip_rcv() {
 2)               |                                ip_rcv_finish() {
 2)   0.109 us    |                                  udp_v4_early_demux();
 2)               |                                  ip_route_input_noref() {
 2)               |                                    fib_table_lookup() {
 2)   0.079 us    |                                      check_leaf.isra.8();
 2)   0.492 us    |                                    }

For the egress (sending) side of the networking code, some extracted snippets are shown below:

 4)   0.547 us    |  udp_poll();
 4)               |  udp_sendmsg() {
 4)               |    udp_send_skb() {
 4)   0.387 us    |      udp_error [nf_conntrack]();
 4)   0.185 us    |      udp_pkt_to_tuple [nf_conntrack]();
 4)   0.160 us    |      udp_invert_tuple [nf_conntrack]();
 4)   0.151 us    |      udp_get_timeouts [nf_conntrack]();
 4)   0.145 us    |      udp_new [nf_conntrack]();
 4)   0.160 us    |      udp_get_timeouts [nf_conntrack]();
 4)   0.261 us    |      udp_packet [nf_conntrack]();
 4)   0.181 us    |      udp_invert_tuple [nf_conntrack]();
 4)   0.195 us    |      udp_invert_tuple [nf_conntrack]();
 4)   0.170 us    |      udp_invert_tuple [nf_conntrack]();
 4)   0.175 us    |      udp_invert_tuple [nf_conntrack]();
 4)               |      udp_rcv() {
 4) + 15.021 us   |        udp_queue_rcv_skb();
 4) + 18.829 us   |      }
 4) + 82.100 us   |    }
 4) + 92.415 us   |  }
 4)               |  udp_sendmsg() {
 4)               |    udp_send_skb() {
 4)   0.226 us    |      udp_error [nf_conntrack]();
 4)   0.150 us    |      udp_pkt_to_tuple [nf_conntrack]();
 4)   0.146 us    |      udp_get_timeouts [nf_conntrack]();
 4)   1.098 us    |      udp_packet [nf_conntrack]();
 4)               |      udp_rcv() {
 4)   1.314 us    |        udp_queue_rcv_skb();
 4)   3.282 us    |      }
 4) + 20.646 us   |    }

The output above is what ftrace calls a function graph (the function_graph tracer); see:

How to make a linux kernel function available to ftrace function_graph tracer?

My bash script for tracing UDP is as follows (run it as root):

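#!/bin/bash
# Trace udp_* kernel functions with the function_graph tracer (run as root).

# Mount debugfs at /debug; -p avoids an error if the directory already exists.
mkdir -p /debug
mount -t debugfs nodev /debug
# Quote the pattern so the shell does not expand it against files in the current directory.
echo 'udp_*' >/debug/tracing/set_ftrace_filter
echo function_graph >/debug/tracing/current_tracer
echo 1 >/debug/tracing/tracing_on
# Generate UDP traffic from userspace during this window.
sleep 20
echo 0 >/debug/tracing/tracing_on
# $$ expands to this script's PID, so each run gets its own output file.
cat /debug/tracing/trace > /tmp/tracing.out$$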

The output file is written to /tmp/tracing.out followed by the PID of the shell script process (the $$ in the last line). The 20-second sleep leaves time for userspace activity to happen: start lots of UDP traffic during that window. You can also remove the set_ftrace_filter line from the script, because with no filter set the default is to trace every function.
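A captured trace can be very long, so a per-function call count is a handy first pass before reading it line by line. The following is only a sketch and not part of the script above: count_trace_functions is a hypothetical helper name, and it assumes the default function_graph layout shown earlier, where the function name follows the "|" column separator.

```shell
#!/bin/bash
# Count how often each function is entered in function_graph output on stdin.
# count_trace_functions is a hypothetical helper, not an ftrace tool.
count_trace_functions() {
    awk -F'|' 'NF >= 2 {
        entry = $2
        sub(/^[ \t]+/, "", entry)          # drop the call-depth indentation
        # Entries look like "udp_rcv() {" or "udp_error [nf_conntrack]();";
        # closing "}" lines and header lines do not start with an identifier.
        if (match(entry, /^[A-Za-z_][A-Za-z0-9_.]*/) && index(entry, "(") > 0) {
            calls[substr(entry, RSTART, RLENGTH)]++
        }
    }
    END { for (fn in calls) printf "%d %s\n", calls[fn], fn }' |
        sort -k1,1nr -k2,2                 # most frequent first, then by name
}
```

To use it, redirect the saved trace into the function, for example count_trace_functions < /tmp/tracing.out followed by the PID of your run, and pipe the result through head to see the most frequently called functions.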

Peter Teoh
  • These functions are not the entry point into the kernel; there are a lot of steps before the kernel decides to switch to these functions! – TOC Jan 20 '13 at 17:13
  • Of course these are not the entry points; there are so many. But the OP is asking for the UDP starting point. – Peter Teoh Apr 13 '13 at 01:20
  • After some changes, the tracing output listed above now shows all the points of execution inside the kernel. "Entry points" for UDP processing is very vague and depends on your definition, but the list of functions is shown above nevertheless. – Peter Teoh Mar 12 '15 at 14:52

The Linux networking stack is a big piece of the kernel, and you need to spend some time studying it. I think these books may help (they focus on the older 2.4 and 2.6 kernels, but the logic remains the same for the latest 3.x kernels):

Understanding Linux Network Internals

The Linux Networking Architecture - Design and Implementation of Network Protocols in the Linux Kernel

You can also check out these links:

http://e-university.wisdomjobs.com/linux/chapter-189-277/sending-the-data-from-the-socket-through-udp-and-tcp.html

http://www.linuxfoundation.org/collaborate/workgroups/networking/kernel_flow

http://wiki.openwrt.org/doc/networking/praxis

http://www.ibm.com/developerworks/linux/library/l-linux-networking-stack/?ca=dgr-lnxw01lnxNetStack

http://gicl.cs.drexel.edu/people/sevy/network/Linux_network_stack_walkthrough.html

You will also need to browse the kernel source:

http://lxr.linux.no/#linux+v3.7.3/

Start your exploration of the network subsystem with ip_rcv, which is called when a packet is received. Other functions are then called in turn: ip_rcv_finish, ip_local_deliver, and ip_local_deliver_finish. The last of these is responsible for choosing the right transport-layer protocol.
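One way to watch this chain on a running system is to reuse the ftrace interface from the earlier answer, restricted to these four functions. Before writing names into set_ftrace_filter, it is worth checking that each one appears in available_filter_functions, since a function that was inlined or renamed in your kernel will not be listed. A minimal sketch, assuming debugfs is mounted at /sys/kernel/debug; traceable_functions is a hypothetical helper name:

```shell
#!/bin/bash
# Print the subset of the given function names that ftrace lists as traceable.
# $1: path to an available_filter_functions file; remaining args: function names.
# traceable_functions is a hypothetical helper, not a kernel or ftrace tool.
traceable_functions() {
    local list=$1 fn
    shift
    for fn in "$@"; do
        # Entries are one per line, optionally followed by " [module]".
        if awk -v name="$fn" '$1 == name { found = 1; exit } END { exit !found }' "$list"; then
            printf '%s\n' "$fn"
        fi
    done
}

# Example (as root; the path is an assumption about where debugfs is mounted):
#   traceable_functions /sys/kernel/debug/tracing/available_filter_functions \
#       ip_rcv ip_rcv_finish ip_local_deliver ip_local_deliver_finish
```

Any name missing from the output is not traceable on that kernel build, which usually means the compiler inlined it into its caller.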

TOC

If you prefer a more visual approach, try flame graphs. Here is an example of the UDP transmit flow (using netperf to transmit UDP packets): [image: flame graph of the UDP transmit flow]

And here is the same graph zoomed in on udp_send_skb: [image: flame graph zoomed in on udp_send_skb]

You can do the same for any relevant flow in the kernel. You can also search for specific functions or keywords and zoom in and out. The graph also gives you an idea of which functions in the flow are the heaviest.
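The answer does not say which tools produced these graphs. A common way to get one is to sample kernel stacks with perf and feed them to the FlameGraph scripts (stackcollapse-perf.pl and flamegraph.pl), so the following is only a sketch under that assumption. make_flamegraph is a hypothetical helper name, and the directory of the FlameGraph checkout is passed in as an argument:

```shell
#!/bin/bash
# Sample all CPUs for a number of seconds and render the stacks as an SVG.
# $1: directory of a FlameGraph checkout (assumed to contain
#     stackcollapse-perf.pl and flamegraph.pl); $2: seconds; $3: output file.
# make_flamegraph is a hypothetical helper name.
make_flamegraph() {
    local fg_dir=$1 seconds=${2:-20} out=${3:-udp-flame.svg}
    local data=/tmp/perf.data.$$
    if [ ! -x "$fg_dir/stackcollapse-perf.pl" ] || [ ! -x "$fg_dir/flamegraph.pl" ]; then
        echo "FlameGraph scripts not found in $fg_dir" >&2
        return 1
    fi
    # -a: sample all CPUs, -g: record call graphs. Generate UDP traffic
    # (for example with netperf) while this runs.
    perf record -a -g -o "$data" -- sleep "$seconds" || return 1
    perf script -i "$data" |
        "$fg_dir/stackcollapse-perf.pl" |
        "$fg_dir/flamegraph.pl" > "$out"
}
```

Run it as root while the UDP workload is active, then open the resulting SVG in a browser to search and zoom as described above.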

Hope this helps.

Tgilgul

This link is good too. It covers UDP and the underlying IP layer, the NIC, and so on:

https://blog.packagecloud.io/eng/2017/02/06/monitoring-tuning-linux-networking-stack-sending-data/

Jiang
  • Kindly add context to any links so your Answer is self contained, meaning the answer needs to be here in the Answer itself. See ["Provide context for links"](https://stackoverflow.com/help/how-to-answer). It would be preferable if you could answer the Question in your own words here and link only as a reference. – Scratte Mar 27 '21 at 10:04