
Short description of problem/question

In bpf samples, the libbpf calls in user program were successful:

bpf_object__open_file
bpf_object__load

But the following ones failed:

bpf_object__find_program_by_name
bpf_object__find_map_fd_by_name

How to debug it?

Detailed

This may be a very basic question, but I’m puzzled and hope somebody can help. A bpf sample based on libbpf compiles and loads, but its program is not found. I’m playing with the bpf samples that ship with the kernel sources, and initially everything worked fine:

  • Cloned kernel sources and libbpf
  • Built libbpf library and a sample, e.g. cpustat - all as per instructions in readme files.
  • Successfully ran a sample (using sudo not to bother with capabilities). It’s Ubuntu 20.04, kernel 5.15.47.
  • Started writing my own bpf program, based on a sample.

My first program didn’t work: the libbpf call bpf_object__find_program_by_name returned an error. While debugging, I tried the original sample again, and surprisingly it failed in the same place. From then on I debugged the sample. The user part, cpustat_user.c, calls:

bpf_object__open_file – success
bpf_object__find_program_by_name – failed

Some other samples (like ibumad_user.c) do not call bpf_object__find_program_by_name, so once the file is opened, the next calls are:

bpf_object__load – success
bpf_object__find_map_fd_by_name – fail

So, even though the object is loaded, nothing in it is accessible: neither programs nor maps can be found by name.

I stepped through libbpf a little with gdb:

sudo gdb ./cpustat
(gdb) br cpustat_user.c:208
Breakpoint 1 at 0x5ee3: file /home/vtsymbal/kernel-src/samples/bpf/cpustat_user.c, line 208.
(gdb) run
Starting program: /home/vtsymbal/kernel-src/samples/bpf/cpustat

Breakpoint 1, main (argc=<optimized out>, argv=<optimized out>) at /home/vtsymbal/kernel-src/samples/bpf/cpustat_user.c:208
208             prog = bpf_object__find_program_by_name(obj, "bpf_prog1");
(gdb) s
bpf_object__find_program_by_name (obj=0x55555559b2a0, name=0x5555555851b0 "bpf_prog1") at libbpf.c:3963
3963    {
(gdb) s
3966            bpf_object__for_each_program(prog, obj) {
(gdb) s
bpf_object__next_program (obj=obj@entry=0x55555559b2a0, prev=prev@entry=0x0) at libbpf.c:8322
8322    {
(gdb) s
8326                    prog = __bpf_program__iter(prog, obj, true);
(gdb) s
__bpf_program__iter (forward=<optimized out>, obj=<optimized out>, p=<optimized out>) at libbpf.c:8323
8323            struct bpf_program *prog = prev;
(gdb) s
8326                    prog = __bpf_program__iter(prog, obj, true);
(gdb) s
bpf_object__find_program_by_name (obj=0x55555559b2a0, name=0x5555555851b0 "bpf_prog1") at libbpf.c:3966
3966            bpf_object__for_each_program(prog, obj) {
(gdb) n
3972            return errno = ENOENT, NULL;

There is no program with that name (actually, there are no programs in the list at all). The question is, how to debug it further?

I tried changing the LLVM compiler: I installed one with apt install and also built one from sources. It didn't help. I compiled the samples without modifying the Makefile: make M=samples/bpf LLC=~/llvm-project/llvm/build/bin/llc CLANG=~/llvm-project/llvm/build/bin/clang

I also tried both the libbpf from the kernel tree (src/tools/bpf) and the library built from the cloned libbpf sources.

Now there is one weird thing that might shed light on the problem for those who have some ideas. I took a “clean” machine and repeated the whole exercise of installing and building the bpf samples. The samples worked fine. Then I replaced one sample with my own code, recompiled, and got the same error as before. After restoring the original sample code, the samples no longer work. Moving the built sample to another machine doesn’t make it work either. So the cause is not the machine configuration but rather the compilation. Any ideas would be highly appreciated.

vtsymbal
  • Try compiling libbpf with `-O0` and debug symbols so you can step into the functions and examine further what happens? Try from a working example and test incrementally with your change to pin down what change breaks the code? Add your code or changes to your question if you want readers to have a look? Try `strace -e bpf` to see what syscalls return? Try `bpftool prog show` to see if programs were indeed loaded (you need to pause your program to run bpftool, they may unload when your process exits). If regular samples no longer work, is there anything relevant in `dmesg`? – Qeole Mar 10 '23 at 09:49
  • Thank you @Qeole for the hints! Unfortunately, debugging didn't bring anything new, but now it's clearer that a kernel module is not loaded. I'm using a standard sample (cpustat), just not to bother with my own program. 'strace' showed just the 'openat' system call for cpustat_kern.o, but no other useful info. 'bpftool prog show' didn't show any relevant programs. Nothing in 'dmesg'. I did compile libbpf with -O0 and debugged it. It just confirmed that in '__bpf_program__iter' there are no programs, i.e. 'nr_programs = 0'. So the question would be how to find out why nothing is loaded – vtsymbal Mar 13 '23 at 22:01

1 Answer


This is likely because your ELF object file containing the eBPF program (cpustat_kern.o) was not compiled correctly.

Check with llvm-objdump:

$ llvm-objdump -S cpustat_kern.o

If the file was correctly compiled, llvm-objdump should dump all the program instructions. I managed to reproduce your error, and would only get cpustat_kern.o: file format elf64-bpf as output: the missing instructions indicate that something is wrong.
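The check above can be scripted so that it is easy to repeat after each rebuild. This is only a minimal sketch: the function names `has_disassembly` and `check_bpf_object` are made up for illustration, and it assumes that llvm-objdump prints a "Disassembly of section ..." header before each section it disassembles, so an object with no code yields only the "file format" line.

```shell
#!/bin/sh
# Succeeds (exit 0) if the disassembly read from stdin contains at least one
# "Disassembly of section" header, i.e. the object holds some code.
# (Hypothetical helper name, not part of any tool.)
has_disassembly() {
    grep -q '^Disassembly of section '
}

# Hypothetical wrapper: run llvm-objdump on one object file and report.
check_bpf_object() {
    obj=$1
    if ! command -v llvm-objdump >/dev/null 2>&1; then
        echo "llvm-objdump not found" >&2
        return 2
    fi
    if llvm-objdump -S "$obj" 2>/dev/null | has_disassembly; then
        echo "$obj: contains code sections"
    else
        echo "$obj: no code sections, check the build log"
        return 1
    fi
}
```

For example, `check_bpf_object cpustat_kern.o` would report the second message for an object like the one in the question.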

If you delete the object file and recompile the program, you can probably observe an error message somewhere in the logs, even though a (corrupted) cpustat_kern.o is still created.

$ rm cpustat_kern.o
$ make -j
[...]
  CLANG-bpf  /path/to/linux/samples/bpf/cpustat_kern.o
In file included from <built-in>:3:
In file included from /path/to/linux/samples/bpf/asm_goto_workaround.h:10:
In file included from ./include/linux/types.h:6:
./include/uapi/linux/types.h:5:10: fatal error: 'asm/types.h' file not found
#include <asm/types.h>
         ^~~~~~~~~~~~~
1 error generated.
[...]
$ ls cpustat_kern.o
cpustat_kern.o
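Because the object file exists even after a failed compile, its presence says nothing about whether the build succeeded, and with `make -j` the error can scroll by unnoticed. One way to catch it is to save the build output and search it for compiler diagnostics. The sketch below is an assumption-laden illustration: the helper name `build_log_has_errors` and the file name `build.log` are hypothetical, and the pattern only matches clang-style lines such as the "fatal error:" and "1 error generated." lines in the log above.

```shell
#!/bin/sh
# Succeeds (exit 0) if the build log read from stdin contains clang-style
# error diagnostics ("error:" / "fatal error:" or "N error(s) generated.").
# (Hypothetical helper name.)
build_log_has_errors() {
    grep -Eq 'error:|[0-9]+ errors? generated'
}

# Possible usage (build.log is an arbitrary file name):
#   rm cpustat_kern.o
#   make M=samples/bpf 2>&1 | tee build.log
#   build_log_has_errors < build.log && echo "compile errors found"
```

Warnings are deliberately not matched here, so the include-path WARNING printed by the samples Makefile would need a separate check.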

The error above, for example, could occur if you haven't installed the kernel headers on your system (which would be reported at the beginning of the compilation):

$ VMLINUX_BTF=/sys/kernel/btf/vmlinux make -j
make -C ../../ M=/path/to/linux/samples/bpf BPF_SAMPLES_PATH=/path/to/linux/samples/bpf
make[1]: Entering directory '/path/to/linux'
/path/to/linux/samples/bpf/Makefile:243: WARNING: Detected possible issues with include path.
/path/to/linux/samples/bpf/Makefile:244: WARNING: Please install kernel headers locally (make headers_install).
Qeole
  • You are right @Qeole! I get 'cpustat_kern.o: file format elf64-bpf' as llvm-objdump output. There is a 'WARNING: Detected possible issues with include path' and an error during the build, which I thought were not relevant. But I did install the kernel headers (make headers_install), and they go to the '/usr/include' directory. In the bpf samples Makefile I added the path 'TPROGS_CFLAGS += -I$(srctree)/usr/include', but surprisingly it did help to compile without the warnings. – vtsymbal Mar 14 '23 at 15:39
  • Good, did you manage to make `cpustat` work then? – Qeole Mar 15 '23 at 11:02
  • Oh sorry, I meant "it didn't help", but mistyped it as "did" in my comment. I'm trying to figure out what TPROGS_CFLAGS affects during the bpf samples compilation. The warning is triggered by the HDR_PROBE variable in the Makefile, which checks for the availability of linux/types.h, which I did install. – vtsymbal Mar 15 '23 at 14:18
  • I think I have the problem described in this patch: [link](https://lists.linuxfoundation.org/pipermail/virtualization/2020-July/048505.html), but my kernel (5.15.47) does contain it. However, the error still occurs: "./include/linux/compiler.h:255:10: fatal error: 'asm/rwonce.h' file not found" – vtsymbal Mar 16 '23 at 12:07