Short description of problem/question
In the kernel bpf samples, these libbpf calls in the user program succeed:
bpf_object__open_file
bpf_object__load
But the following ones fail:
bpf_object__find_program_by_name
bpf_object__find_map_fd_by_name
How to debug it?
Detailed
Maybe a very newbie question, but I’m puzzled and hope somebody can help. A bpf sample based on libbpf compiles and loads, but its program is not found. I’m playing with the bpf samples that ship with the kernel sources, and initially everything worked fine:
- Cloned kernel sources and libbpf
- Built the libbpf library and a sample, e.g. cpustat, all as per the instructions in the readme files.
- Successfully ran a sample (using sudo so as not to bother with capabilities). The system is Ubuntu 20.04 with kernel 5.15.47.
- Started writing my own bpf program, based on a sample.
My first program didn’t work: the libbpf call bpf_object__find_program_by_name returned an error. While debugging it I tried the original sample again, and surprisingly it failed in the same place. From then on I was debugging the sample itself. The user part, cpustat_user.c, calls:
bpf_object__open_file – success
bpf_object__find_program_by_name – failed
Some other samples (like ibumad_user.c) do not call bpf_object__find_program_by_name, so once the file is opened, the next calls are:
bpf_object__load – success
bpf_object__find_map_fd_by_name – fail
So, even though the object is loaded, nothing in it is accessible: not the program by name, not the maps, etc.
I debugged libbpf a little:
sudo gdb ./cpustat
(gdb) br cpustat_user.c:208
Breakpoint 1 at 0x5ee3: file /home/vtsymbal/kernel-src/samples/bpf/cpustat_user.c, line 208.
(gdb) run
Starting program: /home/vtsymbal/kernel-src/samples/bpf/cpustat
Breakpoint 1, main (argc=<optimized out>, argv=<optimized out>) at /home/vtsymbal/kernel-src/samples/bpf/cpustat_user.c:208
208 prog = bpf_object__find_program_by_name(obj, "bpf_prog1");
(gdb) s
bpf_object__find_program_by_name (obj=0x55555559b2a0, name=0x5555555851b0 "bpf_prog1") at libbpf.c:3963
3963 {
(gdb) s
3966 bpf_object__for_each_program(prog, obj) {
(gdb) s
bpf_object__next_program (obj=obj@entry=0x55555559b2a0, prev=prev@entry=0x0) at libbpf.c:8322
8322 {
(gdb) s
8326 prog = __bpf_program__iter(prog, obj, true);
(gdb) s
__bpf_program__iter (forward=<optimized out>, obj=<optimized out>, p=<optimized out>) at libbpf.c:8323
8323 struct bpf_program *prog = prev;
(gdb) s
8326 prog = __bpf_program__iter(prog, obj, true);
(gdb) s
bpf_object__find_program_by_name (obj=0x55555559b2a0, name=0x5555555851b0 "bpf_prog1") at libbpf.c:3966
3966 bpf_object__for_each_program(prog, obj) {
(gdb) n
3972 return errno = ENOENT, NULL;
There is no program with that name (actually, there are no programs in the list at all). The question is, how to debug it further?
I tried changing the LLVM toolchain, using both the one installed with apt install and one built from source. It didn't help. I compiled the samples without modifying the Makefile:
make M=samples/bpf LLC=~/llvm-project/llvm/build/bin/llc CLANG=~/llvm-project/llvm/build/bin/clang
I also tried both the libbpf from the kernel src/tools/bpf and the library built from the cloned sources.
Now for something weird, which might shed some light on the problem for those who have ideas. I took a “clean” machine and repeated the whole exercise of installing and building the bpf samples. The samples worked fine. Then I replaced one sample with my own code, recompiled, and got the same error as before. After restoring the original sample code, the samples no longer work. Moving a built sample to another machine doesn’t make it work either. So the problem seems to lie not in the machine configuration but in the compilation. Any ideas would be highly appreciated.
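Since the suspicion falls on compilation, one more check I could run is on the compiled *_kern.o file itself. As far as I understand, libbpf takes its programs from non-empty executable code sections of the ELF object, so an object with none of them would explain an empty program list. Below is a minimal sketch of such a check; the helper name count_exec_sections is made up, it only looks at section headers (not at symbols), and it assumes a 64-bit ELF file with the same byte order as the host.

```c
#include <assert.h>
#include <elf.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: count non-empty executable PROGBITS sections in a
 * 64-bit ELF object. Assumes the file's byte order matches the host.
 * Returns -1 if the file cannot be read or is not ELF64. */
int count_exec_sections(const char *path)
{
	Elf64_Ehdr eh;
	Elf64_Shdr sh;
	FILE *f;
	int count = 0;

	assert(path != NULL);
	f = fopen(path, "rb");
	if (!f)
		return -1;

	/* Validate the ELF header before trusting any offsets in it. */
	if (fread(&eh, sizeof(eh), 1, f) != 1 ||
	    memcmp(eh.e_ident, ELFMAG, SELFMAG) != 0 ||
	    eh.e_ident[EI_CLASS] != ELFCLASS64 ||
	    eh.e_shentsize != sizeof(sh)) {
		fclose(f);
		return -1;
	}

	/* Walk the section header table and look for code sections. */
	for (unsigned int i = 0; i < eh.e_shnum; i++) {
		long off = (long)(eh.e_shoff + (Elf64_Off)i * eh.e_shentsize);

		if (fseek(f, off, SEEK_SET) != 0 ||
		    fread(&sh, sizeof(sh), 1, f) != 1) {
			fclose(f);
			return -1;
		}
		if (sh.sh_type == SHT_PROGBITS &&
		    (sh.sh_flags & SHF_EXECINSTR) && sh.sh_size > 0) {
			printf("section %u: %lu bytes of code\n",
			       i, (unsigned long)sh.sh_size);
			count++;
		}
	}

	fclose(f);
	return count;
}
```

A result of 0 for a *_kern.o that used to work would point at the build step producing an object without any BPF code, while a positive count would move the suspicion back to how libbpf parses the file.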