We use OpenOnload with its zero-copy receive feature to receive and parse multicast data at the network level. Our code (shown below) runs on many servers without any problems. Recently we set up a new server with the same operating system (Ubuntu 18.04) and the same Onload version (7.1.2.141), but when we run the same code on it, the UDP receive queue never drains: it is always full, and we cannot receive or parse the multicast data. The network configuration and system log are included below together with our code. Does anyone have an idea what could cause this?
Code:
// Declared in <onload/extensions_zc.h>
int onload_zc_recv(int fd, onload_zc_recv_args *args);

// C-style trampoline: dispatches to the ItchClient identified by user_ptr.
onload_zc_callback_rc zc_recv_callback(onload_zc_recv_args *args, int flags) {
    return clients[((zc_user_info*)(args->user_ptr))->id]->ZCRecvCB(args, flags);
}

onload_zc_callback_rc ItchClient::ZCRecvCB(onload_zc_recv_args *args, int flags) {
    for (uint32_t i = 0; i < args->msg.msghdr.msg_iovlen; ++i) {
        if (args->msg.iov[i].iov_len > 0) {
            // Our application logic is here
        }
    }
    // Stop processing; onload_zc_recv() then returns to the loop below.
    return ONLOAD_ZC_TERMINATE;
}

// Receive loop (excerpt)
onload_zc_recv_args args;
memset(&args.msg, 0, sizeof(args.msg));
args.cb = &zc_recv_callback;
while (!clientStopped.load(std::memory_order_relaxed)) {
    rc = onload_zc_recv(connection.getConnectionSocket(), &args);
}
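The loop above stores the return value in rc but never inspects it. A minimal sketch of how it could be logged is given here. describeZcRecvRc is a hypothetical helper, not part of Onload, and the meanings attached to -ESOCKTNOSUPPORT and -ENOTEMPTY are assumptions based on our reading of the Onload zero-copy documentation, so they should be checked against the extensions_zc.h shipped with the installed version.

```cpp
#include <cassert>
#include <cerrno>
#include <cstring>
#include <string>

// Hypothetical helper (not part of Onload): turns the return value of
// onload_zc_recv() into a log-friendly message.
std::string describeZcRecvRc(int rc) {
    if (rc >= 0) {
        return "ok";
    }
    switch (-rc) {
        case ESOCKTNOSUPPORT:
            // Assumed meaning: the socket is not accelerated by Onload.
            return "socket not accelerated by Onload (assumed meaning)";
        case ENOTEMPTY:
            // Assumed meaning: data is queued on the kernel (OS) receive path.
            return "data waiting on the kernel receive path (assumed meaning)";
        default:
            // Any other negative value is reported as a plain errno string.
            return std::string("error: ") + std::strerror(-rc);
    }
}
```

Inside the loop this could be used as: if (rc < 0) log(describeZcRecvRc(rc)); where log stands for whatever logging function the application already has.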
Network configuration (we are trying to bind to the ens1f0np0 interface):
br-80983172fc5d: flags=4099<UP,BROADCAST,MULTICAST> mtu 1500
inet 172.19.0.1 netmask 255.255.0.0 broadcast 172.19.255.255
ether 02:42:73:86:7c:19 txqueuelen 0 (Ethernet)
RX packets 0 bytes 0 (0.0 B)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 0 bytes 0 (0.0 B)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
br-a85649ccece2: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1500
inet 172.18.0.1 netmask 255.255.0.0 broadcast 172.18.255.255
inet6 fe80::42:e7ff:fed4:6560 prefixlen 64 scopeid 0x20<link>
ether 02:42:e7:d4:65:60 txqueuelen 0 (Ethernet)
RX packets 3492 bytes 894736 (894.7 KB)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 3670 bytes 353542 (353.5 KB)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
docker0: flags=4099<UP,BROADCAST,MULTICAST> mtu 1500
inet 172.17.0.1 netmask 255.255.0.0 broadcast 172.17.255.255
ether 02:42:a7:6c:f9:da txqueuelen 0 (Ethernet)
RX packets 0 bytes 0 (0.0 B)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 0 bytes 0 (0.0 B)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
ens10f0: flags=4099<UP,BROADCAST,MULTICAST> mtu 1500
ether 68:05:ca:f3:9c:a2 txqueuelen 1000 (Ethernet)
RX packets 0 bytes 0 (0.0 B)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 0 bytes 0 (0.0 B)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
device memory 0xb8500000-b85fffff
ens10f1: flags=4099<UP,BROADCAST,MULTICAST> mtu 1500
ether 68:05:ca:f3:9c:a3 txqueuelen 1000 (Ethernet)
RX packets 0 bytes 0 (0.0 B)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 0 bytes 0 (0.0 B)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
device memory 0xb8400000-b84fffff
ens10f2: flags=4099<UP,BROADCAST,MULTICAST> mtu 1500
ether 68:05:ca:f3:9c:a4 txqueuelen 1000 (Ethernet)
RX packets 0 bytes 0 (0.0 B)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 0 bytes 0 (0.0 B)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
device memory 0xb8300000-b83fffff
ens10f3: flags=4099<UP,BROADCAST,MULTICAST> mtu 1500
ether 68:05:ca:f3:9c:a5 txqueuelen 1000 (Ethernet)
RX packets 0 bytes 0 (0.0 B)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 0 bytes 0 (0.0 B)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
device memory 0xb8200000-b82fffff
ens1f0np0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1500
inet 10.46.54.133 netmask 255.255.255.224 broadcast 10.46.54.159
inet6 fe80::20f:53ff:fe9a:ef00 prefixlen 64 scopeid 0x20<link>
ether 00:0f:53:9a:ef:00 txqueuelen 1000 (Ethernet)
RX packets 220301 bytes 50255774 (50.2 MB)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 1792 bytes 236826 (236.8 KB)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
device interrupt 18
ens1f1np1: flags=4099<UP,BROADCAST,MULTICAST> mtu 1500
ether 00:0f:53:9a:ef:01 txqueuelen 1000 (Ethernet)
RX packets 0 bytes 0 (0.0 B)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 0 bytes 0 (0.0 B)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
device interrupt 19
lo: flags=73<UP,LOOPBACK,RUNNING> mtu 65536
inet 127.0.0.1 netmask 255.0.0.0
inet6 ::1 prefixlen 128 scopeid 0x10<host>
loop txqueuelen 1000 (Local Loopback)
RX packets 9835 bytes 1610054 (1.6 MB)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 9835 bytes 1610054 (1.6 MB)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
vethe21f54f: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1500
inet6 fe80::b488:c1ff:fee1:4029 prefixlen 64 scopeid 0x20<link>
ether b6:88:c1:e1:40:29 txqueuelen 0 (Ethernet)
RX packets 3492 bytes 943624 (943.6 KB)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 3685 bytes 354688 (354.6 KB)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
This is our system log (it shows no errors):
Mar 11 13:20:38 a1hft kernel: [ 1987.912719] oo:HftSrvProd[7]: Using Cloud Onload 7.1.2.141 [5,hft-udp-p7]
Mar 11 13:20:38 a1hft kernel: [ 1987.912720] oo:HftSrvProd[7]: Copyright 2019-2021 Xilinx, 2006-2019 Solarflare Communications, 2002-2005 Level 5 Networks
I have also compared all of the configuration against our currently working servers, but could not find any differences. Do you have any idea what may cause this problem?
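For reference, the "receive queue always full" symptom refers to the kernel's view of the socket, which can be read from /proc/net/udp. A minimal sketch of parsing one line of that file follows, assuming the standard Linux column layout (sl, local_address, rem_address, st, tx_queue:rx_queue, ..., drops). UdpSocketStat and parseProcNetUdpLine are hypothetical names used only for illustration.

```cpp
#include <cassert>
#include <sstream>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical record holding the fields we care about.
struct UdpSocketStat {
    unsigned long port = 0;     // local port
    unsigned long rxQueue = 0;  // bytes waiting in the kernel receive queue
    unsigned long drops = 0;    // datagrams dropped for this socket
};

// Parses one data line of /proc/net/udp. Returns false for the header
// line or for anything that does not match the assumed layout.
bool parseProcNetUdpLine(const std::string& line, UdpSocketStat& out) {
    std::istringstream in(line);
    std::vector<std::string> f;
    for (std::string tok; in >> tok;) {
        f.push_back(tok);
    }
    if (f.size() < 13 || f[0] == "sl") {
        return false;
    }
    const std::string::size_type portPos = f[1].find(':');
    const std::string::size_type queuePos = f[4].find(':');
    if (portPos == std::string::npos || queuePos == std::string::npos) {
        return false;
    }
    try {
        // Port and queue sizes are hexadecimal; drops is decimal.
        out.port = std::stoul(f[1].substr(portPos + 1), nullptr, 16);
        out.rxQueue = std::stoul(f[4].substr(queuePos + 1), nullptr, 16);
        out.drops = std::stoul(f[12]);
    } catch (const std::exception&) {
        return false;
    }
    return true;
}
```

Feeding every line of /proc/net/udp through this function and printing the entry whose port matches the multicast port shows whether rxQueue stays high and whether drops keeps growing while the application is running.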