I am trying to transfer packets from one interface to another using raw packets (just for fun). First I focused on receiving packets.
On my machine (Arch Linux, IP 192.168.30.3) I wrote this code:
#include <stdio.h>
#include <sys/socket.h>
#include <net/ethernet.h> /* the L2 protocols */
#include <netinet/ip.h>
#include <netinet/tcp.h>
#include <arpa/inet.h>

int main(void)
{
    int packet_socket = socket(AF_PACKET, SOCK_RAW, htons(ETH_P_IP));
    if (packet_socket < 0) {
        perror("socket"); /* needs root or CAP_NET_RAW */
        return 1;
    }
    /* test reception */
    char packet[4096];
    struct sockaddr rcvaddr;
    struct in_addr addr;
    addr.s_addr = inet_addr("192.168.30.3"); // my ip
    // use nc to send us a packet
    while (1) {
        socklen_t len = sizeof(rcvaddr);
        ssize_t len_packet =
            recvfrom(packet_socket, packet, sizeof(packet), 0, &rcvaddr, &len);
        if (len_packet < 0) {
            perror("recvfrom");
            return 1;
        }
        // skip frames too short to hold the Ethernet and IP headers
        if (len_packet < (ssize_t) (sizeof(struct ethhdr) + sizeof(struct iphdr)))
            continue;
        // check if the packet is for us
        struct iphdr *iph =
            (struct iphdr *) (packet + sizeof(struct ethhdr));
        if (iph->daddr != addr.s_addr)
            continue;
        // check if tcp
        if (iph->protocol != IPPROTO_TCP)
            continue;
        // length taken from the IP header, not from recvfrom()
        printf("Total packet length: %zu\n",
               sizeof(struct ethhdr) + ntohs(iph->tot_len));
    }
}
Then I run it as root and also execute nc -lp 12345 -n > /dev/null.
On another machine (Debian, 192.168.30.4) I run dd if=/dev/urandom | nc 192.168.30.3 12345, which makes my program print the lengths of the received packets.
From the output, I see packets that are larger than the MTU (which is 1500 on both machines). For instance, my program prints "Total packet length: 16962". (This was also observed in "linux raw ethernet socket receive more bytes than MTU".)
I know about IP fragmentation, so I first thought of IP reassembly.
However, man 7 raw says:
"Note that packet sockets don't reassemble IP fragments, unlike raw sockets."
Because I used a packet socket (AF_PACKET), there should be no reassembly, so packets should stay within the MTU, right?
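For reference, this is the kind of check I would use to tell whether a received packet is an IP fragment at all. It is only a sketch: the helper name is_ip_fragment and the two mask constants are my own (they mirror the IPv4 header layout, with the "more fragments" flag at 0x2000 and the 13-bit fragment offset in 0x1fff). It takes the frag_off field in network byte order, as stored in struct iphdr.

```c
#include <stdint.h>
#include <arpa/inet.h>

/* Assumed names, not from any library: masks for the IPv4 flags/offset field. */
#define FRAG_MF      0x2000u /* "more fragments" flag */
#define FRAG_OFFMASK 0x1fffu /* fragment offset, in units of 8 bytes */

/* Hypothetical helper: returns 1 if the header belongs to a fragment
 * (MF set, or a non-zero fragment offset), 0 otherwise.
 * frag_off_net is the raw header field, still in network byte order. */
int is_ip_fragment(uint16_t frag_off_net)
{
    uint16_t frag = ntohs(frag_off_net);
    return (frag & FRAG_MF) != 0 || (frag & FRAG_OFFMASK) != 0;
}
```

In the loop above it would be called as is_ip_fragment(iph->frag_off). If it returns 0 for the oversized packets, they are not fragments awaiting reassembly.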
I also ran sudo ethtool -K ens3 tx off sg off tso off and tested with the values 0, 1, 2 and 3 in /proc/sys/net/ipv4/ip_no_pmtu_disc on both machines.
Do you think 192.168.30.4 is sending more than the MTU? Or does my machine perform some reassembly despite what the manual says?
ethtool -k ens3 gives:
On 192.168.30.4:
seb@SERVER:~$ sudo ethtool -k ens3
Features for ens3:
rx-checksumming: off
tx-checksumming: off
tx-checksum-ipv4: off [fixed]
tx-checksum-ip-generic: off
tx-checksum-ipv6: off [fixed]
tx-checksum-fcoe-crc: off [fixed]
tx-checksum-sctp: off [fixed]
scatter-gather: off
tx-scatter-gather: off
tx-scatter-gather-fraglist: off [fixed]
tcp-segmentation-offload: off
tx-tcp-segmentation: off
tx-tcp-ecn-segmentation: off [fixed]
tx-tcp-mangleid-segmentation: off
tx-tcp6-segmentation: off [fixed]
udp-fragmentation-offload: off
generic-segmentation-offload: off [requested on]
generic-receive-offload: on
large-receive-offload: off [fixed]
rx-vlan-offload: on
tx-vlan-offload: on [fixed]
ntuple-filters: off [fixed]
receive-hashing: off [fixed]
highdma: off [fixed]
rx-vlan-filter: on [fixed]
vlan-challenged: off [fixed]
tx-lockless: off [fixed]
netns-local: off [fixed]
tx-gso-robust: off [fixed]
tx-fcoe-segmentation: off [fixed]
tx-gre-segmentation: off [fixed]
tx-gre-csum-segmentation: off [fixed]
tx-ipxip4-segmentation: off [fixed]
tx-ipxip6-segmentation: off [fixed]
tx-udp_tnl-segmentation: off [fixed]
tx-udp_tnl-csum-segmentation: off [fixed]
tx-gso-partial: off [fixed]
tx-sctp-segmentation: off [fixed]
tx-esp-segmentation: off [fixed]
tx-udp-segmentation: off [fixed]
fcoe-mtu: off [fixed]
tx-nocache-copy: off
loopback: off [fixed]
rx-fcs: off
rx-all: off
tx-vlan-stag-hw-insert: off [fixed]
rx-vlan-stag-hw-parse: off [fixed]
rx-vlan-stag-filter: off [fixed]
l2-fwd-offload: off [fixed]
hw-tc-offload: off [fixed]
esp-hw-offload: off [fixed]
esp-tx-csum-hw-offload: off [fixed]
rx-udp_tunnel-port-offload: off [fixed]
tls-hw-tx-offload: off [fixed]
tls-hw-rx-offload: off [fixed]
rx-gro-hw: off [fixed]
tls-hw-record: off [fixed]
On 192.168.30.3:
[seb@archlinux ~]$ sudo ethtool -k ens3
Features for ens3:
rx-checksumming: off
tx-checksumming: off
tx-checksum-ipv4: off [fixed]
tx-checksum-ip-generic: off
tx-checksum-ipv6: off [fixed]
tx-checksum-fcoe-crc: off [fixed]
tx-checksum-sctp: off [fixed]
scatter-gather: off
tx-scatter-gather: off
tx-scatter-gather-fraglist: off [fixed]
tcp-segmentation-offload: off
tx-tcp-segmentation: off
tx-tcp-ecn-segmentation: off [fixed]
tx-tcp-mangleid-segmentation: off
tx-tcp6-segmentation: off [fixed]
generic-segmentation-offload: off [requested on]
generic-receive-offload: on
large-receive-offload: off [fixed]
rx-vlan-offload: on
tx-vlan-offload: on [fixed]
ntuple-filters: off [fixed]
receive-hashing: off [fixed]
highdma: off [fixed]
rx-vlan-filter: on [fixed]
vlan-challenged: off [fixed]
tx-lockless: off [fixed]
netns-local: off [fixed]
tx-gso-robust: off [fixed]
tx-fcoe-segmentation: off [fixed]
tx-gre-segmentation: off [fixed]
tx-gre-csum-segmentation: off [fixed]
tx-ipxip4-segmentation: off [fixed]
tx-ipxip6-segmentation: off [fixed]
tx-udp_tnl-segmentation: off [fixed]
tx-udp_tnl-csum-segmentation: off [fixed]
tx-gso-partial: off [fixed]
tx-tunnel-remcsum-segmentation: off [fixed]
tx-sctp-segmentation: off [fixed]
tx-esp-segmentation: off [fixed]
tx-udp-segmentation: off [fixed]
tx-gso-list: off [fixed]
fcoe-mtu: off [fixed]
tx-nocache-copy: off
loopback: off [fixed]
rx-fcs: off
rx-all: off
tx-vlan-stag-hw-insert: off [fixed]
rx-vlan-stag-hw-parse: off [fixed]
rx-vlan-stag-filter: off [fixed]
l2-fwd-offload: off [fixed]
hw-tc-offload: off [fixed]
esp-hw-offload: off [fixed]
esp-tx-csum-hw-offload: off [fixed]
rx-udp_tunnel-port-offload: off [fixed]
tls-hw-tx-offload: off [fixed]
tls-hw-rx-offload: off [fixed]
rx-gro-hw: off [fixed]
tls-hw-record: off [fixed]
rx-gro-list: off
macsec-hw-offload: off [fixed]
rx-udp-gro-forwarding: off
hsr-tag-ins-offload: off [fixed]
hsr-tag-rm-offload: off [fixed]
hsr-fwd-offload: off [fixed]
hsr-dup-offload: off [fixed]
Also, note that both machines are QEMU virtual machines run by GNS3 with the following net options: -net none -device e1000,mac=0c:7e:08:49:13:00,netdev=gns3-0 -netdev socket,id=gns3-0,udp=127.0.0.1:20049,localaddr=127.0.0.1:20048