Using socket AF_PACKET / SOCK_RAW but tell kernel to not send RST

Question

My question has roughly been discussed here.

And the tl;dr solution is to do:

iptables -A OUTPUT -p tcp --tcp-flags RST RST -j DROP

And you could modify this to only block the port you're actively listening on.
But as mentioned in the above question, and here, these are not elegant solutions.
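A minimal sketch of the per-port variant of that rule, written in Python so it can be driven from the same script as the raw socket. The helper names `rst_drop_rule` and `apply_rule` and the port 8080 are made up for illustration; the iptables options used (`--sport`, `--tcp-flags`) are the standard tcp match options, and applying the rule needs root.

```python
import subprocess

def rst_drop_rule(port, action="-A"):
    """Build an iptables argv that drops outgoing RSTs sent from one local TCP port.

    action is "-A" to append the rule or "-D" to delete it again.
    (Hypothetical helper, not part of any library.)
    """
    return [
        "iptables", action, "OUTPUT",
        "-p", "tcp",
        "--sport", str(port),          # only RSTs originating from this port
        "--tcp-flags", "RST", "RST",   # examine the RST bit, match when it is set
        "-j", "DROP",
    ]

def apply_rule(argv):
    # Needs root; raises CalledProcessError if iptables rejects the rule.
    subprocess.run(argv, check=True)
```

Calling `apply_rule(rst_drop_rule(8080))` before opening the raw socket and `apply_rule(rst_drop_rule(8080, "-D"))` on exit would keep the rule scoped to the lifetime of the program.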

Now, I don't really care about the elegance of things.
But I do care about learning. So I've dug deep into the Linux source code (I'm mostly interested in Linux-based machines for now) and sorted through what I think is the socket.bind method, looking for anything that instructs the kernel that "we" are actively monitoring a TCP port.

I assumed that the socket library somehow informs the kernel that a specific port is bound to an application, so that the kernel doesn't automatically respond with a RST packet to the connecting client, prompting a "Connection refused".

However, I find no such code in the source code.
Nor does the packet man page tell me anything about how to inform the kernel to ignore/accept packets coming in on a specific port.

I've got a pretty basic socket set up to listen in promiscuous mode (that is a tale of its own).

However, my problem is that as soon as a client connects, on any given port - the kernel sends the incoming Ethernet+IP+TCP data frame my way as expected - but it also instantly sends out a response with "reversed" source and destination ports and the RST flag set. Which it should, just not on the specific port I want to handle myself. Problem is, how do I tell the kernel I'm monitoring a specific port?
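To make the symptom concrete, here is a rough sketch of how the two frames could be told apart on the raw socket: it parses an Ethernet+IPv4+TCP frame and checks whether a second frame is the RST answer to the first (addresses and ports swapped, RST bit set). The names `TcpInfo`, `parse_tcp_frame` and `is_rst_reply` are invented for this example, and it assumes untagged Ethernet II frames (no VLAN header).

```python
import socket
import struct
from collections import namedtuple

TCP_SYN = 0x02
TCP_RST = 0x04

TcpInfo = namedtuple("TcpInfo", "src_ip src_port dst_ip dst_port flags")

def parse_tcp_frame(frame):
    """Return a TcpInfo for an Ethernet+IPv4+TCP frame, or None for anything else."""
    if len(frame) < 34:                                   # Ethernet (14) + minimal IPv4 (20)
        return None
    if struct.unpack("!H", frame[12:14])[0] != 0x0800:    # EtherType must be IPv4
        return None
    version, ihl = frame[14] >> 4, (frame[14] & 0x0F) * 4
    if version != 4 or ihl < 20 or frame[23] != 6:        # IP protocol 6 is TCP
        return None
    tcp = 14 + ihl                                        # offset of the TCP header
    if len(frame) < tcp + 20:
        return None
    src_port, dst_port = struct.unpack("!HH", frame[tcp:tcp + 4])
    return TcpInfo(
        socket.inet_ntoa(frame[26:30]), src_port,
        socket.inet_ntoa(frame[30:34]), dst_port,
        frame[tcp + 13],                                  # TCP flags byte
    )

def is_rst_reply(request, reply):
    """True if reply is a RST going back along the reversed address/port pair."""
    return bool(
        reply.flags & TCP_RST
        and (reply.src_ip, reply.src_port) == (request.dst_ip, request.dst_port)
        and (reply.dst_ip, reply.dst_port) == (request.src_ip, request.src_port)
    )
```

This only observes the kernel's RST; it does nothing to prevent it from being sent.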

One option would be to (as discussed in some other forums and on various other SO threads) create a dummy socket on that port.

import socket
PORT = 8080  # placeholder for <port>: the TCP port to reserve
s = socket.socket()
s.bind(('', PORT))

However, that causes a bunch of other problems (one of which is that this socket has a buffer that will quickly fill up) and most importantly, still doesn't teach me how all this magic happens. The two solutions above are a last resort if there's no other way, but I feel I'm closer than ever, yet more stuck than ever, in finding a proper solution to this problem.

The solution or tips could be in C as well, and/or a kernel module just for the sake of giving the kernel the needed information. I knew all this was going down in the kernel, and after the comments below and a much appreciated solution idea, I now get that there's no userspace function for this very thing. I could probably port it to a CPython module, or wrap a kernel module/extension in Python code easily enough. But I really have no idea where the kernel functions for setting these things are, or what they are called.

I've dug deep, far and shallow. But it doesn't appear anyone else has had the need to do this, mostly because a promiscuous socket is meant to pick up traffic and analyze it. But .bind((interface, protocol)) is also out there and it faces the same issue. It's a way of not going into promiscuous mode but instead just receiving IPv4 packets (and thus TCP), for instance by doing .bind((interface, 0x0800)).
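For reference, a small sketch of the .bind((interface, protocol)) variant mentioned above, assuming Linux and root (or CAP_NET_RAW). The function name `open_packet_socket` and the interface name in the usage note are placeholders. As far as I can tell, CPython converts the protocol number in bind() to network byte order itself, while the socket() constructor takes it as-is, hence the htons() on only one of them.

```python
import socket

ETH_P_IP = 0x0800    # EtherType for IPv4
ETH_P_ALL = 0x0003   # every EtherType

def open_packet_socket(interface, ethertype=ETH_P_IP):
    """Open an AF_PACKET/SOCK_RAW socket that only receives one EtherType on one interface.

    Hypothetical helper; Linux only, needs root or CAP_NET_RAW.
    """
    s = socket.socket(socket.AF_PACKET, socket.SOCK_RAW, socket.htons(ethertype))
    s.bind((interface, ethertype))   # protocol here is given in host byte order
    return s
```

Something like `s = open_packet_socket("eth0")` followed by `frame = s.recv(65535)` would then return whole Ethernet frames carrying IPv4. The kernel's own TCP stack still sees the same packets, so the RST behaviour is unchanged.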

I might be out on a limb here, but maybe man 7 netdevice just gave me an idea. I'm trying to set up SystemTap to check what calls ioctl() makes and how a socket() object asks for a file descriptor. Might be a clue as to how this all goes down. Tricky getting SystemTap to work though.

Does anyone have any other clues on how to go about this problem, or has anyone bumped into this before?

PS. Sorry for a fuzzy question, I have no idea what the correct terminology is for these lower-level things, as they are quite new to me.

Edit: I might have been looking in the wrong place for bind(). According to the ipv4.af_inet implementation it will try to call the socket's bind() function, but if there is none, it will try to set up a lot of magic in here. And here they clank down on af_inet and do a lot of table junk.. I still haven't found a solution, but a step on the way perhaps... Or worst case, another wild goose chase.

Going further down the rabbit hole, selinux/hooks.c contains some bind functionality as well. Maybe more security related, but still worth investigating. Still nowhere near solving this darn riddle.

I don't think there's a "proper" solution to your problem. The kernel will process incoming packets for protocols that it implements (TCP/UDP, ICMP, SCTP and so on), and there is no knob/ioctl/socket option/API call to turn that processing off - besides installing an iptables filter. - nos, Feb 20 '18 at 21:10
@nos My point is that the already implemented protocol handling of, let's say, TCP or UDP has a way to tell the kernel not to muck about with a certain port. Somehow the kernel keeps track of the fact that a certain application (in this case it's more of a registered file descriptor somewhere) has full control of any data flow on a port. I want to tell the kernel the same thing, "don't mess about with this port, I have it all figured out" - or something along these lines :) - Torxed, Feb 20 '18 at 21:18
No, not really. The protocol handling (of e.g. TCP or UDP) is done in the kernel, not by the application or any socket library. So there are no entries anywhere that say that an application has full control of a port. There's an entry inside the kernel about the state of e.g. a TCP port. If any application is bound to a particular port, TCP packets destined for that port are handled entirely within the kernel (and any TCP payload data, after the kernel has processed the TCP protocol, is sent up to the application bound to that port/socket). - nos, Feb 20 '18 at 21:34
@nos I'm not a total stranger to the idea of writing a kernel module that I can call on just to do my bidding. With that in mind, is this at all possible? Any idea where to begin looking for how to set this state? The only reason I'm doing this in Python is to do a mock-up of the final product; a kernel module that communicates with userspace would work for now. The end product will most likely have to be a kernel module and C code anyway. - Torxed, Feb 20 '18 at 21:48
I don't think there's really any place a module could hook into to influence this, at least not something that iptables could not already do. The place where TCP packets end up, and are either processed further by an existing socket or answered with a RST because no one is listening, is here though: https://elixir.bootlin.com/linux/v4.13/source/net/ipv4/tcp_ipv4.c#L1645 - nos, Feb 20 '18 at 22:43
@nos That link gave me the exact starting point of where packets come in! I can absolutely work with this too! - Torxed, Feb 21 '18 at 07:37
The only two ways to do what you want (prevent the kernel from handling packets) are: 1) drive the interface yourself in userspace, or 2) prevent the kernel from sending packets from your socket (iptables/netfilter/etc). - Michael Foukarakis, Feb 21 '18 at 11:51
@Janith Sadly not, I ended up writing my own implementation of `socket.c` to deal with the DHCP, TCP and IP stack. I never found a way to make the kernel completely **not** respond to unknown traffic. You could potentially use `iptables` and `ebtables` to block the kernel from responding, but such rules would *probably* get complex quickly. - Torxed, Aug 20 '20 at 06:37

score 2 · Answer 1 · answered Feb 20 '18 at 21:06

The problem here is that you're asking for a way to tell the kernel in effect "do these 100 things for me, but leave this one particular detail out." Frankly, I think the iptables solution is the easiest and cleanest.

Another option, though, is to not ask the kernel to do all those other bits, and instead to take on more work yourself. Specifically, make up your own IP address, and start using it. The only downside is that you have to take over another important thing that the kernel has been doing for you: responding to ARPs (ARP is used to discover the MAC [Ethernet] address of the station that owns a given IP address). In brief, I'm suggesting you:

1. Choose an unused IP address on your local subnet.
2. Make up a MAC address for your use. (Not strictly necessary but will make it easier to distinguish "your" traffic.)
3. Open a raw packet socket instead of a raw IP socket (https://linux.die.net/man/7/packet).
4. Compose and send an ARP request to discover the MAC address of the station you're sending to (if on local LAN, else the MAC of the next hop [router] IP address).
5. Receive the ARP reply and record the other station's MAC.
6. Construct and send your SYN packet from your own MAC address to the MAC of the destination station. (With your chosen source and dest IPs, ports, etc.)
7. Listen for a return ARP for your IP and reply as needed.
8. Receive the SYN+ACK response. Since the destination IP address (the one you made up) is not known to the kernel to belong to your system, the kernel will not respond to the SYN+ACK with RST (or anything else).
9. Do whatever it is you want to do next...
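As a rough illustration of the ARP step, the following sketch composes a broadcast ARP request for IPv4 over Ethernet. The function name `build_arp_request`, the MAC 02:00:00:00:00:01 and the addresses in the usage note are made-up examples; a first octet of 02 marks the MAC as locally administered, which is a reasonable choice for an invented address.

```python
import socket
import struct

def build_arp_request(src_mac, src_ip, target_ip):
    """Return a 42-byte Ethernet frame asking "who has target_ip? tell src_ip".

    src_mac is 6 raw bytes; the IPs are dotted-quad strings.
    (Hypothetical helper for illustration.)
    """
    broadcast = b"\xff" * 6
    eth = broadcast + src_mac + struct.pack("!H", 0x0806)   # EtherType: ARP
    arp = struct.pack(
        "!HHBBH",
        1,        # hardware type: Ethernet
        0x0800,   # protocol type: IPv4
        6,        # hardware address length
        4,        # protocol address length
        1,        # opcode: request
    )
    arp += src_mac + socket.inet_aton(src_ip)               # sender MAC and IP
    arp += b"\x00" * 6 + socket.inet_aton(target_ip)        # target MAC unknown, target IP
    return eth + arp
```

For example, `build_arp_request(bytes.fromhex("020000000001"), "192.168.1.200", "192.168.1.1")` gives a frame that could be passed to send() on a packet socket bound to the interface. The reply carries the other station's MAC in its sender hardware address field.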

You will of course have to be capturing promiscuously if you use a MAC address other than the one assigned to the interface. That is pretty typical with a raw packet socket. Also, you will be constructing the Ethernet, IP, and TCP headers for all traffic (well, Ethernet + ARP for the ARP requests) so you will learn a lot.
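A minimal sketch of what constructing those headers by hand could look like for the SYN: an Ethernet header, a 20-byte IPv4 header and a 20-byte TCP header without options, with both checksums filled in. `checksum` and `build_syn` are illustrative names, and the window size, TTL and zero IP identification are arbitrary choices, not requirements.

```python
import socket
import struct

def checksum(data):
    """16-bit one's-complement Internet checksum, as used by IPv4 and TCP."""
    if len(data) % 2:
        data += b"\x00"
    total = sum(struct.unpack("!%dH" % (len(data) // 2), data))
    while total >> 16:                        # fold the carries back in
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF

def build_syn(src_mac, dst_mac, src_ip, dst_ip, src_port, dst_port, seq=0):
    """Return an Ethernet+IPv4+TCP frame carrying a bare SYN (hypothetical helper)."""
    src, dst = socket.inet_aton(src_ip), socket.inet_aton(dst_ip)
    # TCP header: data offset 5 words, flags 0x02 (SYN), checksum 0 for now.
    tcp = struct.pack("!HHIIBBHHH", src_port, dst_port, seq, 0, 5 << 4, 0x02, 64240, 0, 0)
    # The TCP checksum also covers a pseudo-header built from the IP addresses.
    pseudo = src + dst + struct.pack("!BBH", 0, socket.IPPROTO_TCP, len(tcp))
    tcp = tcp[:16] + struct.pack("!H", checksum(pseudo + tcp)) + tcp[18:]
    # IPv4 header: version 4, IHL 5, don't-fragment set, TTL 64, checksum 0 for now.
    ip = struct.pack("!BBHHHBBH4s4s", 0x45, 0, 20 + len(tcp), 0, 0x4000, 64,
                     socket.IPPROTO_TCP, 0, src, dst)
    ip = ip[:10] + struct.pack("!H", checksum(ip)) + ip[12:]
    eth = dst_mac + src_mac + struct.pack("!H", 0x0800)   # EtherType: IPv4
    return eth + ip + tcp
```

A quick sanity check is that recomputing `checksum()` over the finished IP header, or over pseudo-header plus TCP header, yields 0.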

I hear you, but the logical side of my brain goes "Don't traditional sockets (TCP, UDP etc.) have the functionality to tell the kernel to alter the behavior of one port?" I get that what I'm asking in general is inefficient and you are probably right that there is no one magical function to do these things. But where does the kernel keep track of whether there's a registered application for a specific port, and how is this executed? I'd be happy to modify and build a custom kernel to meet this need. Side note: My "switch.py" has already implemented ARP, so this would mean no extra effort. - Torxed, Feb 20 '18 at 21:16
Used up way too many words there. I like your idea of making up my own IP address, it's not at all half bad actually! I can't stress enough how clever this workaround is. Albeit not what I was looking for at all haha. But it does solve the problem in a rather unique way. I already construct Ethernet, IP and TCP headers, alongside ARP/ICMP/OSPF support, to name a few. I will all of a sudden need to support DHCP data as well, but that's manageable haha. I'll try my original train of thought first and see if I can solve it that way, if nothing else - this gets the solve mark! - Torxed, Feb 20 '18 at 21:17
