TCP based hole punching

Question

I've been attempting TCP hole punching for a while now, and forums don't seem to be helping much when it comes to a TCP-based approach in C. These were my main references:

a. http://www.brynosaurus.com/pub/net/p2pnat/
b. https://wuyongzheng.wordpress.com/2013/01/31/experiment-on-tcp-hole-punching/

My setup is
Client A -- NAT-A -- Internet -- NAT-B -- Client B.

Assuming that client A knows B's public and private endpoints, and B knows A's (I have written a server 'S' that exchanges endpoint information between peers), and given that neither NAT is symmetric, will it suffice for TCP hole punching if both clients repeatedly attempt to connect() to each other's public endpoint in the above setup?

If not, what exactly has to be done to achieve tcp hole punching?

I have two threads on each client: one that repeatedly calls connect() to the other client, and one that listens for an incoming connection from the other client. I have made sure that the sockets in both threads are bound to the local port that was given to the peer. Also, I see that both NATs preserve the port mapping, i.e. the local and public ports are the same. Yet my program isn't working.

Does the rendezvous server 'S' that I mentioned above have a role to play in punching a hole or creating a NAT mapping that will allow SYN requests to pass through to the peers? If yes, what has to be done?

Relevant sections of the code are attached.
connect_with_peer() is the entry point. Once server 'S' provides the peer's public ip:port tuple, it is passed to this function along with the local port to bind to. The function spawns a thread (accept_handler()) which also binds to the local port and listens for an incoming connection from the peer. connect_with_peer() returns a socket if either connect() [main thread] or accept() [child thread] succeeds.

Thanks,
Dinkar

volatile int quit_connecting=0;

void *accept_handler(void *arg)
{
    int i,psock,cnt=0;
    int port = *((int *)arg);
    ssize_t len;
    int asock,opt,fdmax;
    char str[BUF_SIZE];
    struct sockaddr_in peer,local;
    socklen_t peer_len = sizeof(peer);
    fd_set master,read_fds;    // master file descriptor list
    struct timeval tv = {10, 0}; // 10 sec timeout
    int *ret_sock = NULL;
    struct linger lin;
    lin.l_onoff=1;
    lin.l_linger=0;

    opt=1;
    //Create socket
    asock = socket(AF_INET , SOCK_STREAM, IPPROTO_TCP);

    if (asock == -1)
    {
        fprintf(stderr,"Could not create socket");
        goto quit_ah;
    }
    else if (setsockopt(asock, SOL_SOCKET, SO_LINGER, &lin,
                        (socklen_t) sizeof lin) < 0)
    {
        fprintf(stderr,"\nTCP set linger socket options failure");
        goto quit_ah;
    }
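    else if (setsockopt(asock, SOL_SOCKET, SO_REUSEADDR, &opt,
                        (socklen_t) sizeof opt) < 0 ||
             setsockopt(asock, SOL_SOCKET, SO_REUSEPORT, &opt,
                        (socklen_t) sizeof opt) < 0) // option names are not bit flags: one call each
    {
        fprintf(stderr,"\nTCP set asock options failure");
        goto quit_ah;
    }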


    local.sin_family = AF_INET;         /* host byte order */
    local.sin_port = htons(port);     /* short, network byte order */
    local.sin_addr.s_addr = INADDR_ANY; /* auto-fill with my IP */
    bzero(&(local.sin_zero), 8);        /* zero the rest of the struct */

    fprintf(stderr,"\naccept_handler: binding to port %d",port);

    if (bind(asock, (struct sockaddr *)&local, sizeof(struct sockaddr)) == -1) {
        perror("accept_handler bind error :");
        goto quit_ah;
    }

    if (listen(asock, 1) == -1) {
        perror(" accept_handler listen");
        goto quit_ah;
    }

    memset(&peer, 0, sizeof(peer));
    peer.sin_addr.s_addr = inet_addr(peer_global_address);
    peer.sin_family = AF_INET;
    peer.sin_port = htons( peer_global_port );

    FD_ZERO(&master);    // clear the master and temp sets
    FD_SET(asock, &master);
    fdmax = asock; // so far, it's this one

    // Try accept
    fprintf(stderr,"\n listen done; accepting next ... ");

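    while(quit_connecting == 0){
        read_fds = master; // copy it
        tv.tv_sec = 10;    // select() may modify the timeout (it does on Linux), so reset it each pass
        tv.tv_usec = 0;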
        if (select(fdmax+1, &read_fds, NULL, NULL, &tv) == -1) {
            perror("accept_handler select");
            break;
        }
        // run through the existing connections looking for data to read
        for(i = 0; i <= fdmax; i++) {
            if (FD_ISSET(i, &read_fds)) { // we got one!!
                if (i == asock) {
                    // handle new connections
                    psock = accept(asock, (struct sockaddr *)&peer, (socklen_t*)&peer_len);

                    if (psock == -1) {
                        perror("accept_handler accept");
                    } else {
                        fprintf(stderr,"\n Punch accept in thread succeeded soc=%d....",psock);
                        quit_connecting = 1;

                        ret_sock = malloc(sizeof(int));
                        if(ret_sock){
                            *ret_sock = psock;
                        }

                    }
                }
            }
        } // end for
    }


quit_ah:

    if(asock>=0) {
        shutdown(asock,2);
        close(asock);
    }
    pthread_exit((void *)ret_sock);

    return (NULL);
}



int connect_with_peer(char *ip, int port, int lport)
{
    int retval=-1, csock=-1;
    int *psock=NULL;
    int attempts=0, cnt=0;
    int rc=0, opt;
    ssize_t len=0;
    struct sockaddr_in peer, apeer;
    struct sockaddr_storage from;
    socklen_t peer_len = sizeof(peer);
    socklen_t fromLen = sizeof(from);
    char str[64];
    int connected = 0;
    pthread_t accept_thread;
    long arg;
    struct timeval tv;
    fd_set myset;
    int so_error;

    struct linger lin;
    lin.l_onoff=1;
    lin.l_linger=0;

    opt=1;

    //Create socket
    csock = socket(AF_INET , SOCK_STREAM, IPPROTO_TCP);

    if (csock == -1)
    {
        fprintf(stderr,"Could not create socket");
        return -1;
    }
    else if (setsockopt(csock, SOL_SOCKET, SO_LINGER, &lin,
                        (socklen_t) sizeof lin) < 0)
    {
        fprintf(stderr,"\nTCP set linger socket options failure");
    }

#if 1
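    else if (setsockopt(csock, SOL_SOCKET, SO_REUSEADDR, &opt,
                        (socklen_t) sizeof opt) < 0 ||
             setsockopt(csock, SOL_SOCKET, SO_REUSEPORT, &opt,
                        (socklen_t) sizeof opt) < 0) // option names are not bit flags: one call each
    {
        fprintf(stderr,"\nTCP set csock options failure");
    }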
#endif

    quit_connecting = 0;

///////////

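    // pthread_create() returns an error number on failure (never -1) and does not set errno
    rc = pthread_create( &accept_thread , NULL ,  accept_handler , &lport);
    if (rc != 0)
    {
        fprintf(stderr,"could not create thread: %s\n", strerror(rc));
        close(csock); // don't leak the connect socket
        return -1;    // -1 is the failure value; 1 would look like a valid socket
    }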
    sleep(2); // wait for listen/accept to begin in accept_thread.

///////////
    peer.sin_family = AF_INET;         /* host byte order */
    peer.sin_port = htons(lport);     /* short, network byte order */
    peer.sin_addr.s_addr = INADDR_ANY; /* auto-fill with my IP */
    bzero(&(peer.sin_zero), 8);        /* zero the rest of the struct */

    fprintf(stderr,"\n connect_with_peer: binding to port %d",lport);

    if (bind(csock, (struct sockaddr *)&peer, sizeof(struct sockaddr)) == -1) {
        perror("connect_with_peer bind error :");
        goto quit_connect_with_peer;
    }

    // Set non-blocking 
    arg = fcntl(csock, F_GETFL, NULL); 
    arg |= O_NONBLOCK; 
    fcntl(csock, F_SETFL, arg); 

    memset(&peer, 0, sizeof(peer));
    peer.sin_addr.s_addr = inet_addr(ip);
    peer.sin_family = AF_INET;
    peer.sin_port = htons( port );

    //Connect to remote server
    fprintf(stderr,"\n Attempting to connect/punch to %s; attempt=%d",ip,attempts);
    rc = connect(csock , (struct sockaddr *)&peer , peer_len);

    if(rc == 0){ //succeeded
        fprintf(stderr,"\n Punch Connect succeeded first time....");
    } else { 
        if (errno == EINPROGRESS) { 


            while((attempts<5) && (quit_connecting==0)){
                tv.tv_sec = 10;
                tv.tv_usec = 0;
                FD_ZERO(&myset);
                FD_SET(csock, &myset);
                if (select(csock+1, NULL, &myset, NULL, &tv) > 0) { 

                    socklen_t so_len = sizeof(so_error); // must be a socklen_t, not a cast ssize_t
                    getsockopt(csock, SOL_SOCKET, SO_ERROR, &so_error, &so_len);

                    if (so_error == 0) {
                        fprintf(stderr,"\n Punch Connect succeeded ....");
                        // Set it back to blocking mode
                        arg = fcntl(csock, F_GETFL, NULL); 
                        arg &= ~(O_NONBLOCK); 
                        fcntl(csock, F_SETFL, arg);

                        quit_connecting=1;
                        retval = csock;
                    } else { // error
                        fprintf(stderr,"\n Punch select error: %s\n", strerror(so_error));
                        goto quit_connect_with_peer;
                    }

                } else { 
                    // so_error has not been set on this path, so don't print it
                    fprintf(stderr,"\n Punch select timeout or failure; attempt=%d\n", attempts);
                } 
                attempts++;
            }// end while

        } else { //errorno is not EINPROGRESS
            fprintf(stderr, "\n Punch connect error: %s\n", strerror(errno)); 
        } 
    } 

quit_connect_with_peer:

    quit_connecting=1;
    fprintf(stderr,"\n Waiting for accept_thread to close..");
    pthread_join(accept_thread,(void **)&psock);

    if(retval == -1 ) {
        if(psock && ((*psock) != -1)){
            retval = (*psock); // Success from accept socket
        }
    }

    fprintf(stderr,"\n After accept_thread psock = %d csock=%d, retval=%d",psock?(*psock):-1,csock,retval);

    if(psock) free(psock); // Free the socket pointer , not the socket.

    if((retval != csock) && (csock>=0)){ // close connect socket if accept succeeded
        shutdown(csock,2);
        close(csock);
    }

    return retval;
}

score 7 · Answer 1 · edited Oct 07 '21 at 08:46

First, read this very similar question:
TCP Hole Punching

And read the part after EDIT2 (excerpt here). That's possibly the cause of failure.

Once the second socket has successfully bound, the behavior for all sockets bound to that port is indeterminate.

Linux has similar limitations, described in socket(7) under SO_REUSEADDR:

For AF_INET sockets this means that a socket may bind, except when there is an active listening socket bound to the address. When the listening socket is bound to INADDR_ANY with a specific port then it is not possible to bind to this port for any local address
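
To see that limitation in isolation, here is a minimal sketch. It assumes Linux, and the helper names bound_socket() and second_bind_errno() are made up for this illustration (they are not from the question's code). A listener is bound to (INADDR_ANY, some port) with SO_REUSEADDR, then a second SO_REUSEADDR socket tries to bind to the same port. Per the man page excerpt above, I'd expect the second bind to fail with EADDRINUSE.

```c
#include <errno.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: create a TCP socket with SO_REUSEADDR set and bind
 * it to (INADDR_ANY, port). Returns the fd, or -1 with errno set. */
static int bound_socket(unsigned short port)
{
    struct sockaddr_in addr;
    int one = 1;
    int fd = socket(AF_INET, SOCK_STREAM, IPPROTO_TCP);

    if (fd == -1)
        return -1;
    memset(&addr, 0, sizeof addr);
    addr.sin_family = AF_INET;
    addr.sin_port = htons(port);
    addr.sin_addr.s_addr = htonl(INADDR_ANY);
    if (setsockopt(fd, SOL_SOCKET, SO_REUSEADDR, &one, sizeof one) == -1 ||
        bind(fd, (struct sockaddr *)&addr, sizeof addr) == -1) {
        int saved = errno;   /* close() may overwrite errno */
        close(fd);
        errno = saved;
        return -1;
    }
    return fd;
}

/* Returns 0 if a second socket could be bound to the listener's port,
 * the errno of the failed attempt otherwise, or -1 if the listener
 * itself could not be set up. */
int second_bind_errno(void)
{
    struct sockaddr_in addr;
    socklen_t len = sizeof addr;
    int result = -1;
    int second;
    int listener = bound_socket(0);   /* port 0: let the kernel pick one */

    if (listener == -1)
        return -1;
    if (listen(listener, 1) == 0 &&
        getsockname(listener, (struct sockaddr *)&addr, &len) == 0) {
        second = bound_socket(ntohs(addr.sin_port));
        if (second == -1) {
            result = errno;
        } else {
            result = 0;
            close(second);
        }
    }
    close(listener);
    return result;
}
```

Calling second_bind_errno() and comparing the result with EADDRINUSE shows whether your system enforces the rule. With SO_REUSEPORT set on both sockets the bind can succeed, but then you are back in the "behavior is indeterminate" situation quoted above.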

I don't think that listening after instead of before will make a difference.

You don't have to try to open your connection twice.

Summary of the steps to establish a TCP connection: the left side (a client C connecting to a server S) is the usual case; the right side is the simultaneous connection of two peers A and B (what you're trying to do):

C                           A       B
  \ (SYN)                     \   /
   \                      (SYN)\ /(SYN)
     > S                        X
    /                          / \
   /(SYN+ACK)                 /   \
  /                       A <       > B
C<                            \   /
  \                   (SYN+ACK)\ / (SYN+ACK)
   \(ACK)                       X
    \                          / \
     \                        /   \
      > S                  A <     > B 
 ESTABLISHED               ESTABLISHED

references:
https://www.rfc-editor.org/rfc/rfc793#section-3.4 figure 8.

correction for fig 8 line 7:
https://www.rfc-editor.org/rfc/rfc1122#page-87 (section 4.2.2.10)

The difference is the simultaneous exchange of two SYNs and two SYN+ACKs instead of SYN/SYN+ACK/ACK. (In my tests with two Linux peers, usually only the "first" one answers with SYN+ACK, because it's never quite that simultaneous. It doesn't really matter.)

Both peers actively initiate a connection. They're not initially waiting for a connection and you don't have to call listen()/accept() at all. You don't have to use any threads at all.

Each peer should exchange (through S) their intended local port for the other to use (and with the help of S they'll exchange their public IP), with the assumption the port won't be translated.

Now you just try to connect with your 4-tuple of information. Each peer binds to (INADDR_ANY,lport) and connects to (peer_global_address,peer_global_port): A does this while B simultaneously does the same. At the end there is a SINGLE connection established between both sides.

Both NAT boxes will see the outgoing packets and prepare a reverse path.
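
As a rough sketch of what each peer would run (not a drop-in replacement for your function): punch_connect() is a name I made up, the SO_REUSEADDR call is my own assumption in case the same local port was just used to talk to S, and the peer's endpoint is whatever S handed out.

```c
#include <arpa/inet.h>
#include <errno.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: bind to (INADDR_ANY, local_port), then do a plain
 * blocking connect() to the peer's public endpoint. No listen(), no
 * accept(), no threads. Returns the connected fd, or -1 with errno set. */
int punch_connect(const char *peer_ip, unsigned short peer_port,
                  unsigned short local_port)
{
    struct sockaddr_in local, peer;
    int one = 1;
    int saved;
    int fd;

    memset(&peer, 0, sizeof peer);
    peer.sin_family = AF_INET;
    peer.sin_port = htons(peer_port);
    if (inet_pton(AF_INET, peer_ip, &peer.sin_addr) != 1) {
        errno = EINVAL;               /* not a dotted-quad IPv4 address */
        return -1;
    }

    fd = socket(AF_INET, SOCK_STREAM, IPPROTO_TCP);
    if (fd == -1)
        return -1;

    /* Assumption: lets us reuse the port that was used to reach S. */
    if (setsockopt(fd, SOL_SOCKET, SO_REUSEADDR, &one, sizeof one) == -1)
        goto fail;

    memset(&local, 0, sizeof local);
    local.sin_family = AF_INET;
    local.sin_port = htons(local_port);
    local.sin_addr.s_addr = htonl(INADDR_ANY);
    if (bind(fd, (struct sockaddr *)&local, sizeof local) == -1)
        goto fail;

    /* Blocking connect: the kernel sends the SYN and retransmits it on
     * its own schedule until the peer's SYN or SYN+ACK gets through. */
    if (connect(fd, (struct sockaddr *)&peer, sizeof peer) == -1)
        goto fail;

    return fd;

fail:
    saved = errno;                    /* close() may overwrite errno */
    close(fd);
    errno = saved;
    return -1;
}
```

With the ports from the netcat example further down, A would call punch_connect(B's public IP, 8888, 7777) while B calls punch_connect(A's public IP, 7777, 8888) at about the same time.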

Now what can go wrong?

A NAT box can't cope with the expected packet being a SYN instead of the more common SYN+ACK. Sorry, if that happens you might be out of luck. The TCP protocol allows for this case and supporting it is mandatory (RFC 1122 section 4.2.2.10 above). If the other NAT box is fine it should still work (once a SYN+ACK is sent back).
A NAT device (the one in front of the peer that sends its request too late, say NAT-B in front of B) answers with a RST packet instead of silently dropping the still-unknown packet as most NAT devices do. A receives the RST and aborts the connection. Then B sends its SYN, which meets a similar fate. The shorter the ping round-trip, the more likely you are to hit this. To avoid this, either:
- if you can control one of the NAT devices, have it drop the packet instead of sending a RST.
- be really synchronized (use NTP, exchange through S a precise time, with sub-millisecond resolution, at which both peers will act, or wait for the next multiple of 5 seconds to start)
- drop the outgoing RST packet with a custom (and temporary) firewall rule on A and/or B (better than dropping the incoming RST, because the NAT devices can decide to close the expectation when they see it)
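
For the "wait for the next multiple of 5 seconds" option, a minimal sketch could look like this. It assumes both peers have NTP-synchronized clocks, and the function names are made up for this illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <time.h>

/* Time left from `now` until the next wall-clock instant whose seconds
 * value is a multiple of step_sec. Exactly on a boundary, this is a
 * full step (we always wait for the NEXT multiple). */
struct timespec delay_to_next_boundary(struct timespec now, long step_sec)
{
    struct timespec delay;
    long long remaining_ns =
        (long long)(step_sec - (long)(now.tv_sec % step_sec)) * 1000000000LL
        - now.tv_nsec;

    delay.tv_sec = (time_t)(remaining_ns / 1000000000LL);
    delay.tv_nsec = (long)(remaining_ns % 1000000000LL);
    return delay;
}

/* Sleep until the next boundary of the (assumed synchronized) realtime
 * clock. Returns 0 on success, -1 with errno set on failure. */
int wait_for_next_boundary(long step_sec)
{
    struct timespec now, delay;

    if (clock_gettime(CLOCK_REALTIME, &now) == -1)
        return -1;
    delay = delay_to_next_boundary(now, step_sec);

    /* When a signal interrupts nanosleep(), it stores the time not yet
     * slept in its second argument, so we simply resume with that. */
    while (nanosleep(&delay, &delay) == -1) {
        if (errno != EINTR)
            return -1;
    }
    return 0;
}
```

Each peer would call wait_for_next_boundary(5) right before its connect(), so both SYNs leave within roughly the clock error of each other.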

All I can say is that I got TCP hole punching working reliably "by hand", simply using netcat between two peers set up like in your case.

E.g. on Linux with netcat: type these simultaneously on the two peers A and B, each in a private LAN behind its NAT device. With the usual NAT devices (which drop unknown packets) there's no need for perfect synchronization; even 5s between the two commands is fine (of course the first one will be waiting):

host-a$ nc -p 7777 public-ip-host-b 8888
host-b$ nc -p 8888 public-ip-host-a 7777

When it's done, both netcat instances have established the SAME SINGLE connection; there aren't two connections. No retry (no loop) was needed. Of course the programs will have used connect(), and the OS may have sent multiple SYN packets as an automatic retry mechanism during the connect() if the second command (and thus its connect()) is delayed. This happens at system/kernel level, not at your level.

I hope this helps you simplify your program and get it working. Remember: no need for listen() or accept(), no need to fork or use threads. You don't even need select(); just let connect() block normally, without O_NONBLOCK.

Firstly, thanks a lot A.B for the descriptive answer, which gave me a lot more insight. Update: I replaced the NAT box on one side, and things started working! I have yet to figure out what this NAT wasn't happy about. Some questions: you mentioned that listen is not needed at all. If my setup is A -- NAT-A -- internet -- B, I could see that accept succeeded in this case, and NOT having the listener thread didn't work out. Well, I can't assume that NATs will always be present. Do you think that this should work with connects only? Thanks again. Dinkar — dinkar_mdn, Sep 21 '16 at 05:15
Continuation from prev comment: if clients A and B are on different subnets (say, behind 2 NATs) within an organisation, will the same logic work if they attempt to connect on private endpoints? I've been trying it these days, with no success so far. — dinkar_mdn, Sep 21 '16 at 05:16
My explanation was mostly for you to understand that there must be no accept, only a connect. So indeed you must not use accept, as I already wrote. If you got an accept to work, it's perhaps because you succeeded in confusing a NAT box. But that's not the correct way. I tested what I wrote and it really works without accept and without any retry loop. Just a connect from each side. As for an organization, it's usually firewalled (a NAT is not a firewall, it happens to have some firewall properties) so maybe source port 80 as well as destination port on both sides... and there are logs... — A.B, Sep 22 '16 at 06:47
And don't assume that because some source says to use connect+accept, it's correct. If you read the comments on what you linked you'll see it doesn't work as intended — A.B, Sep 22 '16 at 06:52
Agreed. If there are NATs ( NAT-A and NAT-B) on both ends, accept is never required, and I have seen it in my program as well. But if one of NAT-A or NAT-B is NOT present, then "connect only on both ends" wasn't working reliably and it needed a "connect + accept". Anyway, I appreciate your answers, and thanks for taking time to reply to my queries. — dinkar_mdn, Sep 24 '16 at 05:13
It wasn't working reliably because you get a RST when there's no NAT (usually NAT boxes are configured to never reply with RST and just ignore/DROP). If you used a synchronized method, or dropped RSTs, it would be more reliable than your accept, which disrupts its own connect. In particular, dropping outgoing RSTs on the non-NAT side, which you surely have control over, would match the behaviour of a NAT box. By the way, if you feel my answer was a solution, feel free to accept it as such. — A.B, Oct 02 '16 at 11:17
