
I have a game server written in C++ that uses a network library built on Winsock on Windows. I've been stress-testing the server to see how many connections it can accept at a time. It works fine when I connect with my game clients, but the clients can no longer connect after I run the stress test described below.

For the stress test, I connected to my server about 1000 times using a simple program whose for loop opens a TCP connection to my game server and closes it right away. They all connect. Afterwards, I try to connect with my game, and it does not connect at all.

I checked the tcpaccept() function from the library (see below), and it produces no output. For some reason, accept() stops accepting connections after my "attack" of 1000 connections. What could make my server stop accepting connections?

Here's a summary of my loop that listens for connections, accepts them, and closes them:

bool serverIsOn = true;
double listen = tcplisten(12345, 30000, 1);
setnagle(listen, true);

...

while(serverIsOn){
    double playerSocket = tcpaccept(listen, 1);
    if(playerSocket > -1){
        cout << "Got a new connection, socket ID: " << playerSocket << endl;

        //add their sockID to list here!
        addSockIDToList(playerSocket);

    }

    //Loop through list of socks and parse their messages here..
    //If their message size == 0, we close their socket via closesocket(sockID);
    loopThroughSocketIdsAndCloseOnLeave();
}

cout << "Finished!" << endl;

Here are the definitions of tcplisten, tcpaccept, CSocket::CSocket(SOCKET), CSocket::tcplisten(...) and CSocket::tcpaccept(...):

double tcplisten(int port, int max, int mode)
{
    CSocket* sock = new CSocket();
    if(sock->tcplisten(port, max, mode))
        return AddSocket(sock);
    delete sock;
    return -1;
}

double tcpaccept(int sockid, int mode)
{
    CSocket*sock = (CSocket*)sockets.item(sockid);
    if(sock == NULL)return -1;
    CSocket*sock2 = sock->tcpaccept(mode);
    if(sock2 != NULL)return AddSocket(sock2);
    return -1;
}

...

CSocket::CSocket(SOCKET sock)
{
    sockid = sock;
    udp = false;
    format = 0;
}

bool CSocket::tcplisten(int port, int max, int mode)
{
    if((sockid = socket(AF_INET, SOCK_STREAM, IPPROTO_TCP)) == INVALID_SOCKET) return false;
    SOCKADDR_IN addr;
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = INADDR_ANY;
    addr.sin_port = htons(port);
    if(mode)setsync(1);
    if(bind(sockid, (LPSOCKADDR)&addr, sizeof(SOCKADDR_IN)) == SOCKET_ERROR)
    {
        closesocket(sockid);
        return false;
    }
    if(listen(sockid, max) == SOCKET_ERROR)
    {
        closesocket(sockid);
        sockid = INVALID_SOCKET;
        return false;
    }
    return true;
}


CSocket* CSocket::tcpaccept(int mode)
{
    if(sockid == INVALID_SOCKET) return NULL;
    SOCKET sock2;
    if((sock2 = accept(sockid, (SOCKADDR *)&SenderAddr, &SenderAddrSize)) != INVALID_SOCKET)
    {
        //This does NOT get output after that 1000-'attack' test.
        std::cout << "Accepted new connection!" << std::endl;
        CSocket*sockit = new CSocket(sock2);
        if(mode >=1)sockit->setsync(1);
        return sockit;
    }

    return NULL;
}

What can I do to figure out why accept() no longer accepts connections after my 1000-connection stress test? Does it have something to do with the way I close connections after they're finished? When I do that, all I do is call closesocket(sockID).

Please ask for any other code needed!

EDIT: I just noticed that my Java "stress-test" program throws an exception after it has connected around 668 times. Here's the exception:

Exception in thread "main" java.net.ConnectException: Connection refused: connect
    at java.net.DualStackPlainSocketImpl.connect0(Native Method)
    at java.net.DualStackPlainSocketImpl.socketConnect(DualStackPlainSocketImpl.java:79)
    at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:339)
    at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:200)
    at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:182)
    at java.net.PlainSocketImpl.connect(PlainSocketImpl.java:172)
    at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
    at java.net.Socket.connect(Socket.java:579)
    at java.net.Socket.connect(Socket.java:528)
    at java.net.Socket.<init>(Socket.java:425)
    at java.net.Socket.<init>(Socket.java:208)
    at sockettest.SocketTest.main(SocketTest.java:63)
Java Result: 1
Joe Bid
  • I'm not sure if this is the problem, but I also noticed that when someone connects to the server, the sockID increments by one each time. This library does not seem to recycle socket IDs (not sure if that's how Winsock should work). For example, if a player's sockID is 2 and they leave, the next player to join gets sockID 3 instead of 2. – Joe Bid May 23 '15 at 20:19
  • Unlike other platforms, Windows does not identify a socket using an ID number. A socket is an actual object in the kernel, and thus is represented using an object handle. That handle MAY be reused once a socket has been closed. It is a common mistake for people to forget to invalidate a socket handle (set it to `INVALID_SOCKET`) after closing the socket, then later they act on the not-invalid socket handle and end up acting on a completely different socket that happened to be using the same handle value. That is why you should ALWAYS invalidate a handle after you close it. – Remy Lebeau May 23 '15 at 20:35
  • `connect()` can fail with a "connection refused" error for several different reasons, but the most common (assuming a firewall is not blocking the connection) is that either the server socket is not listening on the port, or the server socket's backlog of pending connections is full (which would imply that `accept()` is not being called often enough, if at all). – Remy Lebeau May 23 '15 at 20:37
  • Hello again Remy, what is the correct code to invalidate a socket ID handle? I'm still trying to figure out how I'd do it with this library. Also, I'm calling accept() every iteration to see if a connection came through. It's definitely being called. And how do I check if the backlog is full, and what would be the correct way to empty it when I close a socket? – Joe Bid May 23 '15 at 21:08
  • I said how in my last comment (set the socket handle to `INVALID_SOCKET`), eg: `closesocket(sockid); sockid = INVALID_SOCKET;` You are doing that in `CSocket::tcplisten()` when `listen()` fails, but not when `bind()` fails. – Remy Lebeau May 23 '15 at 21:14
  • The backlog contains clients that are waiting to be accepted. `accept()` returns a pending client from the backlog. A socket operates in blocking mode by default, so your work loop can be blocked on `accept()` when there is no pending client, and thus `loopThroughSocketIdsAndCloseOnLeave()` would not be called in a timely manner, unless you use non-blocking sockets, or use `select()` to know when to call `accept()`, or move `loopThroughSocketIdsAndCloseOnLeave()` to another thread. – Remy Lebeau May 23 '15 at 21:18
  • Well I've now set the sockid to INVALID_SOCKET once bind() fails but unfortunately accept() still will not accept my game after my stress-test. I'm really not sure what to do at this point. I've also tried combinations of setting mode on tcplisten() and tcpaccept() to both 0 and 1 but those aren't working either. – Joe Bid May 23 '15 at 21:25
  • Let us [continue this discussion in chat](http://chat.stackoverflow.com/rooms/78614/discussion-between-joe-bid-and-remy-lebeau). – Joe Bid May 23 '15 at 21:33

2 Answers


Because your server side is closing the sockets, they are most likely sitting in TIME_WAIT for several minutes. Windows has various parameters controlling the maximum number of sockets and their various states. I am guessing your program starts working again after several minutes, and there may be some warnings in Event Viewer.

An alternative might be to simply ignore these sockets for several minutes and hope they go away, i.e. the client calls closesocket() when you don't respond at all, which means you do not incur TIME_WAIT. This often works, but not always. If they do not go away, you then call closesocket() on them slowly in the background.

If you really want to, though, you can reset the connection; see "TCP option SO_LINGER (zero) - when it's required" for details. Resetting connections is not normal, however, so definitely read widely about SO_LINGER and how TCP teardown works.

rlb
  • I actually waited about 20 minutes after my "attack" and I still couldn't connect to my server via my game client. I also left this out of my code above: if a client connects but does not send me a response within a set amount of time, I consider that client not part of the game and I ban them. However, they can still get a connection accepted via tcpaccept(). I just close their socket afterwards. How do I ignore them or reject them altogether from connecting via accept()? – Joe Bid May 23 '15 at 21:04
  • @JoeBid: Why would you "ban" a client that times out? What if their network connection simply went down and then came back up? In any case, you cannot reject a client connection before `accept()` has accepted it first. You will just have to close it immediately. The only way to reject a connection without `accept()` accepting it is to not use `accept()` at all, switch to `WSAAccept()` instead (which is a Windows-specific extension to WinSock). – Remy Lebeau May 23 '15 at 21:23
  • Oh no, I'm not banning clients that time out. I'm banning clients that do not return the responses my game client should be sending, such as my stress-test program. All they do is connect and close their connection. But can WSAAccept() tell me whether the game client connection is being refused, and why? What if the server can no longer get connections at all? Would WSAAccept() even get anything after my stress test? (Looking up how to use it now.) I'm really not sure what I would do with this new WSAAccept() function to solve this problem. – Joe Bid May 23 '15 at 21:27
  • Any chance you are closing the listen socket? It looks like it is in the same list as the normal sockets. After you've run a stress test, does netstat -na still show a socket listening on port 12345? If it doesn't, that would explain the "connection refused". – rlb May 23 '15 at 21:39
  • Hm well after I do my stress test port 12345 does not appear on that list anymore! Any ideas on why that might happen? I never close the listening socket until the program has ended (after the while loop). – Joe Bid May 23 '15 at 22:34
  • In your close routine, put in an explicit check for port 12345 and never close it. For it to be gone, there must be a path in the code that closes it; it won't disappear by itself. – rlb May 23 '15 at 22:41

It turns out this library has its own method of closing a socket:

int closesock(int sockid)
{
    CSocket*sock = (CSocket*)sockets.item(sockid);
    if(sock == NULL)return -1;
    delete sock;
    sockets.set((int)sockid, NULL);
    return 1;
}

So it looks up the CSocket object by its sockID in the library's list of sockets. If the sockID refers to a valid socket, it deletes the object and sets its slot in the list to NULL.

The problem was that I was only calling closesocket(sockID) instead of closesock(sockID), which performs the operations needed to close a socket.
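
To make the difference concrete, here is a small self-contained simulation. Nothing in it is the library's real code: `openHandles`, `fakeSocket`, `fakeClosesocket`, `Wrapper`, `registry`, `addSocket` and `closeSock` are hypothetical stand-ins, and it assumes the wrapper's destructor is what actually closes the OS handle (the CSocket destructor is not shown above). The point is that a library sockID is an index into the library's list, not an OS socket handle.

```cpp
#include <cstddef>
#include <memory>
#include <set>
#include <vector>

// Stand-in for the OS table of open socket handles (hypothetical).
std::set<int> openHandles;
int nextHandle = 100;

int fakeSocket()                  // plays the role of socket()/accept()
{
    int h = nextHandle;
    nextHandle += 4;
    openHandles.insert(h);
    return h;
}

void fakeClosesocket(int handle)  // plays the role of closesocket()
{
    openHandles.erase(handle);    // silently does nothing for unknown handles
}

// Plays the role of CSocket; assumed to close its handle when destroyed.
struct Wrapper {
    int handle;
    explicit Wrapper(int h) : handle(h) {}
    ~Wrapper() { fakeClosesocket(handle); }
};

// Plays the role of the library's `sockets` list: index == library sockID.
std::vector<std::unique_ptr<Wrapper>> registry;

int addSocket(int handle)
{
    registry.push_back(std::unique_ptr<Wrapper>(new Wrapper(handle)));
    return static_cast<int>(registry.size()) - 1;
}

// Plays the role of closesock(): destroy the wrapper and clear its slot.
int closeSock(int id)
{
    if (id < 0 || static_cast<std::size_t>(id) >= registry.size() || !registry[id])
        return -1;
    registry[id].reset();         // destructor closes the real handle
    return 1;
}
```

With `int id = addSocket(fakeSocket());`, the ID is 0 while the handle is 100. Calling `fakeClosesocket(id)` leaves handle 100 open, whereas `closeSock(id)` releases it. With real Winsock, a value passed to `closesocket()` that happens to equal some other open socket's handle would close that socket instead, which may be how the listening socket disappeared here.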

Thanks everyone for your help.

Joe Bid