We're developing an online game in which players communicate with the server over a persistent TCP connection. "Persistent" here means the connection lasts for the whole player session: if it is closed, the player is dropped from the game (though the client will automatically try to reconnect).
Problem
Of course, everything works fine from our office (connecting to both the testing and live servers), but our client reports that some players get disconnected very often (every few seconds), and that they experience it themselves too (though their offices are in the same building).
Question
How can I find out the cause of these disconnects? Is it because:
- Players have bad internet connections and it can't be helped.
- The distance between players and server (Turkey <-> Netherlands) is too great.
- Something is wrong with the server (a CentOS machine) or the datacenter.
- The server is overloaded (though it happens under low loads too).
- There is an error in our software.
- Or some other reason?
The software is written in Java. It logs when players are disconnected, and if it actively kicks them (e.g. for not sending keep-alive messages) it logs that too.
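To make the logging above more useful for diagnosis, the read loop could record how each connection ended, not just that it ended. The following is only a sketch, assuming the server uses blocking `java.net.Socket` streams; the class `DisconnectCause`, its `Cause` enum and both method names are hypothetical and not part of our actual code.

```java
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.net.SocketException;
import java.net.SocketTimeoutException;

// Hypothetical helper: classifies how a connection's read loop ended.
final class DisconnectCause {

    // Illustrative categories, not an exhaustive list.
    enum Cause {
        PEER_CLOSED,                 // read() returned -1: orderly close (FIN) was received
        READ_TIMEOUT,                // no data arrived within the SO_TIMEOUT set on the socket
        CONNECTION_RESET_OR_ABORTED, // SocketException, e.g. "Connection reset" (RST received)
        OTHER_IO_ERROR
    }

    // Maps the exception that ended the read loop to a category.
    // A null exception stands for a clean end of stream.
    static Cause classify(IOException e) {
        if (e == null || e instanceof EOFException) {
            return Cause.PEER_CLOSED;
        }
        if (e instanceof SocketTimeoutException) {
            return Cause.READ_TIMEOUT;
        }
        if (e instanceof SocketException) {
            return Cause.CONNECTION_RESET_OR_ABORTED;
        }
        return Cause.OTHER_IO_ERROR;
    }

    // Reads until the stream ends or fails and reports why it stopped.
    // A real server would pass the bytes to its protocol handler inside the loop.
    static Cause readUntilClosed(InputStream in) {
        byte[] buffer = new byte[4096];
        try {
            while (in.read(buffer) != -1) {
                // protocol handling would go here
            }
            return classify(null);
        } catch (IOException e) {
            return classify(e);
        }
    }
}
```

Calling `DisconnectCause.readUntilClosed(socket.getInputStream())` and logging the result next to the player id and timestamp would show whether the spurious disconnects are mostly orderly closes or resets. That alone doesn't identify the culprit, but it narrows down where to look.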
Known data
- Whenever a spurious disconnect is reported and I check the logs, I usually don't see that player being actively kicked by the server software; I only see that the connection was closed.
- There is an internal monitoring service that holds a number of localhost connections to the game server, the same way players do, and it doesn't get disconnected.
Others
There are many other online games like ours. How do they deal with this? (Unless the problem is in the server/datacenter, in which case the solution is obvious.)
- Do they use UDP? I know action games do, for speed, but I presume TCP is normal for slower games such as online poker? (Not that it would help us: our client software is written in Flash, which doesn't support UDP.)
- Is there some TCP tweaking that can be done to make it more lenient?
- Or do they get these disconnects as well, and just reconnect more transparently?
- Is there information about this on the web?
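For reference, this is the kind of per-connection TCP tweaking I have in mind on the Java side. It is only a sketch, assuming blocking `java.net.Socket` connections; the class `SocketTuning` and the 30-second value are hypothetical placeholders, not settings we actually use.

```java
import java.net.Socket;
import java.net.SocketException;

// Hypothetical helper: applies a few standard socket options to an accepted connection.
final class SocketTuning {

    // Assumed value for illustration; it should exceed the client's keep-alive interval.
    static final int READ_TIMEOUT_MS = 30_000;

    static void configure(Socket socket) throws SocketException {
        // Ask the OS to send TCP keep-alive probes on an idle connection.
        // The probe interval is an OS setting (2 hours by default on Linux).
        socket.setKeepAlive(true);

        // Disable Nagle's algorithm so small game messages are sent without delay.
        socket.setTcpNoDelay(true);

        // Make a blocking read() throw SocketTimeoutException after this much silence,
        // so a dead connection is noticed instead of blocking forever.
        socket.setSoTimeout(READ_TIMEOUT_MS);
    }
}
```

As far as I can tell, none of these options makes TCP more tolerant of packet loss; they only change how quickly a dead connection is noticed and how promptly small messages are sent, which is why I'm asking whether anything else can be tuned.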