
I am using Netty on both the server and the client side to establish and control a websocket connection. On the server side I have an IdleStateHandler, which fires a user event when the channel's reader, writer, or both have been idle for a certain period of time. I originally configured it so that the writer idle event fired after 5 minutes of inactivity and the reader idle event after 6 minutes. On the writer idle event, the server sends a ping frame to the client; this resets the writer idle time, and the reader idle time is reset once the pong frame is received from the client.

The problem is that the Netty client seems to stop reading new frames after 5 minutes of being idle. I checked the channel's status in the client after that 5 minute idle period to see whether it was writable, registered, open, and active; all four were true, yet new frames were not being read. As a workaround, I changed the IdleStateHandler times on the server side to 3 minutes rather than 5, so that the client receives a ping frame and responds with a pong frame before it has been idle for 5 minutes.

But this does not solve the underlying problem. I want to understand and control when the client's reader goes idle, and to prevent future problems with lost or unread data. In the code below, the idle event handler closes the channel if no pong or heartbeat frame is received from the client. But since the client does not read new frames, it never gets the close frame: the server thinks the client is disconnected while the client thinks it is still connected, which obviously causes problems. Is there any way to gain more control over this magical 5 minute timeout on the client side using Netty? I could not find anything about it in the documentation or source.

Here is the related idle event handling code in the server:

private class ConnectServerInitializer extends ChannelInitializer<SocketChannel> {

    private final IdleEventHandler idleEventHandler = new IdleEventHandler();
    private final SslContext sslCtx;

    private ConnectServerInitializer(SslContext sslCtx) {
        this.sslCtx = sslCtx;
    }

    @Override
    public void initChannel(SocketChannel ch) throws Exception {
        ChannelPipeline pipeline = ch.pipeline();
        if (sslCtx != null) {
            pipeline.addLast(sslCtx.newHandler(ch.alloc()));
        }
        pipeline.addLast(new HttpServerCodec());
        pipeline.addLast(new HttpObjectAggregator(65536));
        pipeline.addLast(idleEventHandler.newStateHandler());
        pipeline.addLast(idleEventHandler);
        pipeline.addLast(getHandler());
    }

}

@Sharable
private class IdleEventHandler extends ChannelDuplexHandler {

    private static final String HEARTBEAT_CONTENT = "--heartbeat--";
    private static final int READER_IDLE_TIMEOUT = 200; // 20 seconds more than the writer timeout, to allow for the pong response
    private static final int WRITER_IDLE_TIMEOUT = 180; // NOTE: netty clients will not read frames after 5 minutes of being idle
    // This is a fallback for when clients do not support ping/pong frames
    private final AttributeKey<Boolean> USE_HEARTBEAT = AttributeKey.valueOf("use-heartbeat");

    @Override
    public void userEventTriggered(ChannelHandlerContext ctx, Object event) throws Exception {
        if (event instanceof IdleStateEvent) {
            IdleStateEvent e = (IdleStateEvent) event;
            Boolean useHeartbeat = ctx.attr(USE_HEARTBEAT).get();
            if (e.state() == IdleState.READER_IDLE) {
                if (useHeartbeat == null) {
                    logger.info("Client " + ctx.channel() + " has not responded to ping frame. Sending heartbeat message...");
                    ctx.attr(USE_HEARTBEAT).set(true);
                    sendHeartbeat(ctx);
                } else {
                    logger.warn("Client " + ctx.channel() + " has been idle for too long. Closing websocket connection...");
                    ctx.close();
                }
            } else if (e.state() == IdleState.WRITER_IDLE || e.state() == IdleState.ALL_IDLE) {
                if (useHeartbeat == null || !useHeartbeat) {
                    ByteBuf ping = Unpooled.wrappedBuffer(HEARTBEAT_CONTENT.getBytes());
                    ctx.writeAndFlush(new PingWebSocketFrame(ping));
                } else {
                    sendHeartbeat(ctx);
                }
            }
        }
    }

    private void sendHeartbeat(ChannelHandlerContext ctx) {
        String json = getHandler().getMessenger().serialize(new HeartbeatMessage(HEARTBEAT_CONTENT));
        ctx.writeAndFlush(new TextWebSocketFrame(json));
    }

    private IdleStateHandler newStateHandler() {
        return new IdleStateHandler(READER_IDLE_TIMEOUT, WRITER_IDLE_TIMEOUT, WRITER_IDLE_TIMEOUT);
    }
}
Jon McPherson
  • Are you testing the application with localhost ip addresses, or over the internet? – Ferrybig Feb 01 '16 at 08:57
  • @Ferrybig it works perfectly on localhost, but once I deploy to my webserver and test over public IP, this idle problem occurs – Jon McPherson Feb 01 '16 at 17:31
  • This hints at a firewall / NAT timeout rather than a code problem. Can you check the router's NAT timeout, if it is under your control? – Ferrybig Feb 01 '16 at 17:33

1 Answer

Your problem is related to your firewall's timeout. Some firewalls have an idle timeout of around 5 minutes, and once it is exceeded the connection is silently dropped. Because of this, both the client and the server need some sort of read timeout to detect a dropped connection, and the server, the client, or both need to send some sort of ping message. The firewall problem is less pronounced when you run your protocol over IPv6, as most IPv6 firewalls are mainly stateless and usually don't change the port of the connection, so a packet from the client reactivates the entry in the firewall again.
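
To make the "read timeout plus ping on both ends" idea concrete, here is a minimal, transport-agnostic sketch of the decision each side has to make. It is not Netty code: the class `KeepAliveMonitor`, its `Action` enum, and the caller-supplied clock in seconds are all hypothetical names chosen for illustration. The 180 and 200 second thresholds are the values from the question. In Netty, the same decisions could presumably be driven by an IdleStateHandler added to the client pipeline, the way the question's server already does.

```java
/**
 * Transport-agnostic sketch of a keepalive decision loop.
 * All names are hypothetical; times are plain seconds supplied by the caller.
 */
final class KeepAliveMonitor {

    /** What the caller should do after a periodic check. */
    enum Action { NONE, SEND_PING, CLOSE }

    private final long writerIdleSeconds;
    private final long readerIdleSeconds;
    private long lastRead;
    private long lastWrite;

    KeepAliveMonitor(long writerIdleSeconds, long readerIdleSeconds, long now) {
        if (writerIdleSeconds <= 0 || readerIdleSeconds <= writerIdleSeconds) {
            throw new IllegalArgumentException(
                    "reader timeout must exceed writer timeout so a pong has time to arrive");
        }
        this.writerIdleSeconds = writerIdleSeconds;
        this.readerIdleSeconds = readerIdleSeconds;
        this.lastRead = now;
        this.lastWrite = now;
    }

    /** Call whenever any frame (data, pong, heartbeat) arrives. */
    void onRead(long now) { lastRead = now; }

    /** Call whenever any frame (data or ping) is sent. */
    void onWrite(long now) { lastWrite = now; }

    /** Decide what to do at time 'now'. A silent peer takes priority over sending a ping. */
    Action check(long now) {
        if (now - lastRead >= readerIdleSeconds) {
            return Action.CLOSE;      // nothing heard for too long: assume the link was dropped
        }
        if (now - lastWrite >= writerIdleSeconds) {
            return Action.SEND_PING;  // keep the firewall/NAT entry alive and probe the peer
        }
        return Action.NONE;
    }

    public static void main(String[] args) {
        KeepAliveMonitor m = new KeepAliveMonitor(180, 200, 0);
        System.out.println(m.check(100)); // NONE
        System.out.println(m.check(180)); // SEND_PING
        m.onWrite(180);                   // ping sent
        m.onRead(181);                    // pong received
        System.out.println(m.check(300)); // NONE
        System.out.println(m.check(360)); // SEND_PING
        m.onWrite(360);                   // ping sent, but the connection was silently dropped
        System.out.println(m.check(381)); // CLOSE
    }
}
```

The point of the last step is that the side running this check closes the connection itself about 20 seconds after an unanswered ping, instead of waiting for a close frame that a dropped connection will never deliver. Both thresholds have to stay below the firewall's idle timeout for the pings to keep the entry alive.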

If your connections often sit idle for 5 minutes or more, you should consider whether the extra load from the websockets compares favorably to that of a simple HTTP polling loop running once a minute, as polling creates less memory strain on the server.

Ferrybig