I run a client and a socket server, both written in Go (1.12), on macOS localhost.

The server calls SetKeepAlive and SetKeepAlivePeriod on the net.TCPConn.
The client sends a packet and then either closes the connection (FIN) or is abruptly terminated.

Tcpdump shows that even after the client closes the connection, the server keeps sending keep-alive probes.
Shouldn't it detect that the peer is "dead" and close the connection?

The question is generic; feel free to point out if I'm missing some basics.

```go
package main

import (
    "flag"
    "fmt"
    "net"
    "os"
    "time"
)

func main() {
    var client bool
    flag.BoolVar(&client, "client", false, "")
    flag.Parse()

    if client {
        fmt.Println("Client mode")
        conn, err := net.Dial("tcp", "127.0.0.1:12345")
        checkErr("Dial", err)

        written, err := conn.Write([]byte("howdy"))
        checkErr("Write", err)

        fmt.Printf("Written: %v\n", written)
        fmt.Println("Holding conn")

        time.Sleep(60 * time.Second)

        err = conn.Close()
        checkErr("Close", err)

        fmt.Println("Closed conn")

        return
    }

    fmt.Println("Server mode")
    l, err := net.Listen("tcp", "127.0.0.1:12345")
    checkErr("listen", err)
    defer l.Close()

    for {
        c, err := l.Accept()
        checkErr("accept", err)
        defer c.Close()

        tcpConn := c.(*net.TCPConn)
        err = tcpConn.SetKeepAlive(true)
        checkErr("SetKeepAlive", err)
        err = tcpConn.SetKeepAlivePeriod(5 * time.Second)
        checkErr("SetKeepAlivePeriod", err)

        b := make([]byte, 1024)

        n, err := c.Read(b)
        checkErr("read", err)

        fmt.Printf("Received: %v\n", string(b[:n]))
    }
}

func checkErr(location string, err error) {
    if err != nil {
        fmt.Printf("%v: %v\n", location, err)
        os.Exit(-1)
    }
}
```
surlac
  • Please share some code to show us how you open and close connections and how your server is configured. – georgeok Jun 12 '19 at 21:10
  • You never close the connection on the server side. Your `defer c.Close()` is to be run when `main()` returns. – zerkms Jun 12 '19 at 21:16
  • Updated with some code. Everything is pretty simple there, the only configuration on the server is turning on keep-alive and setting its period. – surlac Jun 12 '19 at 21:16
  • @zerkms, it's a persistent connection, so the point is to keep it open as long as possible, and keep-alive is supposed to identify whether it's been (abruptly) terminated from the other end. That's why I close it at the very end. – surlac Jun 12 '19 at 21:19
  • @surlac what's your question again then? The connection is not closed because you don't close it. "1) What is the reason that it keeps sending probes?" --- you don't close the connection, so it's open and keepalive packets are sent. "2) how to properly terminate connection from client-side on Go level?" --- both parties should close the connection. – zerkms Jun 12 '19 at 21:20
  • So I see that keep-alive packets travel from server to client during the 60s wait period, as expected. But if the connection is closed from the client side, the server doesn't notice it and keeps sending keep-alive packets to the same client port. The question is: why? – surlac Jun 12 '19 at 21:22
  • Because that's how TCP works: `FIN` from the client means "I won't send anything but can still receive" – zerkms Jun 12 '19 at 21:24
  • @zerkms, "both parties should close the connection" - this contradicts the purpose of keep-alive, because a connection can be broken without either party noticing for minutes or hours, which is the problem keep-alive tries to solve. – surlac Jun 12 '19 at 21:24
  • @surlac "this contradicts the purpose of keep-alive" --- it does not. What you observe works "as designed". – zerkms Jun 12 '19 at 21:25
  • @surlac If 'the point is to keep it open as long as possible', why is the client closing it? Your question doesn't make sense. – user207421 Jun 13 '19 at 01:21
  • @zerkms, keepalive in the code above doesn't detect dead peers, while filipe claims that it does for him. That is what I'm trying to understand. If you have any useful comments about it, feel free to chime in. – surlac Jun 14 '19 at 00:11
  • @user207421 it's a simplified example to check dead-peer detection; you can abruptly terminate the client to simulate a dead peer. – surlac Jun 14 '19 at 00:11
  • @surlac it definitely detects as long as the peer is really unavailable: after timeout the server sends `RST` – zerkms Jun 14 '19 at 00:54

1 Answer

The response to that question:

Sending keepalives is only necessary when you need the connection to stay open while idle. In that case there is a risk that the connection is broken without either side noticing, so keep-alive tries to detect broken connections.

If you had closed the connection on the server side with a proper c.Close(), keep-alive would not be triggered (you deferred it to the end of the main function).
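
As a rough illustration of closing on the server side, here is a minimal sketch that moves the per-connection work into a helper, so that `defer c.Close()` runs when that connection is finished rather than when `main` returns. The names `handleConn` and `demo` are hypothetical and not part of the original code, and the sketch assumes the server should keep reading until `Read` reports `io.EOF`.

```go
package main

import (
	"fmt"
	"io"
	"net"
	"time"
)

// handleConn (hypothetical helper) reads from c until the peer closes its
// side (io.EOF) or an error occurs, then closes c. It returns the total
// number of bytes read.
func handleConn(c net.Conn) (int, error) {
	defer c.Close() // runs when handleConn returns, not when main returns

	if tcpConn, ok := c.(*net.TCPConn); ok {
		if err := tcpConn.SetKeepAlive(true); err != nil {
			return 0, err
		}
		if err := tcpConn.SetKeepAlivePeriod(5 * time.Second); err != nil {
			return 0, err
		}
	}

	total := 0
	buf := make([]byte, 1024)
	for {
		n, err := c.Read(buf)
		total += n
		if err == io.EOF {
			return total, nil // the peer sent FIN: nothing more to read
		}
		if err != nil {
			return total, err // e.g. a reset or a keep-alive timeout
		}
	}
}

// demo (hypothetical) runs a client and a server in one process on an
// ephemeral loopback port and returns how many bytes the server read.
func demo() (int, error) {
	l, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		return 0, err
	}
	defer l.Close()

	type result struct {
		n   int
		err error
	}
	done := make(chan result, 1)
	go func() {
		c, err := l.Accept()
		if err != nil {
			done <- result{0, err}
			return
		}
		n, err := handleConn(c)
		done <- result{n, err}
	}()

	conn, err := net.Dial("tcp", l.Addr().String())
	if err != nil {
		return 0, err
	}
	if _, err := conn.Write([]byte("howdy")); err != nil {
		return 0, err
	}
	if err := conn.Close(); err != nil { // client sends FIN
		return 0, err
	}

	r := <-done
	return r.n, r.err
}

func main() {
	n, err := demo()
	fmt.Println(n, err)
}
```

In a long-running server one would typically call `go handleConn(c)` from the accept loop, so each connection is closed as soon as its own handler returns.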

If you test your server code, it will start sending keep-alive probes after the timeout you set.

You will notice that only after all the keep-alive probes have been sent (9 by default, from the kernel), with the interval between probes elapsed each time (8x), do you get an io.EOF error on the server-side Read (yes, the server stops sending)!

Currently the Go implementation behaves the same on Linux and OSX: it sets both TCP_KEEPINTVL and TCP_KEEPIDLE to the value you pass to the setKeepAlivePeriod function, so the behavior will depend on the kernel version.

```go
func setKeepAlivePeriod(fd *netFD, d time.Duration) error {
    // The kernel expects seconds so round to next highest second.
    d += (time.Second - time.Nanosecond)
    secs := int(d.Seconds())
    if err := fd.pfd.SetsockoptInt(syscall.IPPROTO_TCP, syscall.TCP_KEEPINTVL, secs); err != nil {
        return wrapSyscallError("setsockopt", err)
    }
    err := fd.pfd.SetsockoptInt(syscall.IPPROTO_TCP, syscall.TCP_KEEPIDLE, secs)
    runtime.KeepAlive(fd)
    return wrapSyscallError("setsockopt", err)
}
```

There has been an open request since 2014 to provide a way to set the keepalive time and interval separately.

filipe
  • +1. Appreciate the answer. How do you detect that it sent 8 times? Does it return io.EOF on `c.Read(b)`? I ran it on Debian: it sent probes 15 times with a 5 sec interval after 5 sec idle and then just stayed there; the server didn't get EOF. Similar behavior on macOS: 27 times with a 5 sec interval and 5 sec idle. Trying to understand what I'm doing differently. Thanks. – surlac Jun 12 '19 at 23:05
  • Yes, it returns io.EOF on c.Read(b). The number of keepalive probes is set at the kernel level. If you type `sudo sysctl -A | grep keep` you can read the values, and if you look at the Go [source code](https://github.com/golang/go/search?p=1&q=setKeepAlivePeriod&unscoped_q=setKeepAlivePeriod) you can see the implementation and the parameters used. – filipe Jun 12 '19 at 23:53
  • I've gone through the articles you added. Still wondering why `Read` never returns `EOF` in my case, even though I see the `FIN` arrive at the server. The only possible reason I see for the connection being explicitly terminated in your case is that someone is sending `RST` (e.g. a proxy). – surlac Jun 13 '19 at 23:58
  • @surlac that's by TCP design as well: to close a connection **both** parties must FIN + ACK, or either may RST. – zerkms Jun 14 '19 at 00:56
  • @surlac another thing worth mentioning: you only read from the connection **ONCE**, so no wonder you don't see `io.EOF` anywhere, given you never read again. – zerkms Jun 14 '19 at 00:58
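
To make the half-close point from the comments concrete, here is a small sketch, assuming a loopback listener on an ephemeral port. The client sends its FIN with `CloseWrite` and can still receive a reply, and the server only sees end-of-stream because it keeps reading. `halfCloseDemo` is a hypothetical name used for illustration.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
	"net"
)

// halfCloseDemo (hypothetical) shows that a FIN from the client closes only
// the client-to-server direction: the server can still send data back.
func halfCloseDemo() (string, error) {
	l, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		return "", err
	}
	defer l.Close()

	errc := make(chan error, 1)
	go func() {
		c, err := l.Accept()
		if err != nil {
			errc <- err
			return
		}
		defer c.Close() // the server's own FIN is sent here

		// io.Copy keeps calling Read until it sees io.EOF (the client's FIN).
		var in bytes.Buffer
		if _, err := io.Copy(&in, c); err != nil {
			errc <- err
			return
		}
		// The connection is still writable in the server-to-client direction.
		_, err = c.Write([]byte("got " + in.String()))
		errc <- err
	}()

	conn, err := net.Dial("tcp", l.Addr().String())
	if err != nil {
		return "", err
	}
	defer conn.Close()

	if _, err := conn.Write([]byte("howdy")); err != nil {
		return "", err
	}
	// Send FIN but keep the receiving side open.
	if err := conn.(*net.TCPConn).CloseWrite(); err != nil {
		return "", err
	}

	// Read until the server closes its side.
	var reply bytes.Buffer
	if _, err := io.Copy(&reply, conn); err != nil {
		return "", err
	}
	if err := <-errc; err != nil {
		return "", err
	}
	return reply.String(), nil
}

func main() {
	reply, err := halfCloseDemo()
	fmt.Println(reply, err)
}
```

The connection is fully torn down only once the server also closes its side, which is why a server that never calls `Close` keeps the socket, and its keep-alive timer, alive.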