I have some code that makes tons of simple HTTP GET requests.

The client that I am using is set up like this:

type client struct {
    *http.Client
    userAgent string // read by the CheckRedirect callback below
    // other stuff in here
}

func NewClient() client {
    var c client
    c.Client = &http.Client{
        CheckRedirect: func(req *http.Request, via []*http.Request) error {
            req.Header.Set("User-Agent", c.userAgent)
            return nil
        },
        Transport: &http.Transport{
            Dial: func(network, addr string) (net.Conn, error) {
                return net.DialTimeout(network, addr, 2*time.Second)
            },
            TLSHandshakeTimeout: 2 * time.Second,
            TLSClientConfig: &tls.Config{
                InsecureSkipVerify: true,
            },
        },
        Timeout: 2 * time.Second,
    }
    return c
}

As you can see, I am really trying to make sure I get timeouts on bad connections. I make the requests like this:

req, err := http.NewRequest("GET", url, nil)

Nothing out of the ordinary there. But after some time, goroutines like these build up and just block. Here is an example from the trace I got by forcing a panic:

goroutine 325 [select, 4 minutes]:
net/http.(*persistConn).writeLoop(0xc208075130)
    /usr/local/go/src/net/http/transport.go:945 +0x41d
created by net/http.(*Transport).dialConn
    /usr/local/go/src/net/http/transport.go:661 +0xcbc

and

goroutine 418 [IO wait, 4 minutes]:
net.(*pollDesc).Wait(0xc2083c7870, 0x72, 0x0, 0x0)
    /usr/local/go/src/net/fd_poll_runtime.go:84 +0x47
net.(*pollDesc).WaitRead(0xc2083c7870, 0x0, 0x0)
    /usr/local/go/src/net/fd_poll_runtime.go:89 +0x43
net.(*netFD).Read(0xc2083c7810, 0xc20857e000, 0x1000, 0x1000, 0x0, 0x7fbb634c3bb0, 0xc2084d87a0)
    /usr/local/go/src/net/fd_unix.go:242 +0x40f
net.(*conn).Read(0xc208116020, 0xc20857e000, 0x1000, 0x1000, 0x0, 0x0, 0x0)
    /usr/local/go/src/net/net.go:121 +0xdc
net/http.noteEOFReader.Read(0x7fbb634c7288, 0xc208116020, 0xc208075b28, 0xc20857e000, 0x1000, 0x1000, 0x678100, 0x0, 0x0)
    /usr/local/go/src/net/http/transport.go:1270 +0x6e
net/http.(*noteEOFReader).Read(0xc2083d84c0, 0xc20857e000, 0x1000, 0x1000, 0xc208017600, 0x0, 0x0)
    <autogenerated>:125 +0xd4
bufio.(*Reader).fill(0xc20858c0c0)
    /usr/local/go/src/bufio/bufio.go:97 +0x1ce
bufio.(*Reader).Peek(0xc20858c0c0, 0x1, 0x0, 0x0, 0x0, 0x0, 0x0)
    /usr/local/go/src/bufio/bufio.go:132 +0xf0
net/http.(*persistConn).readLoop(0xc208075ad0)
    /usr/local/go/src/net/http/transport.go:842 +0xa4
created by net/http.(*Transport).dialConn
    /usr/local/go/src/net/http/transport.go:660 +0xc9f

I've been watching netstat and tcpdump to see what they're actually getting stuck on, but that isn't proving very useful right now. Before jumping into the source or getting more diligent with my monitoring, I figured I'd toss this question out there. What is going on here?

Also, why are my timeouts not working? Is there another timeout that I need to set? As you can see, I just keep setting every one I can find. There is also the ResponseHeaderTimeout on the Transport, which maybe I should set too? I thought the Timeout in the http.Client struct was pretty solid, though, and would time out anything.

Lastly, is there a way to set the client port that I am missing so I can better monitor which connections are having an issue?

EDIT: For the record, I am also pretty certain I am reading and closing the response body for every request, unless some are somehow hanging from the timeout or something else I don't know of. If that is the only explanation someone sees, I will look again.

user3591723
  • Perhaps the answer to http://stackoverflow.com/questions/17948827/reusing-http-connections-in-golang?rq=1 may be of help. – Sridhar Aug 04 '15 at 07:12
  • Hm, I don't know, it doesn't even seem like there was a conclusion in that question other than to be sure you read the whole thing, and I am actually using the method that answer specifies `io.Copy(ioutil.Discard, r.Body)` before calling `Close()` just to be certain it is all read – user3591723 Aug 04 '15 at 09:17
  • Is your program actually blocked somewhere? This just looks like the keep alive connections. – JimB Aug 04 '15 at 10:35
  • I was just noticing that now... Looks like the problem might actually be with a `sync.WaitGroup` ;) – user3591723 Aug 04 '15 at 10:55

1 Answer

Turns out these weren't the problem. They appear to be just the read and write loops of idle keep-alive connections (although strangely I had KeepAlive: 0 set?). A misplaced Done() on a sync.WaitGroup was the real culprit. I guess I started to read every Wait() in the trace as being called by the http.Client instead of by the sync.WaitGroup.

user3591723
  • I assume when you say that you have `KeepAlive: 0`, you're referring to the setting in the network Dialer, since there's no setting like that in http. That is for tcp keepalive, which is something entirely different from http keepalive, which are just connections that can be reused. – JimB Aug 04 '15 at 13:50
  • Oh, thanks for that, didn't know that, I did mean the Dialer. – user3591723 Aug 05 '15 at 00:37