
I have the following basic HTTP server in Go. For every incoming request it posts 5 outgoing HTTP requests, each of which takes roughly 3-5 seconds. I am not able to achieve more than 200 requests/second on an 8 GB RAM, quad-core machine.

package main

import (
    "flag"
    "fmt"
    "io/ioutil"
    "log"
    "net/http"
    "sync"
    "time"
)

// Job holds the attributes needed to perform unit of work.
type Job struct {
    Name  string
    Delay time.Duration
}

func requestHandler(w http.ResponseWriter, r *http.Request) {
    // Make sure we can only be called with an HTTP POST request.
    fmt.Println("in request handler")
    if r.Method != "POST" {
        w.Header().Set("Allow", "POST")
        w.WriteHeader(http.StatusMethodNotAllowed)
        return
    }

    // Set name and validate value.
    name := r.FormValue("name")
    if name == "" {
        http.Error(w, "You must specify a name.", http.StatusBadRequest)
        return
    }

    delay := time.Second * 0

    // Create Job and push the work onto the jobQueue.
    job := Job{Name: name, Delay: delay}
    //jobQueue <- job

    fmt.Println("creating worker")
    result := naiveWorker(name, job)
    fmt.Fprintf(w, "your task %s has been completed, here are the results: %s", job.Name, result)

}

func naiveWorker(id string, job Job) string {
    var wg sync.WaitGroup
    responseCounter := 0
    totalBodies := ""
    fmt.Printf("worker%s: started %s\n", id, job.Name)

    var urls = []string{
        "https://someurl1",
        "https://someurl2",
        "https://someurl3",
        "https://someurl4",
        "https://someurl5",
    }

    for _, url := range urls {
        // Increment the WaitGroup counter.

        wg.Add(1)
        // Launch a goroutine to fetch the URL.
        go func(url string) {
            // Decrement the counter when the goroutine completes.
            defer wg.Done()

            // Fetch the URL.
            resp, err := http.Get(url)
            if err != nil {
                fmt.Printf("got an error: %v\n", err)
            } else {
                defer resp.Body.Close()
                body, err := ioutil.ReadAll(resp.Body)
                if err == nil {
                    // NOTE: totalBodies and responseCounter are mutated from several
                    // goroutines without synchronization, which is a data race; see
                    // the channel-based sketch after the comments below.
                    totalBodies += string(body)
                }
            }
            responseCounter++
        }(url)
    }
    wg.Wait()
    fmt.Printf("worker%s: completed %s with %d calls\n", id, job.Name, responseCounter)
    return totalBodies
}

func main() {
    var (
        port = flag.String("port", "8181", "The server port")
    )
    flag.Parse()

    // Start the HTTP handler.
    http.HandleFunc("/work", func(w http.ResponseWriter, r *http.Request) {
        requestHandler(w, r)
    })
    log.Fatal(http.ListenAndServe(":" + *port, nil))
}

I have the following questions:

  1. The HTTP connections get reset when the number of concurrent threads goes above 1000. Is this acceptable/intended behaviour?

  2. If I write `go requestHandler(w, r)` instead of `requestHandler(w, r)`, I get `http: multiple response.WriteHeader calls`.

lambdaexpression
  • Check ulimit, maxfiles, and somaxconn. Possibly the system is running out of resources because of these limits. – Eugene Lisitsky Nov 20 '17 at 06:06
  • I concur with Eugene: 1024 is a typical limit on the number of open files (which, on UNIX, also includes sockets) on a commodity Linux-based OS. – kostix Nov 20 '17 at 06:40
  • Just a suggestion: your use case could actually use channels instead of wait groups without an issue (see the sketch after these comments). That would also prevent any mangled/malformed output string you might get due to race conditions. Also, to answer your question, the handler already runs inside a goroutine, so placing it inside another won't help. – omu_negru Nov 20 '17 at 06:41
  • If you're using Linux, the approach is to raise the so-called "hard" limit for the `nofile` parameter in `/etc/security/limits.conf`, setting it for the user and/or a group of users which includes the one used to run your server. Then, before starting the server, issue `ulimit -n hard` in the shell to bring the current "soft" limit (which has a sensible default, usually 1024, as was discussed) up to the "hard" one. The server spawned afterwards will inherit this setting and use it. Call `ulimit -n` or `ulimit -a` to see the current setting(s). – kostix Nov 20 '17 at 07:08
  • Pass it the `-H` option to see the configured hard limit(s) as by default it displays the soft ones. – kostix Nov 20 '17 at 07:08
  • Once you have this sorted, I'd recommend looking closely at the documentation on the `http.Transport` type, at its knobs controlling the keep-alive of idle HTTP connections. By default it keeps only 2 idle connections per destination host; you might want to tweak these settings: both the size of the idle connection pool and the timeout a connection is allowed to be kept there. This might improve your performance if you're pressing your HTTP peers sufficiently hard. – kostix Nov 20 '17 at 07:11
  • Thanks Eugene Lisitsky, changing the limit worked. kostix, will read up on http.Transport. – lambdaexpression Nov 20 '17 at 09:52
  • @user3451713 I wonder what limit you are getting now? – k1m190r Nov 20 '17 at 10:49
  • @biosckon Tested up to 10K concurrent requests and it worked fine. Surprisingly, the memory used was only ~120 MB. However, CPU was constantly at 50-60%. – lambdaexpression Nov 21 '17 at 11:56
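
A minimal sketch of the channel-based variant omu_negru suggests, reusing net/http and io/ioutil from the question's imports (the function name fetchAll is illustrative). Each goroutine sends its body over a channel and only the receiving goroutine appends to the result, so there is no shared mutable state and no race:

// fetchAll fans out one GET per URL and collects the bodies over a channel.
// Only this goroutine appends to the result, so no WaitGroup or mutex is needed.
func fetchAll(urls []string) string {
    bodies := make(chan string, len(urls))
    for _, url := range urls {
        go func(url string) {
            resp, err := http.Get(url)
            if err != nil {
                bodies <- "" // still send, so the receive loop terminates
                return
            }
            defer resp.Body.Close()
            body, err := ioutil.ReadAll(resp.Body)
            if err != nil {
                bodies <- ""
                return
            }
            bodies <- string(body)
        }(url)
    }

    totalBodies := ""
    for range urls {
        totalBodies += <-bodies // receiving here serializes the appends
    }
    return totalBodies
}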

2 Answers


An HTTP handler is expected to run synchronously, because the return of the handler function signals the end of the request. Accessing the http.Request or http.ResponseWriter after the handler returns is not valid, so there is no reason to dispatch the handler in a goroutine.

As the comments have noted, you can't open more file descriptors than the process ulimit allows. Besides increasing the ulimit appropriately, you should also put a limit on the number of concurrent requests that can be dispatched at once, for example as sketched below.
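
One common pattern for that is a buffered channel used as a counting semaphore. A minimal sketch wrapping the question's requestHandler (the capacity of 100 and the 503 fallback are illustrative choices, not prescribed by anything above):

// sem is a counting semaphore: each in-flight request holds one slot,
// so at most cap(sem) handlers run the outbound fan-out at the same time.
var sem = make(chan struct{}, 100)

func limitedHandler(w http.ResponseWriter, r *http.Request) {
    select {
    case sem <- struct{}{}: // acquire a slot
        defer func() { <-sem }() // release it when the handler returns
    default: // every slot is busy: shed load instead of piling up goroutines
        http.Error(w, "server busy", http.StatusServiceUnavailable)
        return
    }
    requestHandler(w, r) // the synchronous handler from the question
}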

If you're making many connections to the same hosts, you should also adjust your http.Transport accordingly. The default number of idle connections per host is only 2, so if you need more than 2 concurrent connections to a host, the extra connections won't be reused. See Go http.Get, concurrency, and "Connection reset by peer"

If you connect to many different hosts, setting Transport.IdleConnTimeout is a good idea to get rid of unused connections.
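
A sketch combining both suggestions (the values are illustrative starting points, not recommendations):

tr := &http.Transport{
    MaxIdleConnsPerHost: 100,              // the default is only 2
    IdleConnTimeout:     90 * time.Second, // drop idle connections to hosts we stopped talking to
}
client := &http.Client{Transport: tr}
// then call client.Get(url) in the workers instead of http.Get(url)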

And as always, on a long-running service you will want to make sure that timeouts are set for everything, so that slow or broken connections don't hold on to resources unnecessarily.
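
What that might look like on both sides of this server, as a sketch with illustrative values:

client := &http.Client{
    Timeout: 15 * time.Second, // bounds the whole outbound request, body read included
}

srv := &http.Server{
    Addr:         ":8181",
    ReadTimeout:  5 * time.Second,  // reading the incoming request
    WriteTimeout: 30 * time.Second, // must cover the 3-5s fan-out plus writing the reply
}
log.Fatal(srv.ListenAndServe())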

JimB

Q2: multiple response.WriteHeader calls: if you don't write a header yourself, Go does it for you. When you launch the handler in a goroutine, the wrapper function returns immediately, the server sees that no header has been written yet and writes one automatically, and then your goroutine writes it again.
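
A minimal sketch of how that plays out with the question's code:

http.HandleFunc("/work", func(w http.ResponseWriter, r *http.Request) {
    go requestHandler(w, r) // this function returns immediately...
    // ...so the server writes an implicit 200 header now; when the goroutine
    // later writes to w, the server logs "http: multiple response.WriteHeader calls".
})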

Q1: the HTTP connections get reset when the number of concurrent threads goes above 1000: goroutines are not system threads, which means you can run more goroutines than the number of threads your system can usually handle. In the worst case your requests run concurrently instead of in parallel. I don't see anything wrong in your code, which makes me think that the server you make the requests to is throttling you and dropping your requests, because you may be exceeding the maximum number of connections that server allows for one IP.

You can also modify the http.Transport parameters for your requests (see the docs) to see if this helps your situation with memory consumption and concurrent connections:

tr := &http.Transport{
    MaxIdleConns:       10,
    IdleConnTimeout:    30 * time.Second,
    DisableCompression: true,
}
client := &http.Client{Transport: tr}
resp, err := client.Get("https://example.com")
if err != nil {
    log.Fatal(err)
}
defer resp.Body.Close()
stefany