9

I am rewriting an old system in Go. In the old system I measured the system load average to decide whether to increase the number of threads in my thread pool.

In Go, people don't use thread pools or pools of goroutines, because starting a goroutine is very cheap. Still, running too many goroutines is less efficient than running just enough of them to keep the CPU usage near 100%.

So: is there a way to know how many goroutines are ready to run (not blocked) but not currently running? In other words, is there a way to get the number of scheduled, runnable goroutines (the "run queue")?
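
For reference, a minimal sketch of a fixed-size goroutine pool (the thread-pool pattern translated to Go); the jobs channel and the processJob function are invented for illustration:

package main

import (
    "runtime"
    "sync"
)

// processJob stands in for whatever CPU-intensive work each job needs.
func processJob(job int) {
    _ = job * job
}

func main() {
    jobs := make(chan int)      // hypothetical job queue
    workers := runtime.NumCPU() // pool size tied to CPU count rather than load average
    var wg sync.WaitGroup

    for i := 0; i < workers; i++ {
        wg.Add(1)
        go func() {
            defer wg.Done()
            for job := range jobs { // each worker drains the shared queue
                processJob(job)
            }
        }()
    }

    for j := 0; j < 1000; j++ {
        jobs <- j
    }
    close(jobs) // workers exit once the queue is drained
    wg.Wait()
}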

skyde
  • 3
    It should only barely be the case that "running too many goroutines is less efficient than just enough to keep the CPU usage near 100%". Go rarely switches away from a running goroutine unless it's blocked waiting for I/O, a channel op, or a sync primitive, so starting lots of goroutines, with the OS thread count matching the CPU count via `runtime.GOMAXPROCS(runtime.NumCPU())`, needn't create much additional context-switch overhead. We might be able to help more with additional info on the workload: are your goroutines mostly spinning the CPU, or waiting on a DB, or channel ops, or...? – twotwotwo Nov 04 '13 at 23:51
  • Thanks user2714852, so you are saying that if GOMAXPROCS is set to 2 and I start 4 goroutines that never stop and never block, the runtime will only run the first 2 and never context-switch to the others? – skyde Nov 05 '13 at 17:39
  • 2
    In Go 1.1 that's exactly right: goroutine scheduling is purely cooperative and if there's an endless loop with no I/O, etc., it hogs the thread forever. That's talked about in [Go bug 543](https://code.google.com/p/go/issues/detail?id=543). (You can always call runtime.Gosched() to explicitly yield.) In Go 1.2rc3, ["The scheduler is invoked occasionally upon entry to a function."](http://tip.golang.org/doc/go1.2#preemption); the "occasionally" in that sentence led me to say Go "rarely" forces a switch. That's all I know; I couldn't glean much more from peeking at the Go source just now. – twotwotwo Nov 05 '13 at 20:17
  • 1
    You could track how many jobs each goroutine processes at runtime as a measure of per-goroutine liveness. – Daniel Williams Nov 05 '13 at 23:11
  • To answer twotwotwo's question: please assume the goroutines are doing mostly CPU-intensive work. – skyde Nov 05 '20 at 20:04
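
To illustrate the cooperative-scheduling point discussed in the comments above, here is a minimal sketch of a CPU-bound goroutine yielding explicitly with runtime.Gosched(); the loop bounds and the GOMAXPROCS value are arbitrary:

package main

import (
    "fmt"
    "runtime"
)

// spin burns CPU in a tight loop, yielding occasionally so that other
// goroutines get a chance to run even under cooperative scheduling (Go 1.1).
func spin(done chan<- struct{}) {
    for i := 0; i < 100000000; i++ {
        if i%1000000 == 0 {
            runtime.Gosched() // explicitly give up the thread
        }
    }
    done <- struct{}{}
}

func main() {
    runtime.GOMAXPROCS(2) // fewer OS threads than goroutines, as in the comment's scenario
    done := make(chan struct{})
    for i := 0; i < 4; i++ {
        go spin(done)
    }
    for i := 0; i < 4; i++ {
        <-done
    }
    fmt.Println("all four goroutines finished")
}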

2 Answers

10

Check out the runtime/pprof package.

To print "stack traces of all current goroutines" use:

pprof.Lookup("goroutine").WriteTo(os.Stdout, 1)

To print "stack traces that led to blocking on synchronization primitives" use:

pprof.Lookup("block").WriteTo(os.Stdout, 1)

You can combine these with the functions in the runtime package such as runtime.NumGoroutine to get some basic reporting.

This example deliberately creates many blocked goroutines and waits for them to complete. Every 5 seconds it prints the output of the block pprof profile, as well as the number of goroutines still in existence:

package main

import (
    "fmt"
    "math/rand"
    "os"
    "runtime"
    "runtime/pprof"
    "strconv"
    "sync"
    "time"
)

var (
    wg sync.WaitGroup
    m  sync.Mutex
)

// randWait grabs the shared mutex and sleeps while holding it, so the other
// goroutines pile up blocked on m.Lock() and show up in the block profile.
func randWait() {
    defer wg.Done()
    m.Lock()
    defer m.Unlock()
    // Sleep for a random duration between 1ms and 500ms.
    interval, err := time.ParseDuration(strconv.Itoa(rand.Intn(499)+1) + "ms")
    if err != nil {
        fmt.Fprintln(os.Stderr, err)
        return
    }
    time.Sleep(interval)
}

// blockStats periodically prints the block profile and the number of
// goroutines currently in existence.
func blockStats() {
    for {
        pprof.Lookup("block").WriteTo(os.Stdout, 1)
        fmt.Println("# Goroutines:", runtime.NumGoroutine())
        time.Sleep(5 * time.Second)
    }
}

func main() {
    rand.Seed(time.Now().Unix())
    runtime.SetBlockProfileRate(1) // record every blocking event in the "block" profile
    fmt.Println("Running...")
    for i := 0; i < 100; i++ {
        wg.Add(1)
        go randWait()
    }
    go blockStats()
    wg.Wait()
    fmt.Println("Finished.")
}

I'm not sure if that's what you're after, but you may be able to modify it to suit your needs.

Playground

Intermernet
  • +1 this seems to answer the question. I found this SO answer informative: http://stackoverflow.com/a/10096686/143225 – Brenden Nov 06 '13 at 00:16
  • 1
    The working sample program is awesome. It may also be worth testing the premise of the question, i.e., whether dynamically tuning goroutine count improves perf enough to be worth it (compared to just launching a ton of goroutines or some other simple strategy). The answer to that may depend on the app. But, either way, this seems to answer the question that was asked. – twotwotwo Nov 06 '13 at 08:16
  • 1
    @user2714852 I agree, I've not yet come across a situation where manually tuning the number of goroutines provides better performance than just launching as many as required and letting the runtime handle the scheduling. I'm sure there are examples, just not very common ones, and from what I can tell this situation should only get better in Go 1.2 and beyond. – Intermernet Nov 06 '13 at 08:58
  • 1
    "I've not yet come across a situation where manually tuning the number of goroutines provides better performance than just launching as many as required" This is very bad idea if any of your goroutines could be doing blocking filesystem or syscall work. Say you have a 6 stage task. Its better to create a worker pool for each stage that has runtime.NumCPU * X (some scaling factor) for each stage. Each stage then reading from a toDo channel until it is closed and puts onto a nextStage channel until it closes it when it toDo channel is closed. This is not for performance but to avoid crashing. – voidlogic Nov 08 '13 at 19:00
  • @voidlogic Very good example. I've yet to do much with Syscalls in Go and I usually use buffered I/O for any FS work so haven't yet hit these problems. – Intermernet Nov 08 '13 at 22:45
  • @Intermernet : The problem exists even for buffered I/O. Currently, when a goroutine blocks on an event other than synchronization or network I/O, it allocates an OS thread. Having hundreds of thousands of goroutines being promoted to threads will not be good for your system. Normally many goroutines are multiplexed over a fixed set of OS threads by the runtime. Syscalls can include things like DNS lookups too (not just syscall.* calls). – voidlogic Nov 08 '13 at 23:09
  • @voidlogic Thanks! I appreciate the detailed reply. So far most of my Go programs have been happy to load data, process it and then write it out again so most concurrent code happens on in-memory data. I'll keep your comments in mind when I move on to more concurrent i/o things. – Intermernet Nov 08 '13 at 23:24
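
A rough sketch of the per-stage worker-pool pattern described in voidlogic's comment above (only two stages instead of six; the stage functions and the scaling factor are invented for illustration):

package main

import (
    "fmt"
    "runtime"
    "sync"
)

// runStage starts a fixed pool of workers that read from in, apply fn, and
// send results to the returned channel, which is closed once in is closed
// and every worker has finished.
func runStage(workers int, in <-chan int, fn func(int) int) <-chan int {
    out := make(chan int)
    var wg sync.WaitGroup
    for i := 0; i < workers; i++ {
        wg.Add(1)
        go func() {
            defer wg.Done()
            for v := range in {
                out <- fn(v)
            }
        }()
    }
    go func() {
        wg.Wait()
        close(out)
    }()
    return out
}

func main() {
    const scale = 4 // the "X" scaling factor from the comment; value is made up
    workers := runtime.NumCPU() * scale

    toDo := make(chan int)
    go func() {
        for i := 0; i < 100; i++ {
            toDo <- i
        }
        close(toDo)
    }()

    stage1 := runStage(workers, toDo, func(v int) int { return v * 2 })   // e.g. parse
    stage2 := runStage(workers, stage1, func(v int) int { return v + 1 }) // e.g. transform
    for v := range stage2 {
        _ = v // e.g. write results out
    }
    fmt.Println("pipeline finished")
}
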
2

Is there a way to know how many goroutines are ready to run (not blocked) but not currently running?

You will be able (Q4 2014/Q1 2015) to try to visualize those goroutines with a new tracer being developed (Q4 2014): the Go Execution Tracer.

The trace contains:

  • events related to goroutine scheduling:
    • a goroutine starts executing on a processor,
    • a goroutine blocks on a synchronization primitive,
    • a goroutine creates or unblocks another goroutine;
  • network-related events:
    • a goroutine blocks on network IO,
    • a goroutine is unblocked on network IO;
  • syscalls-related events:
    • a goroutine enters into syscall,
    • a goroutine returns from syscall;
  • garbage-collector-related events:
    • GC start/stop,
    • concurrent sweep start/stop; and
  • user events.

By "processor" I mean a logical processor, unit of GOMAXPROCS.
Each event contains event id, a precise timestamp, OS thread id, processor id, goroutine id, stack trace and other relevant information (e.g. unblocked goroutine id).

https://lh5.googleusercontent.com/w0znUT_0_xbipG_UlQE5Uc4PbC8Mw1duHRLg_AKTOS4iS6emOD6jnQvSDACybOfCbuSqr2ulkxULXGOBQpZ2IejPHW_8NHufqmn8q5u-fF_MSMCEgu6FwLNtMvowbq74nA
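
To experiment with the tracer on your own program, capturing a trace with the runtime/trace package (available since Go 1.5) looks roughly like this sketch; the trace.out file name is arbitrary:

package main

import (
    "log"
    "os"
    "runtime/trace"
)

func main() {
    // "trace.out" is an arbitrary file name; any io.Writer works.
    f, err := os.Create("trace.out")
    if err != nil {
        log.Fatal(err)
    }
    defer f.Close()

    if err := trace.Start(f); err != nil {
        log.Fatal(err)
    }
    defer trace.Stop()

    // ... run the workload you want to inspect here ...
}

Running go tool trace trace.out then opens a viewer that, among other things, shows how long each goroutine spent waiting to be scheduled.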

VonC