Max number of goroutines

Question

How many goroutines can I use painlessly? For example, Wikipedia says that in Erlang 20 million processes can be created without degrading performance.

Update: I've just investigated goroutine performance a little and got these results:

It looks like a goroutine's lifetime is longer than the time needed to calculate sqrt() 1000 times (~45µs for me); the only limitation is memory
A goroutine costs 4 to 4.5 KB

score 95 · Accepted Answer · 2016-05-09T06:49:03.620

If a goroutine is blocked, there is no cost involved other than:

memory usage
slower garbage-collection

The costs (in terms of memory and average time to actually start executing a goroutine) are:

Go 1.6.2 (April 2016)
  32-bit x86 CPU (A10-7850K 4GHz)
    | Number of goroutines: 100000
    | Per goroutine:
    |   Memory: 4536.84 bytes
    |   Time:   1.634248 µs
  64-bit x86 CPU (A10-7850K 4GHz)
    | Number of goroutines: 100000
    | Per goroutine:
    |   Memory: 4707.92 bytes
    |   Time:   1.842097 µs

Go release.r60.3 (December 2011)
  32-bit x86 CPU (1.6 GHz)
    | Number of goroutines: 100000
    | Per goroutine:
    |   Memory: 4243.45 bytes
    |   Time:   5.815950 µs

On a machine with 4 GB of memory installed, this limits the maximum number of goroutines to slightly less than 1 million.

Source code (no need to read this if you already understand the numbers printed above):

package main

import (
    "flag"
    "fmt"
    "os"
    "runtime"
    "time"
)

var n = flag.Int("n", 1e5, "Number of goroutines to create")

var ch = make(chan byte)
var counter = 0

func f() {
    counter++
    <-ch // Block this goroutine
}

func main() {
    flag.Parse()
    if *n <= 0 {
        fmt.Fprintf(os.Stderr, "invalid number of goroutines\n")
        os.Exit(1)
    }

    // Allow only 1 OS thread to execute Go code at a time
    runtime.GOMAXPROCS(1)

    // Make a copy of MemStats
    var m0 runtime.MemStats
    runtime.ReadMemStats(&m0)

    t0 := time.Now().UnixNano()
    for i := 0; i < *n; i++ {
        go f()
    }
    runtime.Gosched()
    t1 := time.Now().UnixNano()
    runtime.GC()

    // Make a copy of MemStats
    var m1 runtime.MemStats
    runtime.ReadMemStats(&m1)

    if counter != *n {
        fmt.Fprintf(os.Stderr, "failed to begin execution of all goroutines\n")
        os.Exit(1)
    }

    fmt.Printf("Number of goroutines: %d\n", *n)
    fmt.Printf("Per goroutine:\n")
    fmt.Printf("  Memory: %.2f bytes\n", float64(m1.Sys-m0.Sys)/float64(*n))
    fmt.Printf("  Time:   %f µs\n", float64(t1-t0)/float64(*n)/1e3)
}

Your conversion from ~4k per goroutine (this has changed from release to release; and you need to also account for the goroutine stack usage) to a maximum based on memory installed is flawed. The maximum would be based on the smaller of the addressable virtual memory (typically 2-3GB for 32bit OSes), or the physical memory *plus* available swap space, or the memory resource limits of the process (often unlimited). E.g. on a 64bit machine with sane swap setup the physical memory installed is irrelevant to any *limit* (but performance will decrease as swapping starts to occur). — Dave C, Jul 19 '15 at 15:56
I think this contains a race condition, since there is no explicit synchronization to make sure all goroutines have started before the counter is compared to `n`. Are you lucky every time? :) — Filip Haglund, Jan 21 '16 at 15:22
the go playground reports `2758.41 bytes` per goroutine, running go 1.5.1. — Filip Haglund, Jan 21 '16 at 15:25
As @FilipHaglund notes, numbers have changed over time; this is primarily due to starting stack size changing (4 KiB, then 8 KiB in 1.2, then 2 KiB in 1.4). — Nils von Barth, May 09 '16 at 04:42
Running Go 1.13 on an i7 9750H (per goroutine): Memory: 9068.51 bytes, Time: 3.460311 µs, still quite reasonable, can't think of any case where the goroutine itself will be a bottleneck rather than the logic it is performing. — Marco, Mar 25 '20 at 08:16
Increment counter in goroutine? I believe you will create more goroutines than n. — wanghq, Apr 13 '21 at 08:34
On Go version 1.20.1 - Memory: 2660.20 bytes - Time: 0.948240 µs — Arsham Arya, Apr 06 '23 at 11:39
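
The race that the comments above point out could be avoided with explicit synchronization. The following is a minimal sketch, not the answer author's code: `startBlocked` is a hypothetical helper that uses `sync.WaitGroup` and an atomic counter so the caller only proceeds once every goroutine has actually begun executing.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// startBlocked (hypothetical helper) launches n goroutines that each record
// that they have started and then block until release is closed. It returns
// the number of goroutines that began executing, after waiting for all of them.
func startBlocked(n int, release <-chan struct{}) int64 {
	var started int64
	var wg sync.WaitGroup
	wg.Add(n)
	for i := 0; i < n; i++ {
		go func() {
			atomic.AddInt64(&started, 1) // safe under concurrent increments
			wg.Done()                    // signal "I have started"
			<-release                    // block this goroutine
		}()
	}
	wg.Wait() // every goroutine has run at least up to wg.Done()
	return atomic.LoadInt64(&started)
}

func main() {
	release := make(chan struct{})
	got := startBlocked(100000, release)
	fmt.Println("goroutines started:", got)
	close(release) // let the blocked goroutines exit
}
```

With this structure the check `counter != *n` no longer depends on a single `runtime.Gosched()` call happening to schedule everything.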

Nils von Barth · Answer 2 · 2016-05-09T04:39:38.110

Hundreds of thousands, per Go FAQ: Why goroutines instead of threads?:

It is practical to create hundreds of thousands of goroutines in the same address space.

The test test/chan/goroutines.go creates 10,000 and could easily do more, but is designed to run quickly; you can change the number on your system to experiment. You can easily run millions, given enough memory, such as on a server.

To understand the max number of goroutines, note that the per-goroutine cost is primarily the stack. Per FAQ again:

…goroutines, can be very cheap: they have little overhead beyond the memory for the stack, which is just a few kilobytes.

A back-of-the-envelope calculation is to assume that each goroutine has one 4 KiB page allocated for the stack (4 KiB is a pretty uniform size), plus some small overhead for a control block (like a Thread Control Block) for the runtime; this agrees with what you observed (in 2011, pre-Go 1.0). Thus 100 Ki goroutines would take about 400 MiB of memory, and 1 Mi goroutines would take about 4 GiB of memory, which is still manageable on a desktop, a bit much for a phone, and very manageable on a server. In practice the starting stack has ranged in size from half a page (2 KiB) to two pages (8 KiB), so this is approximately correct.
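
The arithmetic above can be written out as a small sketch. `estimateBytes` is a hypothetical helper, and the 4 KiB stack with zero extra overhead is the simplifying assumption from the paragraph, not a measured value.

```go
package main

import "fmt"

// estimateBytes (hypothetical helper) returns a rough memory estimate for n
// goroutines, assuming each costs a starting stack plus a fixed overhead.
func estimateBytes(n, stackBytes, overheadBytes uint64) uint64 {
	return n * (stackBytes + overheadBytes)
}

func main() {
	const KiB = 1 << 10
	const MiB = 1 << 20
	// Assumed: one 4 KiB page per stack, control-block overhead ignored.
	for _, n := range []uint64{100 * KiB, 1 << 20} {
		b := estimateBytes(n, 4*KiB, 0)
		fmt.Printf("%d goroutines: about %d MiB\n", n, b/MiB)
	}
}
```

Changing the stack argument to 2 KiB or 8 KiB gives the range for the other starting stack sizes mentioned here.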

The starting stack size has changed over time; it started at 4 KiB (one page), then in 1.2 was increased to 8 KiB (2 pages), then in 1.4 was decreased to 2 KiB (half a page). These changes were due to segmented stacks causing performance problems when rapidly switching back and forth between segments ("hot stack split"), so the size was increased to mitigate this (1.2), then decreased when segmented stacks were replaced with contiguous stacks (1.4):

Go 1.2 Release Notes: Stack size:

In Go 1.2, the minimum size of the stack when a goroutine is created has been lifted from 4KB to 8KB

Go 1.4 Release Notes: Changes to the runtime:

the default starting size for a goroutine's stack in 1.4 has been reduced from 8192 bytes to 2048 bytes.

Per-goroutine memory is largely stack, and it starts low and grows so you can cheaply have many goroutines. You could use a smaller starting stack, but then it would have to grow sooner (gain space at cost of time), and the benefits decrease due to the control block not shrinking. It is possible to eliminate the stack, at least when swapped out (e.g., do all allocation on heap, or save stack to heap on context switch), though this hurts performance and adds complexity. This is possible (as in Erlang), and means you’d just need the control block and saved context, allowing another factor of 5×–10× in number of goroutines, limited now by control block size and on-heap size of goroutine-local variables. However, this isn’t terribly useful, unless you need millions of tiny sleeping goroutines.

Since the main use of having many goroutines is for IO-bound tasks (concretely to process blocking syscalls, notably network or file system IO), you’re much more likely to run into OS limits on other resources, namely network sockets or file handles: golang-nuts › The max number of goroutines and file descriptors?. The usual way to address this is with a pool of the scarce resource, or more simply by just limiting the number via a semaphore; see Conserving File Descriptors in Go and Limiting Concurrency in Go.
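
As a rough illustration of the semaphore approach, a buffered channel can cap how many goroutines run at once. This is a sketch under assumptions: `runLimited` and its peak-tracking bookkeeping are hypothetical names added for illustration, and the `time.Sleep` stands in for real IO.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// runLimited (hypothetical helper) calls work(i) for i in [0, n), each in its
// own goroutine, with at most limit of them running at the same time.
// It returns the highest number of goroutines observed running at once.
func runLimited(n, limit int, work func(i int)) int {
	sem := make(chan struct{}, limit) // buffered channel used as a counting semaphore
	var wg sync.WaitGroup
	var mu sync.Mutex
	running, peak := 0, 0
	for i := 0; i < n; i++ {
		sem <- struct{}{} // acquire a slot; blocks while limit goroutines are running
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			defer func() { <-sem }() // release the slot
			mu.Lock()
			running++
			if running > peak {
				peak = running
			}
			mu.Unlock()
			work(i)
			mu.Lock()
			running--
			mu.Unlock()
		}(i)
	}
	wg.Wait()
	return peak
}

func main() {
	peak := runLimited(20, 3, func(i int) {
		time.Sleep(time.Millisecond) // stand-in for a blocking syscall
	})
	fmt.Println("peak concurrent goroutines:", peak)
}
```

Here the limit would normally be chosen from the scarce resource (for example, the file descriptor limit) and not from the number of goroutines the runtime can handle.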

[Limiting concurrency in go](http://jmoiron.net/blog/limiting-concurrency-in-go/) is a very nice and simple example — gabuzo, May 13 '17 at 08:33

score 8 · Answer 3 · answered Dec 14 '11 at 19:01

That depends entirely on the system you are running on. But goroutines are very lightweight. An average process should have no problems with 100.000 concurrent routines. Whether this goes for your target platform is, of course, something we can't answer without knowing what that platform is.

answered Dec 14 '11 at 19:01 by jimt

Did you have no problems on an ARM based tablet? – peterSO Dec 14 '11 at 19:13
Since I don't have an ARM-based tablet, I couldn't say. The point still stands though. It's impossible to tell without knowing what the target system can do. – jimt Dec 14 '11 at 19:16
In other words, your claim "no problems with 100.000 concurrent routines" is meaningless without proper context. – peterSO Dec 14 '11 at 19:35
You are taking it out of context. The sentence reads 'An average process should have no problems with 100.000 concurrent routines'. – jimt Dec 14 '11 at 23:38

score 8 · Answer 4 · edited Jun 20 '20 at 09:12

To paraphrase, there are lies, damn lies, and benchmarks. As the author of the Erlang benchmark confessed,

It goes without saying that there wasn't enough memory left in the machine to actually do anything useful. (Source: stress-testing erlang)

What is your hardware, what is your operating system, where is your benchmark source code? What is the benchmark trying to measure and prove/disprove?

answered Dec 14 '11 at 19:04 by peterSO; edited Jun 20 '20 at 09:12

score 2 · Answer 5 · answered Apr 28 '14 at 17:08

Here's a great article by Dave Cheney on this topic: http://dave.cheney.net/2013/06/02/why-is-a-goroutines-stack-infinite

answered Apr 28 '14 at 17:08 by Travis Reeder

Note the linked article is a little out of date. Since Go1.2 there has been [`debug.SetMaxStack`](https://golang.org/pkg/runtime/debug/#SetMaxStack) to override the "new" default maximum per-goroutine stack sizes of 1 GB and 250 MB (on 64 bit and 32 bit systems respectively). I.e. goroutine stack sizes have **not** been infinite since Go1.2. – Dave C Jul 19 '15 at 16:04
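
For reference, a minimal sketch of calling `debug.SetMaxStack`, which takes the new limit in bytes and returns the previous one. The wrapper name `swapMaxStack` and the 64 MiB value are arbitrary choices for illustration only.

```go
package main

import (
	"fmt"
	"runtime/debug"
)

// swapMaxStack (hypothetical wrapper) sets the maximum size a single
// goroutine stack may grow to and returns the previous setting.
func swapMaxStack(bytes int) int {
	return debug.SetMaxStack(bytes)
}

func main() {
	// 64 MiB is an arbitrary example value, not a recommendation.
	prev := swapMaxStack(64 << 20)
	fmt.Printf("previous per-goroutine stack limit: %d bytes\n", prev)

	// Restore the earlier limit.
	swapMaxStack(prev)
}
```

This limit only caps how far one stack can grow; it does not change the starting stack size, so it has little effect on how many small goroutines fit in memory.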

score 0 · Answer 6 · edited Jun 20 '20 at 09:12

If the number of goroutines ever becomes an issue, you can easily limit it for your program:
See mr51m0n/gorc and this example.

Set thresholds on number of running goroutines

Can increase and decrease a counter when starting or stopping a goroutine.
It can wait for a minimum or maximum number of goroutines running, thus allowing to set thresholds for the number of gorc governed goroutines running at the same time.

score -2 · Answer 7 · answered May 18 '21 at 08:11

When the operation was CPU-bound, using more goroutines than the number of cores proved to gain nothing.

In any other case you will need to test yourself.
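
One way to test this yourself is to split a CPU-bound job across a configurable number of workers and compare timings. The sketch below is an assumption-laden example: `sumSquares` is a hypothetical workload, and `runtime.NumCPU()` is used as the worker count only as a starting point.

```go
package main

import (
	"fmt"
	"runtime"
	"sync"
)

// sumSquares (hypothetical workload) sums i*i for i in [0, n), splitting the
// range across the given number of worker goroutines.
func sumSquares(n, workers int) uint64 {
	if workers < 1 {
		workers = 1
	}
	partial := make([]uint64, workers) // one slot per worker, so no locking is needed
	var wg sync.WaitGroup
	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func(w int) {
			defer wg.Done()
			var s uint64
			for i := w; i < n; i += workers {
				s += uint64(i) * uint64(i)
			}
			partial[w] = s
		}(w)
	}
	wg.Wait()
	var total uint64
	for _, s := range partial {
		total += s
	}
	return total
}

func main() {
	workers := runtime.NumCPU()
	fmt.Println("workers:", workers)
	fmt.Println("sum:", sumSquares(1000000, workers))
}
```

Timing `sumSquares` with different `workers` values on the target machine shows where adding goroutines stops helping for that workload.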

answered May 18 '21 at 08:11 by Alberto Salvia Novella
