61

I am confused over how Go handles non-blocking I/O. Go's APIs look mostly synchronous to me, and when watching presentations on Go, it's not uncommon to hear comments like "and the call blocks".

Is Go using blocking I/O when reading from files or the network? Or is there some kind of magic that re-writes the code when used from inside a goroutine?

Coming from a C# background, this feels very unintuitive, as in C# we have the await keyword when consuming async APIs, which clearly communicates that the API can yield the current thread and continue later inside a continuation.

TL;DR: Will Go block the current thread when doing I/O inside a goroutine, or will the code be transformed into a C#-like async/await state machine using continuations?

Aryan Beezadhur
Roger Johansson
  • A goroutine is very similar to a `Task` in C#, which is a higher-level abstraction over the threads themselves. Go may run a goroutine on one thread or on several over its lifetime, depending on what it executes. For example, when the goroutine performs I/O, the thread may leave it and pick up some other task while the goroutine waits for the response. So it is more or less exactly like a C# `Task`, just better. The `await` keyword is a workaround, since C# wasn't built to support asynchronous code from the beginning, but Go was. – OlegI Apr 14 '22 at 09:21

2 Answers

57

Go has a scheduler that lets you write synchronous-looking code: it does the context switching on its own and uses async I/O under the hood. So if you're running several goroutines, they might all run on a single system thread, and when your code blocks from the goroutine's point of view, the thread is not really blocked. It's not magic, but yes, it hides all of this from you.

The scheduler allocates system threads when they're needed, in particular during operations that really do block (file I/O, for example, or calls into C code). But if you're running a simple HTTP server, you can have thousands and thousands of goroutines served by just a handful of "real" threads.

You can read more about the inner workings of Go here.

Aryan Beezadhur
Not_a_Golfer
  • 6
    I would add that the Go runtime scheduler currently (Go 1.6 and below) multiplexes (epoll on Linux, IOCPs on Windows etc) only network I/O syscalls. All I/O syscalls which hit disk, serial etc occupy a single OS thread each. Whether this is good or bad is debatable in the Go developers' community. The current consensus seems to be that it would be nice to have general async I/O available to the user but from the practical standpoint it's not really *that* useful... – kostix Mar 20 '16 at 16:56
  • 4
    ...as in -- if you have 1000 goroutines writing to the same disk drive at the same time async I/O won't really help; use a dedicated writer and a buffered channel. On a side note: 3rd-party packages exposing the underlying OS's async/poller interface do exist. – kostix Mar 20 '16 at 16:58
  • I found a discussion about `file io epoll`: https://github.com/golang/go/issues/18507, and there is also a related commit: https://github.com/golang/go/commit/c05b06a12d005f50e4776095a60d6bd9c2c91fac. I think these two links will answer your question about `no blocking io on file and network, when golang makes thread blocking?` – nuclear Sep 16 '19 at 06:48
37

You should read @Not_a_Golfer's answer first, along with the link he provided, to understand how goroutines are scheduled. My answer is more of a deeper dive into network IO specifically. I assume you understand how Go achieves cooperative multitasking.

Go can, and does, offer only blocking calls, because everything runs in goroutines and they are not real OS threads. They're green threads. So you can have many of them all blocking on IO calls, and they will not eat all of your memory and CPU the way OS threads would.

File IO is just syscalls; Not_a_Golfer already covered that. Go will use a real OS thread to wait on the syscall and will unblock the goroutine when it returns. Here you can see the file read implementation for Unix.

Network IO is different. The runtime uses a "network poller" to determine which goroutine should be unblocked from an IO call. Depending on the target OS, it uses the available asynchronous APIs to wait for network IO events. The calls look blocking, but inside everything is done asynchronously.

For example, when you call read on a TCP socket, the goroutine first tries to read using a syscall. If nothing has arrived yet, it blocks and waits to be resumed. By blocking here I mean parking, which puts the goroutine in a queue where it awaits resuming. That's how a "blocked" goroutine yields execution to other goroutines when you use network IO.

```go
func (fd *netFD) Read(p []byte) (n int, err error) {
    if err := fd.readLock(); err != nil {
        return 0, err
    }
    defer fd.readUnlock()
    if err := fd.pd.PrepareRead(); err != nil {
        return 0, err
    }
    for {
        // Try a plain read syscall first.
        n, err = syscall.Read(fd.sysfd, p)
        if err != nil {
            n = 0
            // EAGAIN: no data is available yet.
            if err == syscall.EAGAIN {
                // Park the goroutine until the poller reports the
                // descriptor readable, then retry the read.
                if err = fd.pd.WaitRead(); err == nil {
                    continue
                }
            }
        }
        err = fd.eofError(n, err)
        break
    }
    if _, ok := err.(syscall.Errno); ok {
        err = os.NewSyscallError("read", err)
    }
    return
}
```

https://golang.org/src/net/fd_unix.go?s=#L237

When data arrives, the network poller returns the goroutines that should be resumed. You can see here the findrunnable function, which searches for goroutines that can be run. It calls the netpoll function, which returns the goroutines that can be resumed. You can find the kqueue implementation of netpoll here.
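From the programmer's side, the behavior described above can be observed with a small sketch like the following. It is an illustration rather than a proof of the mechanism: the function name `readWhileOthersRun`, the loopback address, and the message are assumptions for the example, and `GOMAXPROCS(1)` is used only to make it visible that a goroutine waiting in a read does not keep other goroutines from running.

```go
package main

import (
	"fmt"
	"io"
	"net"
	"runtime"
)

// readWhileOthersRun (hypothetical helper) leaves one goroutine waiting in a
// read on a loopback TCP connection while the calling goroutine keeps
// working, with a single P available to run Go code.
func readWhileOthersRun() (string, int, error) {
	prev := runtime.GOMAXPROCS(1)
	defer runtime.GOMAXPROCS(prev)

	ln, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		return "", 0, err
	}
	defer ln.Close()

	type result struct {
		msg string
		err error
	}
	got := make(chan result, 1)
	go func() {
		conn, err := ln.Accept()
		if err != nil {
			got <- result{err: err}
			return
		}
		defer conn.Close()
		buf := make([]byte, len("hello"))
		// Looks blocking; the goroutine waits here until data arrives.
		_, err = io.ReadFull(conn, buf)
		got <- result{string(buf), err}
	}()

	conn, err := net.Dial("tcp", ln.Addr().String())
	if err != nil {
		return "", 0, err
	}
	defer conn.Close()

	// Other work proceeds while the reader waits for data.
	ticks := 0
	for i := 0; i < 1000; i++ {
		ticks++
		runtime.Gosched() // give the reader goroutine a chance to start
	}

	if _, err := conn.Write([]byte("hello")); err != nil {
		return "", ticks, err
	}
	r := <-got
	return r.msg, ticks, r.err
}

func main() {
	msg, ticks, err := readWhileOthersRun()
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("did %d units of work before the reader received %q\n", ticks, msg)
}
```

The read call never returns early and never needs a callback; the goroutine simply continues on the next line once the data is there.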

As for async/await in C#: async network IO also uses asynchronous APIs (IO completion ports on Windows). When something arrives, the OS executes a callback on one of the thread pool's completion port threads, which puts the continuation on the current SynchronizationContext. In a sense there are some similarities (parking/unparking does look like calling continuations, but at a much lower level), but these models are very different, not to mention the implementations.

Goroutines by default are not bound to a specific OS thread; they can be resumed on any one of them, it doesn't matter. There are no UI threads to deal with. Async/await is specifically made for the purpose of resuming the work on the same OS thread using SynchronizationContext. And because there are no green threads or a separate scheduler, async/await has to split your function into multiple callbacks that get executed on the SynchronizationContext, which is basically an infinite loop that checks a queue of callbacks that should be executed. You can even implement it yourself, it's really easy.
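The "loop that checks a queue of callbacks" idea can be sketched in a few lines. The sketch is written in Go only to keep one language throughout this page; it is not how .NET implements SynchronizationContext, and the names `loop`, `post`, `run`, and `runDemo` are invented for the illustration. It shows a function split at its "await" point into two callbacks that both run on the same loop.

```go
package main

import "fmt"

// loop is a hypothetical stand-in for a single-threaded context that runs
// queued callbacks one at a time.
type loop struct {
	queue chan func()
}

func newLoop() *loop { return &loop{queue: make(chan func(), 16)} }

// post schedules a callback to run on the loop.
func (l *loop) post(f func()) { l.queue <- f }

// run executes callbacks in order until the queue is closed.
func (l *loop) run() {
	for f := range l.queue {
		f()
	}
}

// runDemo splits one logical function into two callbacks at its "await".
func runDemo() []string {
	l := newLoop()
	var trace []string // only touched by callbacks running on the loop

	l.post(func() {
		trace = append(trace, "before await")
		// The "async operation" runs elsewhere and posts the continuation.
		go func() {
			result := "data" // simulated I/O result
			l.post(func() {
				trace = append(trace, "after await: "+result)
				close(l.queue) // nothing left to do; stop the loop
			})
		}()
	})

	l.run()
	return trace
}

func main() {
	for _, step := range runDemo() {
		fmt.Println(step)
	}
}
```

With goroutines the same logic is one straight-line function, because the runtime parks and resumes the goroutine instead of requiring the code to be cut into callbacks.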

Jonnny
creker
  • 4
    I think there is a semantic problem with the word "block" here. If the goroutine yields and can be awakened later, then there has to be something inside that code that makes that work, e.g. continuation-passing style or something like that, no? So it acts as if it is blocking, but behind the scenes it yields execution and is later awakened and continues? I assume that if I have a never-ending for loop inside a goroutine, that goroutine could never yield, and the thread currently running it is blocked forever, right? If that is not the case then I am completely confused here. – Roger Johansson Mar 20 '16 at 18:28
  • 2
    You should read @Not_a_Golfer's answer first, along with the link he provided, to understand how goroutines are scheduled. My answer is more of a deeper dive into network IO specifically. Yes, the meaning of "block" depends on the context. From the programmer's point of view it does block: your code blocks and doesn't continue until the call returns. From the point of view of the runtime it yields execution. That's why I called it parking, which is a real term used in Go. It's cooperative multitasking, and an infinite loop does block the goroutine and the OS thread forever because it will never yield execution. – creker Mar 20 '16 at 18:41
  • @RogerAlsing yes, if a goroutine never does anything that "blocks", and never calls `runtime.Gosched` (which is an explicit scheduler yield) it will occupy its P indefinitely, preventing other goroutines from running on it. – hobbs Mar 20 '16 at 18:42
  • @RogerAlsing for more on that topic, http://stackoverflow.com/questions/35471480/does-go-block-on-processor-intensive-operations-like-node/35471842#35471842 – hobbs Mar 20 '16 at 18:45
  • 2
    And please explain -1. I understand that my answer can be confusing to someone who doesn't know how Go works inside. But I didn't plan to explain everything. I specifically chose networking IO which is implemented very differently. – creker Mar 20 '16 at 18:49
  • I am not the one downvoting answers here – Roger Johansson Mar 20 '16 at 18:52
  • @RogerAlsing, sorry, didn't mean to ask you personally. Too late to edit now. – creker Mar 20 '16 at 18:58
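The explicit yield mentioned in the comments, `runtime.Gosched`, can be tried with a small sketch like the one below. The function name `yieldUntilDone` is made up, and `GOMAXPROCS(1)` is set only so that both goroutines have to share one P. Note that the comments describe the runtime as of 2016: Go 1.14 and later can also preempt a goroutine asynchronously, so on current releases a tight loop is less likely to starve others even without the explicit yield.

```go
package main

import (
	"fmt"
	"runtime"
	"sync/atomic"
)

// yieldUntilDone (hypothetical helper) spins on a flag with a single P
// available, yielding on every iteration so the goroutine that sets the
// flag gets a chance to run. It returns the number of yields performed.
func yieldUntilDone() int {
	prev := runtime.GOMAXPROCS(1)
	defer runtime.GOMAXPROCS(prev)

	var done int32
	go func() { atomic.StoreInt32(&done, 1) }()

	yields := 0
	for atomic.LoadInt32(&done) == 0 {
		runtime.Gosched() // explicit scheduler yield
		yields++
	}
	return yields
}

func main() {
	fmt.Println("yields before the other goroutine ran:", yieldUntilDone())
}
```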