I'm trying to understand best practices for Golang concurrency. I read O'Reilly's book on Go's concurrency and then came back to the Golang Codewalks, specifically this example:
https://golang.org/doc/codewalk/sharemem/
This is the code I was hoping to review with you in order to learn a little bit more about Go. My first impression is that this code is breaking some best practices. This is of course my (very) unexperienced opinion and I wanted to discuss and gain some insight on the process. This isn't about who's right or wrong, please be nice, I just want to share my views and get some feedback on them. Maybe this discussion will help other people see why I'm wrong and teach them something.
I'm fully aware that the purpose of this code is to teach beginners, not to be perfect code.
Issue 1 - No Goroutine cleanup logic
func main() {
// Create our input and output channels.
pending, complete := make(chan *Resource), make(chan *Resource)
// Launch the StateMonitor.
status := StateMonitor(statusInterval)
// Launch some Poller goroutines.
for i := 0; i < numPollers; i++ {
go Poller(pending, complete, status)
}
// Send some Resources to the pending queue.
go func() {
for _, url := range urls {
pending <- &Resource{url: url}
}
}()
for r := range complete {
go r.Sleep(pending)
}
}
The main method has no way to cleanup the Goroutines, which means if this was part of a library, they would be leaked.
Issue 2 - Writers aren't spawning the channels
I read that as a best practice, the logic to create, write and cleanup a channel should be controlled by a single entity (or group of entities). The reason behind this is that writers will panic when writing to a closed channel. So, it is best for the writer(s) to create the channel, write to it and control when it should be closed. If there are multiple writers, they can be synced with a WaitGroup.
func StateMonitor(updateInterval time.Duration) chan<- State {
updates := make(chan State)
urlStatus := make(map[string]string)
ticker := time.NewTicker(updateInterval)
go func() {
for {
select {
case <-ticker.C:
logState(urlStatus)
case s := <-updates:
urlStatus[s.url] = s.status
}
}
}()
return updates
}
This function shouldn't be in charge of creating the updates channel because it is the reader of the channel, not the writer. The writer of this channel should create it and pass it to this function. Basically saying to the function "I will pass updates to you via this channel". But instead, this function is creating a channel and it isn't clear who is responsible of cleaning it up.
Issue 3 - Writing to a channel asynchronously
This function:
func (r *Resource) Sleep(done chan<- *Resource) {
time.Sleep(pollInterval + errTimeout*time.Duration(r.errCount))
done <- r
}
Is being referenced here:
for r := range complete {
go r.Sleep(pending)
}
And it seems like an awful idea. When this channel is closed, we'll have a goroutine sleeping somewhere out of our reach waiting to write to that channel. Let's say this goroutine sleeps for 1h, when it wakes up, it will try to write to a channel that was closed in the cleanup process. This is another example of why the writters of the channels should be in charge of the cleanup process. Here we have a writer who's completely free and unaware of when the channel was closed.
Please
If I missed any issues from that code (related to concurrency), please list them. It doesn't have to be an objective issue, if you'd have designed the code in a different way for any reason, I'm also interested in learning about it.
Biggest lesson from this code
For me the biggest lesson I take from reviewing this code is that the cleanup of channels and the writing to them has to be synchronized. They have to be in the same for{} or at least communicate somehow (maybe via other channels or primitives) to avoid writing to a closed channel.