5

I'm writing software in Go that does a lot of parallel computing. I want to collect data from worker threads and I'm not really sure how to do it in a safe way. I know that I could use channels but in my scenario they make it more complicated since I have to somehow synchronize messages (wait until every thread sent something) in the main thread.

Scenario

The main thread creates n Worker instances and launches their work() method in a goroutine so that the workers each run in their own thread. Every 10 seconds the main thread should collect some simple values (e.g. iteration count) from the workers and print a consolidated statistic.

Question

Is it safe to read values from the workers? The main thread will only read values and each individual thread will write it's own values. It would be ok if the values are a few nanoseconds off while reading.

Any other ideas on how to implement this in an easy way?

  • Not especially relevant to the question, just a note: go routines are _not_ synonymous with threads. – Jonathan Hall Mar 23 '18 at 08:03
  • Is it important that all the simple values you read are synchronized or is it ok if iteration count and say some sum are not from the same iteration of the worker, i.e. worker increments iteration count and sum, iteration count is read by parent, worker increments count and sum parent reads sum? – Emil L Mar 23 '18 at 08:15
  • @EmilH Yes, it would be ok if values are a bit off. –  Mar 23 '18 at 08:22
  • @LevelingUp As long as it is only for something like monitoring rough progress then you could just read the values. Be aware though that the values can not be trusted. If that is important you will need to do some sort of synchronization. – Emil L Mar 23 '18 at 08:33

1 Answers1

9

In Go no value is safe for concurrent access from multiple goroutines without synchronization if at least one of the accesses is a write. Your case meets the conditions listed, so you must use some kind of synchronization, else the behavior would be undefined.

Channels are used if goroutine(s) want to send values to another. Your case is not exactly this: you don't want your workers to send updates in every 10 seconds, you want your main goroutine to fetch status in every 10 seconds.

So in this example I would just protect the data with a sync.RWMutex: when the workers want to modify this data, they have to acquire a write lock. When the main goroutine wants to read this data, it has to acquire a read lock.

A simple implementation could look like this:

type Worker struct {
    iterMu sync.RWMutex
    iter   int
}

func (w *Worker) Iter() int {
    w.iterMu.RLock()
    defer w.iterMu.RUnlock()
    return w.iter
}

func (w *Worker) setIter(n int) {
    w.iterMu.Lock()
    w.iter = n
    w.iterMu.Unlock()
}

func (w *Worker) incIter() {
    w.iterMu.Lock()
    w.iter++
    w.iterMu.Unlock()
}

Using this example Worker, the main goroutine can fetch the iteration using Worker.Iter(), and the worker itself can change / update the iteration using Worker.setIter() or Worker.incIter() at any time, without any additional synchronization. The synchronization is ensured by the proper use of Worker.iterMu.

Alternatively for the iteration counter you could also use the sync/atomic package. If you choose this, you may only read / modify the iteration counter using functions of the atomic package like this:

type Worker struct {
    iter int64
}

func (w *Worker) Iter() int64 {
    return atomic.LoadInt64(&w.iter)
}

func (w *Worker) setIter(n int64) {
    atomic.StoreInt64(&w.iter, n)
}

func (w *Worker) incIter() {
    atomic.AddInt64(&w.iter, 1)
}
icza
  • 389,944
  • 63
  • 907
  • 827