
At a basic level, I have a main routine that spawns multiple goroutines to process data. Every time a goroutine processes the data it sends back a struct of varying size (it contains slices and/or arrays allocated from within the goroutine each time).

The data isn't huge (say, a few megabytes) but in general is it more efficient (and is it safe) to transfer a pointer to the data versus a copy of it all? If the data structure is static and I transfer a pointer to it, there's a risk that the structure may change while I'm still processing the result of the previous invocation (if it's fully reallocated then perhaps that's not an issue).

MartyMacGyver
  • We need more details here: what does that structure look like, and why would you be modifying it after you sent it? Sending a pointer is safe; however, you will still end up with a race if you're modifying it from different goroutines. – OneOfOne Sep 26 '14 at 02:43
  • I'm actually reading it from the receiver, but if it's a pointer to a static structure I would expect a race condition to be a risk. If it's a pointer to something created fresh within the goroutine (e.g., a struct via new or an array/slice via make()), it should be handed off cleanly to the receiver (the sender wouldn't modify that particular pointer any further). – MartyMacGyver Sep 26 '14 at 03:21
  • MB sized structs would be very large. You'd want to use a pointer there. You can't have a variable size struct, though. If you're passing around a struct with a slice, and you're worried about modification, a pointer or copy isn't going to make a difference WRT shared mutability. – Dustin Sep 26 '14 at 04:00

2 Answers


It's OK and common to send pointers to values. If the value is large, sending a pointer to the value will be more efficient than sending the value. Run a benchmark to find out how large "large" is.
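
One rough way to run such a benchmark is sketched below. The `payload` type, its 4096-byte size, and the `sendValues`/`sendPointers` helpers are assumptions made up for illustration; they are not from the question. The sketch uses `testing.Benchmark` from a plain program, and the numbers it prints will vary by machine and payload size.

```go
package main

import (
	"fmt"
	"testing"
)

// payload is a hypothetical fixed-size value; change the array length to
// explore where copying starts to cost more than allocating.
type payload struct {
	data [4096]byte
}

// sendValues pushes n payloads through a channel by value and returns how
// many were received.
func sendValues(n int) int {
	ch := make(chan payload, 1)
	go func() {
		var p payload
		for i := 0; i < n; i++ {
			ch <- p // copies the whole struct into the channel
		}
		close(ch)
	}()
	count := 0
	for range ch {
		count++
	}
	return count
}

// sendPointers allocates a fresh payload per send and passes only the pointer.
func sendPointers(n int) int {
	ch := make(chan *payload, 1)
	go func() {
		for i := 0; i < n; i++ {
			ch <- &payload{} // new allocation, pointer-sized send
		}
		close(ch)
	}()
	count := 0
	for range ch {
		count++
	}
	return count
}

func main() {
	testing.Init() // registers testing flags when running outside "go test"
	byValue := testing.Benchmark(func(b *testing.B) { sendValues(b.N) })
	byPointer := testing.Benchmark(func(b *testing.B) { sendPointers(b.N) })
	fmt.Println("by value:  ", byValue)
	fmt.Println("by pointer:", byPointer)
}
```

Note that the pointer variant also pays for an allocation on every send, so the comparison reflects the "allocate fresh each time" pattern rather than the cost of the channel operation alone.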

The caveat is that you must prevent unsafe concurrent access to the value. Common strategies for preventing unsafe concurrent access are:

  • Pass ownership of the value from the sender to the receiver. The sender does not access the value after sending it. The receiver can do whatever it wants with the value.
  • Treat the value as read-only after sending. Neither the sender nor the receiver modifies the value after sending.
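
The first strategy, passing ownership, might look like the minimal sketch below. The `result` type and the `produce`/`consume` functions are hypothetical names chosen for illustration, standing in for whatever struct the goroutines actually build.

```go
package main

import "fmt"

// result is a hypothetical payload type standing in for the asker's struct.
type result struct {
	id     int
	values []int
}

// produce builds a fresh result on every iteration and hands it off.
func produce(n int, out chan<- *result) {
	for i := 0; i < n; i++ {
		r := &result{id: i, values: make([]int, 3)} // new allocation each time
		for j := range r.values {
			r.values[j] = i * j
		}
		out <- r
		// Ownership has moved to the receiver: r is not touched here again.
	}
	close(out)
}

// consume drains the channel and is free to mutate what it receives.
func consume(in <-chan *result) int {
	total := 0
	for r := range in {
		r.values[0] = r.id // safe: the receiver is now the only owner
		for _, v := range r.values {
			total += v
		}
	}
	return total
}

func main() {
	ch := make(chan *result)
	go produce(4, ch)
	fmt.Println(consume(ch))
}
```

The read-only strategy uses the same shape, except that `consume` would also refrain from writing to `r`, which allows the same pointer to be shared with several readers.
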
Simon Fox
  • Would creating a structure in the goroutine via make() each time (as I do already) and then shipping the pointer (instead of the actual data) satisfy passing the ownership? I would expect the pointer to be different on each call in that case. – MartyMacGyver Sep 26 '14 at 03:03
  • Yes, with new - for the post I've simplified my original problem which involved slices sent and parsed via reflection due to a special select construct... crossed my terms. :-D – MartyMacGyver Sep 26 '14 at 03:19
  • It might be a big structure... that's why I want to get more familiar with how Go manages pointers at the same time I learn the best practices. – MartyMacGyver Sep 26 '14 at 03:45
  • Also, for background, I was originally trying to pass a slice back through a channel within a struct and access it via reflection because I'm using an n-channel select handler for all this. It got complicated and I'm refactoring all that (this "make" was coming up quite a bit). – MartyMacGyver Sep 26 '14 at 03:48
  • @MartyMacGyver as long as you call `make` and return a new slice every time, it's fine. Also, you don't have to return a pointer to the slice: copying the slice value is safe, and the copy will point to the same data in memory. – OneOfOne Sep 26 '14 at 03:59
  • For general reference, some other types (`string`, `map`s, `interface`s) also contain pointers internally: see [Russ Cox's Go Data Structures](http://research.swtch.com/godata) and here's my general, rambly [SO answer about passing pointers vs. values](http://stackoverflow.com/questions/23542989/pointers-vs-values-in-parameters-and-return-values/23551970#23551970). – twotwotwo Sep 26 '14 at 05:16
  • @twotwotwo - Thanks! I upvoted your answer on that thread... very handy! – MartyMacGyver Sep 26 '14 at 08:25

From my understanding you're trying to do something like:

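package main

import "fmt"

// bigData stands in for the large result struct.
type bigData struct {
    id int
}

// watchHowISoar creates the channel once and sends a freshly allocated value per iteration.
func watchHowISoar() (ch chan *bigData) {
    ch = make(chan *bigData)
    go func() {
        for i := 0; i < 10; i++ {
            bd := &bigData{i}
            ch <- bd
            // As long as you don't modify bd inside this goroutine after sending it, you're safe.
        }
        close(ch)
    }()
    return
}

func main() {
    for iamaleafOnTheWind := range watchHowISoar() {
        fmt.Printf("%p\n", iamaleafOnTheWind)
    }
}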

And it is perfectly safe as long as you don't modify the sent data from the sender after you send it.

If you have doubts, try running it with `go run -race main.go`. The race detector isn't perfect, but it will usually detect things like that.
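
The same idea extends to the setup described in the question, where several goroutines report on one long-lived channel. The following is only a sketch under assumptions: the `report` type, the `runWorkers` function, and the worker and item counts are invented for illustration. Running it with `go run -race` is a reasonable way to check that nothing is shared by accident.

```go
package main

import (
	"fmt"
	"sync"
)

// report is a hypothetical result type; its slice is allocated per send.
type report struct {
	worker int
	data   []int
}

// runWorkers starts `workers` goroutines that each send `perWorker` freshly
// allocated reports on one shared channel. The channel is closed once every
// worker has finished.
func runWorkers(workers, perWorker int) <-chan *report {
	out := make(chan *report)
	var wg sync.WaitGroup
	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func(id int) {
			defer wg.Done()
			for i := 0; i < perWorker; i++ {
				// A new value per send: the worker never touches it again.
				out <- &report{worker: id, data: make([]int, 8)}
			}
		}(w)
	}
	go func() {
		wg.Wait()
		close(out)
	}()
	return out
}

func main() {
	seen := make(map[*report]bool)
	for r := range runWorkers(3, 4) {
		seen[r] = true // each received pointer refers to a distinct value
	}
	fmt.Println(len(seen))
}
```

Closing the channel from a separate goroutine after `wg.Wait()` lets the receiver use a plain `range` loop without knowing how many workers there are.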

OneOfOne
  • Firefly bonus points! Interesting - my idea is similar (thanks for a clean example! Still learning here), but while you're creating a new channel each time (if I'm reading it right), I'm using a persistent channel and sending a newly allocated data structure each time (so that the reader doesn't find the pointed-to data changed by the next call to the goroutine, possibly while they are still reading it). – MartyMacGyver Sep 26 '14 at 03:43
  • @MartyMacGyver no, the channel is created only once: `watchHowISoar` is called a single time, and then the for loop ranges over the returned channel. As long as you allocate the "data" anew before every send, you will be fine. – OneOfOne Sep 26 '14 at 03:56