2

I have a case where I am reading data from 2 different locations (ES and REDIS), I need to read a single value from the fastest source, thus I am firing 2 goroutines, one to get the data from ES, the other to get it from REDIS.

once data has been fetched from one of the goroutines, the other goroutine must be canceled completely not to waste CPU.

simplified:

func A(){
    go funcB(){

    }()

    go funcC(){

    }()

    data := <-channel // 
}

now once the data is received, funcA or funcB must be canceled, no matter what they were doing (I don't care for their output anymore, they are just wasting CPU)

what would be the most efficient way to do it? can it be done using channels only?

Jonathan Hall
  • 75,165
  • 16
  • 143
  • 189
Rami Dabain
  • 4,709
  • 12
  • 62
  • 106
  • Without seeing the code for each request, it's more likely that they are not using any CPU, and are waiting on network. How do you intend to cancel those? – JimB Sep 02 '16 at 13:28
  • @JimB No idea buddy, thats why My question is there, mind if i add your comment to the question? as it describes a part of my problem as well... – Rami Dabain Sep 02 '16 at 13:45
  • That was more of a rhetorical questions, since in effect it may not be feasible, required, or even more efficient to do so. This also depends on the server (I don't think either of those provide an API to cancel an in-progress request), and the client library (is there a useful hook to force the connection closed to begin with). – JimB Sep 02 '16 at 13:58
  • 1
    You may use timeouts with `select` to "move on", but you can't stop a goroutine from the outside: [**Cancel a blocking operation in Go**](http://stackoverflow.com/questions/28240133/cancel-a-blocking-operation-in-go). The goroutine (the code executed in the goroutine) has to support the cancellation. – icza Sep 02 '16 at 14:05

2 Answers2

7

The context package provides cancelation, timeout and deadline contexts for this purpose. Here you can see a cancelation example, and we wait for the slower goroutine to print the cancelled message:

ctx, cancel := context.WithCancel(context.Background())

// buffer the channel for extra results returned before cancelation
data := make(chan string, 2)

var wg sync.WaitGroup
wg.Add(1)
go func() {
    defer wg.Done()
    select {
    case <-time.After(100 * time.Millisecond):
        data <- "A complete"
    case <-ctx.Done():
        fmt.Println("A cancelled")
    }
}()

wg.Add(1)
go func() {
    defer wg.Done()
    select {
    case <-time.After(200 * time.Millisecond):
        data <- "B complete"
    case <-ctx.Done():
        fmt.Println("B cancelled")
    }
}()

resp := <-data
cancel()
fmt.Println(resp)
wg.Wait()

https://play.golang.org/p/vAhksjKozW

JimB
  • 104,193
  • 13
  • 262
  • 255
  • What if there were a time.sleep(1 minute/second) before data <- "B complete" and println("pased") after data <- "B complete" – Rami Dabain Sep 02 '16 at 13:55
  • @RonanDejhero: I'm not sure what you mean. You can try it yourself: https://play.golang.org/p/CUCgYeVHO6. This falls into the area of what it means to "cancel" a remote request. One would usually set the clients with a reasonable timeout so that the outstanding requests take care of themselves. This is also very dependent on what the client code looks like, and how you are handling the data. – JimB Sep 02 '16 at 14:01
  • the only time consuming code i found in the example is "time.After(XXX):" Not sure how can that be replaced with "connect to ES, do some logic on value and return value" – Rami Dabain Sep 02 '16 at 14:10
  • @RonanDejhero: that stands in for any operation you want, and you would return your value over a channel. Optionally, some code chooses to check the `Err()` value of the context between operations to determine if the request has been cancelled. – JimB Sep 02 '16 at 14:17
  • 2
    @RonanDejhero: To expand, say the request was http based, you could take the http.Request, and use [`WithContext`](https://golang.org/pkg/net/http/#Request.WithContext) to automatically cancel the http request when you cancel the context. As you can see, the actual cancelation code is going to be unique to the client library and codepath around the handling of data. – JimB Sep 02 '16 at 14:26
  • Managed to do it, this is a great way to do it as it also allows for timeout – Rami Dabain Sep 02 '16 at 14:42
  • If two inputs ready at the same time this sample code panics, (you need sync/Lock) see(run): https://play.golang.org/p/9CJ2_NhpWN –  Sep 02 '16 at 15:15
  • @Amd: good point. It's blocking on the second send to `data`, and in this case it's better to use a buffered channel to prevent blocking at rather than add extra synchronization. – JimB Sep 02 '16 at 15:21
2

You have some Options depending to your real use case:

1- Using two goroutines:

This needs sync/Lock:
Try this simulated sample (The Go Playground):

package main

import (
    "fmt"
    "math/rand"
    "sync"
    "time"
)

func main() {
    rand.Seed(time.Now().Unix())
    time.AfterFunc(time.Duration(rand.Intn(1000))*time.Millisecond, func() { ES <- 101 })
    time.AfterFunc(time.Duration(rand.Intn(1000))*time.Millisecond, func() { REDIS <- 102 })

    go B()
    go C()

    data := <-channel

    fmt.Println(data)

}

func B() {
    check := true
    data := 0
    for {
        select {
        case <-quit:
            return
        case data = <-ES: // receive data
        }
        if check {
            mx.Lock()
            //defer mx.Unlock()
            if mx.done {
                mx.Unlock()
                return
            }
            check = false
            close(quit)
            mx.done = true
            mx.Unlock()
        }
        fmt.Println("ES ready")
        channel <- data
    }
}

func C() {
    check := true
    data := 0
    for {
        select {
        case <-quit:
            return
        case data = <-REDIS: // receive data
        }
        if check {
            mx.Lock()
            //defer mx.Unlock()
            if mx.done {
                mx.Unlock()
                return
            }
            check = false
            close(quit)
            mx.done = true
            mx.Unlock()
        }
        fmt.Println("REDIS ready")
        channel <- data
    }
}

var (
    channel = make(chan int)
    ES      = make(chan int)
    REDIS   = make(chan int)
    quit    = make(chan struct{})
    mx      lockdown
)

type lockdown struct {
    sync.Mutex
    done bool
}

2- In this sample you just start one goroutine B or C:
see this pseudo code:

func main() { 
    go A()
    data := <-channel
    fmt.Println(data)
}

func A() {
    for{
        if ES ready 
            go B(data)
            return
        if REDIS ready
            go C(data)
            return
    }
}

You may start A goroutine, in A goroutine it detects which input is ready e.g. ES or REDIS, then starts B or C goroutine accordingly:

Try this simulated sample (The Go Playground):
AfterFunc is just for simulation, in real code you don't need it, it simulates random timing for one input.

package main

import (
    "fmt"
    "math/rand"
    "time"
)

func main() {
    rand.Seed(time.Now().Unix())
    time.AfterFunc(time.Duration(rand.Intn(1000))*time.Millisecond, func() { ES <- 101 })
    time.AfterFunc(time.Duration(rand.Intn(1000))*time.Millisecond, func() { REDIS <- 102 })
    go A()

    data := <-channel

    fmt.Println(data)

}

func A() {
    select {
    case data := <-ES:
        go B(data)
        return
    case data := <-REDIS:
        go C(data)
        return
    }
}

func B(data int) {
    for {
        fmt.Println("ES ready")
        channel <- data
        data = <-ES
    }
}
func C(data int) {
    for {
        fmt.Println("REDIS ready")
        channel <- data
        data = <-REDIS
    }
}

var (
    channel = make(chan int)
    ES      = make(chan int)
    REDIS   = make(chan int)
)

output from run 1:

REDIS ready
102

output from run 2:

ES ready
101

  • Where can i stop the process in "REDIS <- 102" ? once ES has returned ? – Rami Dabain Sep 02 '16 at 14:02
  • @Amd: option 1 does start separate goroutines, you're just hiding the `go` calls behind `AfterFunc`. – JimB Sep 02 '16 at 14:28
  • @JimB: `AfterFunc` is just for simulation, in real code you don't need it, it simulates random timing for one input. I will clarify this more. –  Sep 02 '16 at 14:32