Think of a large project that deals with tons of concurrent requests, each handled by its own goroutine. Suppose there is a bug in the code and one of these requests causes a panic due to a nil pointer dereference.

In Java, C# and many other languages, this would end up as an exception that stops the request without any harm to other healthy requests. In Go, it crashes the entire program.

AFAIK, I'd have to add a deferred recover() to every single goroutine I create. Is that the only way to prevent the entire program from crashing?

UPDATE: adding a recover() call for every goroutine creation seems OK. What about third-party libraries? If a third-party library creates goroutines without a recover() safety net, it seems there is NOTHING to be done.

Igor Gatis
  • Yes, pretty much. Note you can use `defer panicRecover()` and call `recover()` from there. – Martin Tournoij May 18 '18 at 20:38
  • Note that the stdlib http server already does this for you (though having it be the default is regarded by many to be a mistake) – JimB May 18 '18 at 20:44
  • It is worth finding the cause of the nil dereference since that is most likely a logic issue. – squiguy May 18 '18 at 21:02
  • Sure, but I'd rather see nil dereference as an error log and fix it ASAP without crashing over and over hurting unrelated requests. – Igor Gatis May 18 '18 at 21:21

2 Answers

If you go the defer-recover-all-the-things route, I suggest investing some time to make sure that a clear error message is collected, with enough information to act on it promptly.

Writing the panic message to stderr/stdout is not great, as it will be very hard to find where the problem is. In my experience the best approach is to invest a bit of time in getting your Go programs to handle errors in a reasonable way. For instance, errors.Wrap from "github.com/pkg/errors" lets you wrap errors and get a stack trace.
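
As a rough illustration of collecting "enough information", here is a minimal sketch that turns a panic into an error carrying the panic value and a stack trace. It uses the standard library's runtime/debug instead of pkg/errors, and runSafely is a hypothetical helper name, not something from the question or from any library.

```go
package main

import (
	"fmt"
	"log"
	"runtime/debug"
)

// runSafely is a hypothetical helper (an assumption for this sketch).
// It runs fn and converts any panic into an error that includes the
// panic value and the stack of the panicking goroutine.
func runSafely(name string, fn func()) (err error) {
	defer func() {
		if r := recover(); r != nil {
			// debug.Stack is called while the panicking frames are still
			// on the stack, so the trace points at the faulty line.
			err = fmt.Errorf("panic in %s: %v\n%s", name, r, debug.Stack())
		}
	}()
	fn()
	return nil
}

func main() {
	var cfg *struct{ Name string } // nil on purpose

	err := runSafely("handleRequest", func() {
		fmt.Println(cfg.Name) // nil pointer dereference
	})
	if err != nil {
		// In a real service this would go to your structured logger.
		log.Printf("recovered: %v", err)
	}
}
```

The named return value is what lets the deferred function report the failure to the caller; where the error ends up (log, metrics, error tracker) is up to you.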

Recovering from panics is often a necessary evil. Like you say, it's not ideal to crash the entire program just because one request caused a panic. In most cases recovering from panics will not backfire, but it is possible for a program to end up in an undefined, unrecoverable state that only a manual restart can fix. That being said, my suggestion in this case is to make sure your Go program exposes a way to create a core dump.

Here's how to write such a dump (the stack traces of all goroutines) to stderr when SIGQUIT is sent to the Go program (e.g. kill -QUIT pid):

// Requires imports: fmt, os, os/signal, runtime, syscall.
go func() {
    // Based on answers to this stackoverflow question:
    // https://stackoverflow.com/questions/19094099/how-to-dump-goroutine-stacktraces
    sigs := make(chan os.Signal, 1)
    signal.Notify(sigs, syscall.SIGQUIT)
    for {
        <-sigs

        fmt.Fprintln(os.Stderr, "=== received SIGQUIT ===")
        fmt.Fprintln(os.Stderr, "*** goroutine dump...")

        var buf []byte
        var bufsize int
        var stacklen int

        // Create a stack buffer of 1MB and grow it to at most 100MB if
        // necessary
        for bufsize = 1e6; bufsize < 100e6; bufsize *= 2 {
            buf = make([]byte, bufsize)
            stacklen = runtime.Stack(buf, true)
            if stacklen < bufsize {
                break
            }
        }
        fmt.Fprintln(os.Stderr, string(buf[:stacklen]))
        fmt.Fprintln(os.Stderr, "*** end of dump")
    }
}()
Tommaso Barbugli

There is no way to handle a panic without the recover function. A good practice is to wrap the functions you want to protect in a middleware-like function. Check out this snippet:

https://play.golang.org/p/d_fQWzXnlAm

CallMeLoki