
I wrote a web app in Go. When it runs in production, some goroutines stay blocked. Here is the information (generated with pprof):

goroutine 792247 [chan receive, 948 minutes]:
database/sql.(*Tx).awaitDone(0xc4206e2b80)
    /usr/local/go/src/database/sql/sql.go:1440 +0x57
created by database/sql.(*DB).begin
    /usr/local/go/src/database/sql/sql.go:1383 +0x274

The goroutine has been waiting on the channel for 948 minutes, so something is clearly wrong. But the stack trace seems incomplete, and it's not enough for me to find the bug. (I want a stack trace that starts from my program.)

How can I get the full stack trace of this goroutine? Or are there other ways to debug this issue?

Update:

I've read the source code of database/sql/sql.go. It turns out that database/sql/sql.go:1440 runs in a new goroutine. The stack trace looks incomplete because the earlier frames belong to the parent goroutine.

My question should be: are there better ways to debug this issue?

Eagle
  • Try running `go run -race *.go` – Ari Seyhun Apr 19 '17 at 07:18
  • @Acidic I've already tried that. Maybe it's not some race condition. Thanks anyway. – Eagle Apr 19 '17 at 07:22
  • @Eagle [/database/sql/sql.go:1440](https://golang.org/src/database/sql/sql.go?#L1437) is waiting for a transaction to be committed or rolled back. You can check your code for transactions that are not resolved. – John S Perayil Apr 19 '17 at 07:41
  • @JohnSPerayil I've checked my code. There are a lot of APIs which use SQL transaction. But I cannot find one without `Rollback` or `Commit`. That's why I want to get the full stack traces. Thanks anyway. – Eagle Apr 19 '17 at 07:46
  • To be fair, the trace isn't incomplete. It's being called as `go tx.awaitDone()`. Each goroutine has its own stack, so that is the beginning of the stack for the goroutine you're examining. – Adrian Apr 19 '17 at 19:09
  • @Adrian Yes, you're right. I've updated the question. Now I'm trying to modify some go source code to track the bug. – Eagle Apr 20 '17 at 04:01

1 Answer


I don't think there is any way to get the parent goroutine's stack short of manually tracking each goroutine invocation and generating an ID for it.

In this specific case, the likely cause is a transaction that was never committed or rolled back: an error occurs and the function exits early without calling either `Commit` or `Rollback`.

A good template for avoiding this is to use `defer`.

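func (s Service) DoSomething() (err error) {
    tx, err := s.db.Begin()
    if err != nil {
        return
    }
    // The deferred function runs on every return path and inspects the
    // named result err to decide between Rollback and Commit.
    defer func() {
        if err != nil {
            tx.Rollback()
            return
        }
        err = tx.Commit()
    }()
    // Assign with "=" (not ":=") so the named result err is the one set.
    if _, err = tx.Exec(...); err != nil {
        return
    }
    if _, err = tx.Exec(...); err != nil {
        return
    }
    // ...
    return
}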

Code Reference

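PS: Beware of error shadowing. The deferred function reads the named result `err`, so an `err` redeclared with `:=` in an inner scope is invisible to it.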

John S Perayil
  • Thanks. Using "defer" is far better. I just added `Rollback` or `Commit` before each `return`. However, I suppose it should achieve the same effect as your code. – Eagle Apr 26 '17 at 05:37
  • @Eagle yes, defer produces cleaner code for the same functionality. I can't think of any other case for blocking goroutines in your case, will update the answer if I do. – John S Perayil Apr 26 '17 at 12:01