I'm practicing a challenge to calculate factorials by splitting calculations into 100 groups concurrently, I solved lots of issue on WaitGroups, but still in the calculateFactorial
function I got the deadlock on range over channel part.
Wish someone could point the issue here, thank you.
package main
import (
"fmt"
"sync"
)
func main() {
var wg sync.WaitGroup
wg.Add(2)
in := make (chan int)
out := make (chan float64)
out = calculateFactorial(genConcurrentGroup(in, &wg), &wg)
go func() {
in <- 10
close(in)
}()
fmt.Println(<-out)
wg.Wait()
}
//split input number into groups
//the result should be a map of [start number, number in group]
//this is not heavy task so run in one go routine
func genConcurrentGroup(c chan int, wg *sync.WaitGroup) chan map[int]int{
out := make(chan map[int]int)
go func() {
//100 groups
total:= <- c
wg.Done()
//element number in group
elemNumber := total / 100
extra := total % 100
result := make(map[int]int)
if elemNumber>0{
//certain 100 groups
for i:=1 ;i<=99;i++{
result[(i-1) * elemNumber + 1] = elemNumber
}
result[100] = extra + elemNumber
}else{
//less than 100
for i:=1;i<=total;i++{
result[i] = 1
}
}
out <- result
close(out)
}()
return out
}
//takes in all numbers to calculate multiply result
//this could be heavy so can do it 100 groups together
func calculateFactorial(nums chan map[int]int, wg *sync.WaitGroup) chan float64{
out := make(chan float64)
go func() {
total:= <- nums
wg.Done()
fmt.Println(total)
oneResult := make(chan float64)
var wg2 sync.WaitGroup
wg2.Add(len(total))
for k,v := range total{
fmt.Printf("%d %d \n",k,v)
go func(k int, v int) {
t := 1.0
for i:=0;i<v;i++{
t = t * (float64(k) + float64(i))
}
fmt.Println(t)
oneResult <- t
wg2.Done()
}(k,v)
}
wg2.Wait()
close(oneResult)
result := 1.0
for n := range oneResult{ //DEADLOCK HERE! Why?
result *= n
}
fmt.Printf("Result: %f\n",result)
out <- result
}()
return out
}
Update:
Thanks to Jessé Catrinck's answer which fixed the issue in the above code by simply change the oneResult
to a buffered channel. However in https://stackoverflow.com/a/15144455/921082 there's a quote
You should never add buffering merely to fix a deadlock. If your program deadlocks, it's far easier to fix by starting with zero buffering and think through the dependencies. Then add buffering when you know it won't deadlock.
So could anyone please help me figure out how to not to use buffered channel for this? Is it possible?
Furthermore, I did some research on what exactly causes a deadlock.
Some quote like from https://stackoverflow.com/a/18660709/921082,
If the channel is unbuffered, the sender blocks until the receiver has received the value. If the channel has a buffer, the sender blocks only until the value has been copied to the buffer; if the buffer is full, this means waiting until some receiver has retrieved a value.
Said otherwise :
when a channel is full, the sender waits for another goroutine to make some room by receiving
you can see an unbuffered channel as an always full one : there must be another goroutine to take what the sender sends.
So in my original situation, what is probably causing the deadlock is maybe :
the range over channel is not receiving ?
the range over channel is not receiving on a separated go routine. ?
the
oneResult
is not properly closed, so range over channel doesn't know where's the end?
for number 3, I don't know if there's anything wrong about closing the oneResult
before range over, since this pattern appears on many examples on the internet. If it is number 3, could it be something wrong in the wait group?
I got another article very similar to my situation https://robertbasic.com/blog/buffered-vs-unbuffered-channels-in-golang/, in its
second lesson learned, he uses a for { select {} }
infinite loop as an alternative to range over, it seems solved his problem.
go func() {
for{
select {
case p := <-pch:
findcp(p)
}
}
}()
Lesson number 2 — an unbuffered channel can’t hold on to values (yah, it’s right there in the name “unbuffered”), so whatever is sent to that channel, it must be received by some other code right away. That receiving code must be in a different goroutine because one goroutine can’t do two things at the same time: it can’t send and receive; it must be one or the other.
Thanks