
I wrote the same function with and without a worker pool, then wrote benchmark tests to see which one is faster. The result shows that the version with the worker pool takes longer than the one without.

Here is the result:

goos: linux
goarch: amd64
BenchmarkWithoutWorker-4        4561        228291 ns/op       13953 B/op       1744 allocs/op
BenchmarkWithWorker-4           1561        651845 ns/op       54429 B/op       2746 allocs/op

The worker pool looks simple, and I followed the example from this Stack Overflow question. Here are both versions, with and without the worker pool:

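 import (
     "errors"
     "fmt"
     "sync"
 )

 var wg sync.WaitGroup

 // The data comes from the DB; assume it has about 1000 rows.
 const dataFromDB int = 1000

 // Job and Result: definitions assumed from the fields used below.
 type Job struct {
     Id       int
     ValueJob string
 }

 type Result struct {
     Err error
 }

 // In the benchmark, numOfProduce is set to dataFromDB.
 func WithoutWorker(numOfProduce int) {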
     for i := 0; i < numOfProduce; i++ {
         if doSomething(fmt.Sprintf("data %d", i)) != nil {
             fmt.Println("error")
         }
     }
 }

 func WithWorker(numWorker int) {
     jobs := make(chan *Job, dataFromDB)
     result := make(chan *Result, 10)
     for i := 0; i < numWorker; i++ {
         wg.Add(1)
         go consume(i, jobs, result)
     }

     go produce(jobs)
     wg.Wait()
    
     // Later I might read the result channel here
     // and return any errors to the client.
 }

 func doSomething(str string) error {
     if str == "" {
         return errors.New("empty")
     }

     return nil
 }

 func consume(workerID int, jobs <-chan *Job, result chan<- *Result) {
     defer wg.Done()
     for job := range jobs {
         //log.Printf("worker %d", workerID)
         //log.Printf("job %v", job.ValueJob)
         err := doSomething(job.ValueJob)
         if err != nil {
             result <- &Result{Err: err}
         }
     }
 }

 func produce(jobs chan<- *Job) {
     for i := 0; i < dataFromDB; i++ { // start at 0 so all dataFromDB jobs are produced
         jobs <- &Job{
             Id:       i,
             ValueJob: fmt.Sprintf("data %d", i),
         }
     }
     close(jobs)
 }

Am I missing something in my worker pool?

The benchmark test code looks like what you find in tutorials out there :) just simple code that calls the functions, and I added b.ReportAllocs() as well.

Jonathan Hall
Pocket
1 Answer


If the work you split across several goroutines / workers costs less than the overhead of sending each job to a goroutine and receiving the result, then it is faster to do all the work on a single goroutine.

In your example you are doing (almost) no work:

func doSomething(str string) error {
     if str == "" {
         return errors.New("empty")
     }

     return nil
}

Splitting that up on multiple goroutines is going to slow things down.


Example to illustrate:

If you have a job that needs 5ns (nanoseconds) and you run it 1000 times, you get

0.005ms on a single core

If you distribute it across 10 cores, you add communication overhead to each job. Let's say the communication overhead is 1 microsecond (1000ns). Now you have 1000 jobs * (5ns + 1000ns) / 10 cores =

0.1005ms on 10 cores

This is just an example with made-up numbers and the math is not exact, but it should illustrate the point: communication has a cost, and it is only worth paying if that cost is (significantly) smaller than the cost of the job itself.

TehSphinX
  • The `doSomething()` function here is only an illustration of what I might do in `consume`. What I take from your answer is that I should move doSomething() into another goroutine, right? – Pocket Jan 26 '21 at 11:52
  • No, you should fill `doSomething` with actual work, ideally the work it should actually do. Only then can you see how much the worker approach will help to speed the work up. – TehSphinX Jan 26 '21 at 12:14
  • You are right, the result was far from what I expected: the worker pool is faster once `doSomething` does real work. But if `doSomething` always returns no error, it seems the version without workers wins? In my real case, `doSomething` gets device info and then publishes a message to pub/sub. I ran some tests with mocks in my code, and sending bulk notifications was faster without the worker pool. – Pocket Jan 26 '21 at 12:36
  • Distributing the work makes sense when each job either uses a lot of CPU or waits on something, such as an HTTP request or a database request. – TehSphinX Jan 26 '21 at 12:39
  • I see, got it. I will mark this answer as accepted if no other answers are posted. Thanks anyway. – Pocket Jan 26 '21 at 12:42