I am new to Go. While building this proof of concept, I noticed some weird behaviour when running a for loop.

    package main

    import (
        "log"
        "strings"
        "time"
    )

    type data struct {
        elapseTime int64
        data       string
    }

    func main() {
        for i := 0; i < 10; i++ {
            c := make(chan data)
            go removeDuplicates("I love oranges LALALA I LOVE APPLES LALALA XD", c)
            log.Printf("%v", <-c)
        }
    }

    func removeDuplicates(value string, c chan data) {
        start := time.Now()
        var d = data{}
        var r string
        value = strings.ToLower(value)
        var m = make(map[string]string)
        splitVals := strings.Split(value, " ")
        for _, element := range splitVals {
            if _, ok := m[element]; !ok {
                m[element] = element
            }
        }
        // Map iteration order is unspecified in Go, so the word order varies between runs.
        for k := range m {
            r = r + k + " "
        }
        // Note: an empty cutset trims nothing, so the trailing space stays in d.data.
        d.data = strings.TrimRight(r, "")
        d.elapseTime = time.Since(start).Nanoseconds()
        c <- d
    }

Effectively, I am trying to remove duplicate words from a simple string and print the result along with the time taken. A loop starts a goroutine 10 times, each time waiting for the response to arrive via the channel.

    2019/05/24 00:55:49 {18060 i love oranges lalala apples xd }
    2019/05/24 00:55:49 {28930 love oranges lalala apples xd i }
    2019/05/24 00:55:49 {10393 i love oranges lalala apples xd }
    2019/05/24 00:55:49 {1609 oranges lalala apples xd i love }
    2019/05/24 00:55:49 {1877 i love oranges lalala apples xd }
    2019/05/24 00:55:49 {1352 i love oranges lalala apples xd }
    2019/05/24 00:55:49 {1708 i love oranges lalala apples xd }
    2019/05/24 00:55:49 {1268 apples xd i love oranges lalala }
    2019/05/24 00:55:49 {1736 oranges lalala apples xd i love }
    2019/05/24 00:55:49 {1037 i love oranges lalala apples xd }

Here's what I see: the first few iterations of the loop (whether it runs once or 100 times) are significantly slower than the remaining iterations. Is there a reason why this is so? (The execution time is in nanoseconds, by the way.)

Edit: removing the switch part since people are confused about the question.

Bocky
  • Just a tip: showing raw performance alone will probably not be enough to convince anyone to switch languages. I would recommend focusing on how easy it is to write common tasks compared to Java. – Icy Creature May 23 '19 at 17:27
  • This is not a very good benchmark; it doesn't actually measure much of anything, and it doesn't measure it very accurately. You should at least be using the benchmarking facility built into the `testing` package, which runs the benchmarks enough times to get a stable result. – Adrian May 23 '19 at 17:28
  • Raw performance is also probably not the thing to prioritize; performance between the two is roughly comparable. Go tends to use less memory, but that's also not incredibly relevant. Look at development ease, reading ease, learning ease, building ease & speed, deployment ease, and the standard library. – Adrian May 23 '19 at 17:29
  • Adrian, I think you misunderstood the point. I am not trying to prove the functionality of the goroutine. This is just an entry demo for a bigger demo project. The question here is why the logs show this behaviour when running the exact same routine. – Bocky May 23 '19 at 17:35

1 Answer

Concurrency is not parallelism. This particular use of a channel turns out to be pretty similar to just returning values from removeDuplicates, except there's extra overhead from two goroutines needing to coordinate over their use of a channel.

Specifically:

  • Each iteration of the loop has its own channel, and each channel is unbuffered: a send on it completes only once the receiver is ready, so exactly one value is handed over per iteration.
  • The loop cannot continue to the next iteration until all statements have executed, including the call to log.Printf that blocks until a value is received from the channel.
  • removeDuplicates measures how much wall-clock time has elapsed, not how much time was spent working on its problem. This is one of many reasons the comments are saying it's not a great benchmark in the first place.
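
To make the first two points concrete, here is a minimal sketch (not the asker's code) in which `dedupe` is a hypothetical, deterministic stand-in for removeDuplicates and `viaChannel` mirrors the loop body from the question. Because the caller blocks on the receive, the goroutine version produces the same result, in the same order, as a plain function call:

```go
package main

import (
	"fmt"
	"strings"
)

// dedupe is a hypothetical stand-in for removeDuplicates: it lowercases
// the input and keeps the first occurrence of each word, in order.
func dedupe(value string) string {
	seen := make(map[string]bool)
	var out []string
	for _, w := range strings.Fields(strings.ToLower(value)) {
		if !seen[w] {
			seen[w] = true
			out = append(out, w)
		}
	}
	return strings.Join(out, " ")
}

// viaChannel mirrors the loop body in the question: start a goroutine,
// then block on an unbuffered channel until it sends its result.
func viaChannel(value string) string {
	c := make(chan string)
	go func() { c <- dedupe(value) }()
	return <-c // the caller cannot proceed until the goroutine has sent
}

func main() {
	const in = "I love oranges LALALA I LOVE APPLES LALALA XD"
	for i := 0; i < 3; i++ {
		// Both paths yield the same string; the channel only adds coordination.
		fmt.Println(viaChannel(in) == dedupe(in), viaChannel(in))
	}
}
```

Only one worker goroutine is ever active at a time here, so nothing runs in parallel; the channel acts as a more expensive way of returning a value.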

Speculative: It's possible that in the first few iterations of the loop, the removeDuplicates goroutine initializes start and then yields execution to the main goroutine. The main goroutine then immediately checks the mutex on c, discovers it can't do anything yet, and yields back to the scheduler. All this checking and context-switching adds thousands of nanoseconds to the removeDuplicates goroutine's wall-clock execution time (caring about differences this small is typically a smell of microbenchmarking). After a few iterations, something (the Go runtime, perhaps) picks up on the fact that main is never able to make progress until removeDuplicates returns, and the context switch is avoided.
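
Whatever the real cause of the slow first iterations, the `testing` package suggested in the comments averages over many runs, so a handful of slow early calls barely moves the result. The following is only a sketch under assumptions: `dedupe` is a hypothetical stand-in for removeDuplicates, and `sink` and `benchDedupe` are names made up for illustration. It uses `testing.Benchmark` so that it can run as an ordinary program:

```go
package main

import (
	"fmt"
	"strings"
	"testing"
)

// dedupe is a hypothetical stand-in for removeDuplicates: it lowercases
// the input and keeps the first occurrence of each word, in order.
func dedupe(value string) string {
	seen := make(map[string]bool)
	var out []string
	for _, w := range strings.Fields(strings.ToLower(value)) {
		if !seen[w] {
			seen[w] = true
			out = append(out, w)
		}
	}
	return strings.Join(out, " ")
}

// sink keeps the result alive so the call is not treated as dead code.
var sink string

// benchDedupe runs dedupe b.N times; the testing package picks b.N so
// that the loop runs long enough to give a stable average.
func benchDedupe(b *testing.B) {
	const in = "I love oranges LALALA I LOVE APPLES LALALA XD"
	for i := 0; i < b.N; i++ {
		sink = dedupe(in)
	}
}

func main() {
	res := testing.Benchmark(benchDedupe)
	fmt.Printf("%d iterations, %d ns/op\n", res.N, res.NsPerOp())
}
```

In a real project this would more commonly live in a `_test.go` file as a `func BenchmarkXxx(b *testing.B)` and be run with `go test -bench=.`; the standalone form above is used only to keep the example self-contained.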

I know you're more interested in explanations than advice at this point, but I'd feel pretty irresponsible if I didn't point out that benchmarks comparing Go to Java already exist. Even if you want to write your own, I'd recommend using a similar approach: define the benchmark program in terms of what it needs to accomplish, and then use the best tools available in each language (or framework) to get the job done with good performance.

Jesse Amano