
Why is for-in slower than while in Swift in debug mode? To be clear: yes, the code was run with no optimization.

⬇️ Below is a timing comparison of for-in and while with no optimization:

49999995000000 for-in -- time = 3.3352

49999995000000 while -- time = 0.3613

⬇️ But if I compile with optimization for speed:

49999995000000 for-in -- time = 0.0037

49999995000000 while -- time = 0.0035

What I'd like to know is: why is for-in slower than while with no optimization, and why are both for-in and while so fast with optimization?

import Foundation

func processTime(_ title: String, blockFunction: () -> ()) {
    print()
    let startTime = CFAbsoluteTimeGetCurrent()
    blockFunction()
    let processTime = CFAbsoluteTimeGetCurrent() - startTime
    print(title, " -- time = \(String(format : "%.4f",processTime))")
}

processTime("for-in") {
    var sum = 0
    for i in 0..<10000000 {
        sum += i
    }
    print(sum)
}

processTime("while") {
    var sum = 0
    var i = 0
    while i<10000000 {
        sum += i
        i += 1
    }
    print(sum)
}
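For anyone wanting to reproduce this, the two configurations can be built from the command line (assuming the code above is saved as `loops.swift`; the filename is just an example):

```shell
# Debug build, no optimization (-Onone is the default for debug):
swiftc -Onone loops.swift -o loops_debug && ./loops_debug

# Optimized for speed:
swiftc -O loops.swift -o loops_opt && ./loops_opt
```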

HyunSu

1 Answer


From a Swift perspective, your for loop actually translates to something like this:

let range = 0..<10000000
var iterator = range.makeIterator()
while let next = iterator.next() {
    ...
}

Notice that's a lot of calls to next on the range's iterator, which has its own state that has to be kept track of, and IndexingIterator.next calls a bunch of protocol methods; dispatching those takes some time too, as it has to look up the witness table. You can see exactly what calls Iterator.next makes here.
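To see this concretely, you can drive the iterator by hand. This is only a sketch, not the compiler's exact lowering, but in a debug build it should time close to the for-in version, since it makes the same IndexingIterator.next calls:

```swift
import Foundation

// Manually doing what for-in does under the hood:
// make an iterator from the range, then call next() until it returns nil.
var sum = 0
let range = 0..<10_000_000
var iterator = range.makeIterator()
while let i = iterator.next() {
    sum += i
}
print(sum) // 49999995000000
```

Each pass through this loop goes through the same protocol-dispatched next() call that the for-in loop pays for in debug mode.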

If you are in debug mode, none of that is going to be optimised.

Compare that to your while loop, which is basically: set a variable to 0, compare, do the thing in the loop, add 1 to it, repeat. Clearly this is much simpler than calling all those methods.

If you enable optimisations however, the compiler can see that the for loop is doing the same thing as the while loop anyway.
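In fact the optimiser can go further than just matching the while loop: since the body only accumulates the loop counter, the whole loop can be folded to the closed form of the sum 0 + 1 + ... + (n-1), which is n*(n-1)/2. This sketch just checks the arithmetic, not the compiler's actual output:

```swift
// The value both loops print, computed directly:
// sum of 0..<n equals n*(n-1)/2.
let n = 10_000_000
let closedForm = n * (n - 1) / 2
print(closedForm) // 49999995000000
```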


Because I find it interesting, I did a little time profile of a loop such as:

var s = ""
for i in 0...10000000 {
    s += "\(i)"
}

[Time Profiler screenshot: the call tree for the loop, dominated by IndexingIterator.next]

80% of the time is spent on next(), and look at how many things it does! My screenshot can't even contain everything. The string concatenating only takes up about 6% (not in the screenshot).

Sweeper
  • Thanks to you, I've now learned about makeIterator() and the very important concept of virtual methods you mentioned. Thank you so much! – HyunSu Apr 09 '21 at 13:34
  • @HyunSu I did a time profile for a simple loop. You can see exactly how much time is spent on dispatching the protocol methods. Look at the rows that say "protocol witness ..." – Sweeper Apr 09 '21 at 13:42
  • 1
    Oh my god! I didn't know that cool time profiler. I should learn that. Thank you. I really want to see that. It will be very useful for me! Thank you soooo much! – HyunSu Apr 09 '21 at 13:48
  • Would be interesting to see the difference when we compile with optimisation. Ideally, there would be none. – CouchDeveloper Apr 09 '21 at 13:56
  • 1
    You may also find it useful to run `swiftc -emit-sil -O `. This dumps out what Swift is "really" doing, at least at the level of the Swift Intermediate Language. Clang will generally optimize these things even further, but learning to read SIL output is a great skill for learning optimization. You may also be interested in exploring the even more massive difference in `(0..<10000000).reduce(0, +)` in non-optimized vs optimized. – Rob Napier Apr 09 '21 at 14:02
  • 1
    @RobNapier Oh, thank you!! Actually I already knew about SIL, but thanks to you, people who didn't know the concept can now understand what SIL is and why it is useful at compile time, and you even gave an interesting example. Thank you so much! I will have to learn more about SIL. – HyunSu Apr 09 '21 at 14:12
  • 1
    @RobNapier I was using godbolt to compare the assembly code. I saw the optimised version got rid of all the `IndexingIterator`s, but otherwise couldn’t read the asm. I never knew SIL was a thing! Thank you for telling me about that too! – Sweeper Apr 09 '21 at 14:23
  • I love godbolt so much. It gives a lot of insight into what's going on. Compare the `-O` version of the for-loop (https://godbolt.org/z/drETfhM7j) with the `-Ounchecked` version (https://godbolt.org/z/d4T7Yd5nc). The entire loop is precomputed and turned into a single `movabs r14, 49999995000000` (i.e. "just store the answer"). But in `-O` mode, it adds checks at every step for an overflow. (You can also fix that by using &+ instead of +.) – Rob Napier Apr 09 '21 at 14:56
  • @Sweeper Hi, my professor, Could you give me information about my new question? https://stackoverflow.com/questions/67032783/why-is-indexingiterator-next-using-dynamic-dispatch – HyunSu Apr 10 '21 at 12:20