5

I read the famous Why is it faster to process a sorted array than an unsorted array? and I decided to play around and experiment with other languages such as Swift. I was surprised by the run time differences between 2 very similar snippets of code.

In Swift one can access elements in an array either in a direct way or with a subscript while in a for-in loop. For instance this code:

for i in 0..<size {
    sum += data[i]
}

Could be written:

for element in data {
    sum += element
}

With size the data length and data an array of summable elements.

So, I just implemented in Swift (code bellow) the same algorithm as in the question I mentioned in the first paragraph and what surprised me is that the first method is roughly 5 times faster than the second method.

I don't really know the backstage subscript implementation but I thought that accessing directly the elements in a Swift for-in loop was just syntactic sugar.


Question

My question is what is the difference between the two for-in syntaxes and why it is faster to use subscript?

here is the detail of timers. I'm using Xcode 9.4.1 with Swift 4.1 on an early 2015 MacBook Air with a Commande Line Project.

// Using Direct Element Access
Elapsed Time: 8.506288427
Sum: 1051901000

vs

// Using Subscript
Elapsed Time: 1.483967902
Sum: 1070388000

Bonus question: why the execution is 100 times slower in Swift than in C++ (both executed on the same Mac in a n Xcode project)? For instance 100,000 repetitions in C++ take nearly the same time as 1,000 repetitions in Swift. My first guess is that Swift is a higher level language than C++ and that Swift operates more safety checks for instance.


Here is the Swift code I used, I only modified the second nested loop:

import Foundation
import GameplayKit

let size = 32_768
var data = [Int]()
var sum  = 0
var rand = GKRandomDistribution(lowestValue: 0, highestValue: 255)

for _ in 0..<size {
    data.append(rand.nextInt())
}

// data.sort()

let start = DispatchTime.now()

for _ in 0..<1_000 {
    // Only the following for-in loop changes
    for i in 0..<size {
        if data[i] <= 128 {
            sum += data[i]
        }
    }
}

let stop     = DispatchTime.now()
let nanoTime = stop.uptimeNanoseconds - start.uptimeNanoseconds
let elapsed  = Double(nanoTime) / 1_000_000_000

print("Elapsed Time: \(elapsed)")
print("Sum: \(sum)")
Louis Lac
  • 5,298
  • 1
  • 21
  • 36
  • Were you testing in a Swift playground or a compiled app? – rmaddy Jun 29 '18 at 21:19
  • I'm using a compiled app (commande line project). – Louis Lac Jun 29 '18 at 21:20
  • 2
    I suspect you're not compiling with optimizations. With `-O`, I'm seeing about a 10% cost at most, not 10x. Also you'd need to compare to `-Ounchecked` if you're comparing to C++. – Rob Napier Jun 29 '18 at 21:34
  • Unless you use `-0unchecked`, every basic arithmetic operation does a branch (if checks for overflow, and crashes rather than allowing overflown results to be used) – Alexander Jun 29 '18 at 21:39
  • Ok @RobNapier, I got the same result as you with `-O`. So to sum up when unoptimized the second loop method is far slower than using subscript. – Louis Lac Jun 29 '18 at 21:51
  • 4
    @LouisLac Performance tests are pointless unless you're making optimized builds. The default settings are there for developer convenience (fast compile times, debug symbols) not run-time performance. Iteration in a for loop involves multiple function calls (`Sequence.makeIterator(), IteratorProtocol.next()`), that would slow things down if they're not optimized out (which they are, in `-O`) – Alexander Jun 29 '18 at 21:55
  • Ok great I better understand how all this works together, thanks. – Louis Lac Jun 29 '18 at 22:01

1 Answers1

8

The overall performance output highly depends on the optimizations done by the compiler. If you compile your code with optimizations enabled, you will see the difference between both solutions is minimal.

To demonstrate this, I updated your code adding two methods, one with subscripting and the other using for-in.

import Foundation
import GameplayKit

let size = 32_768
var data = [Int]()
var sum  = 0
var rand = GKRandomDistribution(lowestValue: 0, highestValue: 255)

for _ in 0..<size {
    data.append(rand.nextInt())
}

// data.sort()

func withSubscript() {
  let start = DispatchTime.now()

  for _ in 0..<1_000 {
      for i in 0..<size {
          if data[i] <= 128 {
              sum += data[i]
          }
      }
  }

  let stop    = DispatchTime.now()
  let elapsed = Double(stop.uptimeNanoseconds - start.uptimeNanoseconds) / 1_000_000_000

  print("With subscript:")
  print("- Elapsed Time: \(elapsed)")
  print("- Sum: \(sum)")
}

func withForIn() {
  let start = DispatchTime.now()

  for _ in 0..<1_000 {
      for element in data {
          if element <= 128 {
              sum += element
          }
      }
  }

  let stop    = DispatchTime.now()
  let elapsed = Double(stop.uptimeNanoseconds - start.uptimeNanoseconds) / 1_000_000_000

  print("With for-in:")
  print("- Elapsed Time: \(elapsed)")
  print("- Sum: \(sum)")
}

withSubscript()
withForIn()

I saved that code into a file called array-subscripting.swift.

Then, from the command line, we can run it without any optimizations, like this:

$ swift array-subscripting.swift 
With subscript:
- Elapsed Time: 0.924554249
- Sum: 1057062000
With for-in:
- Elapsed Time: 5.796038213
- Sum: 2114124000

As you mentioned in the post, there is a big difference in performance.

This difference is pretty negligible when the code is compiled with optimizations:

$ swiftc array-subscripting.swift -O
$ ./array-subscripting 
With subscript:
- Elapsed Time: 0.110622556
- Sum: 1054578000
With for-in:
- Elapsed Time: 0.11670454
- Sum: 2109156000

As you can see, both solutions are way faster than before, and very similar on time execution.

Back to your original question, subscripting provides direct access to memory, which is pretty efficient in the case of contiguous arrays, where elements are stored next to each other in memory.

for-in loops, in the other hand, create an immutable copy of each element from the array, which incurs in a performance hit.

Eneko Alonso
  • 18,884
  • 9
  • 62
  • 84
  • 1
    ehm, you should set `sum` to `0` before running every method. It took me same time to understand why your two sums are not the same. – Sulthan Jun 29 '18 at 22:17
  • Haha, yeah. I doubt that would have any impact on the result, but you are correct, I missed that. – Eneko Alonso Jun 29 '18 at 22:32
  • Ok it wasn't clear for me that for-in loops create immutable copy while subscripting is just a way to access memory. That the answer I wanted, thanks! – Louis Lac Jun 30 '18 at 08:14
  • You an also do `for var` if you want to create a mutable copy. – Eneko Alonso Jul 03 '18 at 22:41