3

I am dealing with very large arrays in Swift and came to the conclusion that those are extremely slow when adding elements to it.

I am observing those issues mainly when I use arrays in a Dictionary.

ex : var array = [String : [String]]

Therefore I decided to benchmark a very simple array test using playground, thinking the issue came from the array itself :

var arr = [Int]()

for i in 0..<1_000_000 {
    arr.append(i)
}

This code takes forever to complete. Now the same code in C# with a real List, takes not even a second.

IList list = new List<int>();

for (int i = 0; i < 1000000; i++) {
   list.Add(i);
}

I know that Arrays in Swift aren't lists like in other languages where you have the flexibility to pick an ArrayList, LinkedList. Swift, re-allocates every time you add a new element, and basically puts all the array in some newer bigger space.

How can we solve this ?

EDIT 1: Hamish pointed out that using a Xcode Playground environment is a terrible idea for performance tracking. He is right, Swift arrays are as fast as C# when not used in Playground.

EDIT 2: Performance issues with arrays are not because of the array itself, but only when using arrays inside a dictionary. See answers below.

Scaraux
  • 3,841
  • 4
  • 40
  • 80
  • 5
    Are you using a playground by any chance? They're horrible when it comes to performance, try in a command line tool (with optimisations enabled if you really want speed). Runs in 0.026 seconds for me. – Hamish Oct 14 '18 at 21:33
  • You are right, Playground is extremely slow. But in one of my projects, we used a Dictionary with Arrays as values, and our indexing process took 57 minutes compared to only one in c#. For the same code. – Scaraux Oct 14 '18 at 21:36
  • Could you show us that code? At a guess, I'd say you're running into the copy-on-write problem where the array in the dictionary is never uniquely referenced while you mutate it so it re-allocates on each mutation (see https://stackoverflow.com/q/44632251/2976878 for more info). Though that doesn't explain 57 minutes worth of slow down, it might be a contributing factor. – Hamish Oct 14 '18 at 21:40
  • 2
    "Swift, re-allocates every time you add a new element" – that's not usually true, `Array` employs an exponential growth strategy where the capacity doubles each time you exceed it, allowing for amortised O(1) appending. Although if the array is never uniquely referenced (as I describe above), then you will indeed get reallocations on each append. – Hamish Oct 14 '18 at 21:42
  • @Hamish yes, they have a specific way of allocating so it doesn't do it all the time, but pretty much. – Scaraux Oct 14 '18 at 21:42
  • 1
    I just tried your code on a Command Line Tool and here is the output before and after appending the values to the array: Time Before in seconds: 1539553031.420601. Time After in seconds: 1539553031.4601312. So it took less than 0.039 second. – Mo Abdul-Hameed Oct 14 '18 at 21:43
  • @Hamish the code is here : https://stackoverflow.com/questions/52637092/swift-enormous-dictionary-of-arrays-very-slow – Scaraux Oct 14 '18 at 21:46
  • 2
    These questions are completely different situations. The array issue is because of playgrounds (it's constantly trying to update the debug output as @Hamish notes). The dictionary-of-arrays issue is about copying. Appending to an array is a very fast action. – Rob Napier Oct 14 '18 at 21:53
  • @RobNapier you are right, it is slow because of playground. I told myself that an array within a dictionary would behave the same as if not in a Dictionary. What can be done for the Dictionary issue then ? – Scaraux Oct 14 '18 at 21:55
  • 2
    I need to make dinner, but I'll look at it later. That question is actually a duplicate of one answered earlier. There are many ways to make that fast. Swift Arrays are value types; that doesn't make them bad. It means they have a different trade-off. But there are lots of ways to build data structures that do what you want. – Rob Napier Oct 14 '18 at 22:00
  • @Scaraux As noted in the Q&A I linked to, one option is to use `Dictionary`'s `subscript(_:default:)` which in Swift 4.1 can directly mutate a value in storage. In your case that might look something like this: https://gist.github.com/hamishknight/8d03d89352cd3416e06c012394f341a9. IIRC, Rob has also answered a similar question, which might be a suitable dupe. – Hamish Oct 14 '18 at 22:08
  • Worth noting that this will all be much improved in Swift 5 with the unofficial introduction of generalised accessors using coroutines – many of the previously inefficient accessors (`Dictionary`'s `subscript(_:)` being the most notorioius) are [now using `_modify` accessors](https://github.com/apple/swift/blob/d904b46554ef22699ffba564502d507d420e20c9/stdlib/public/core/Dictionary.swift#L804) that can yield an arbitrary lvalue, allowing them to do things such as deinitialising the value in storage while it's mutated in order to allow for efficient in-place mutations for COW types. – Hamish Oct 14 '18 at 22:20
  • @Hamish your code did the trick. It is much much faster ! – Scaraux Oct 14 '18 at 22:21
  • BTW, this is the particular question I was thinking about https://stackoverflow.com/questions/41079687/dictionary-in-swift-with-mutable-array-as-value-is-performing-very-slow-how-to/41154903#41154903 – Rob Napier Oct 14 '18 at 22:22
  • This is incredibly interesting. I haven't faced any issue of this sort with any other language, so I would have never came up with this solution. Thank you for your time and patience. – Scaraux Oct 14 '18 at 22:25
  • @RobNapier Any sources/documentations I can read to get a better understanding of all this ? Apple docs are too broad about all those issues and I feel like we're playing with advanced concepts, and things that are specific with how Swift was designed. – Scaraux Oct 14 '18 at 22:29
  • @Hamish I would be happy if you posted your code as answer to my other post, it'll be very useful for others – Scaraux Oct 14 '18 at 22:29
  • @Scaraux Sure, I'd be happy to – but first I want to [update my answer here](https://stackoverflow.com/questions/44632251/swift-semantics-regarding-dictionary-access) to note the Swift 5 changes that'll fix the performance trap of `Dictionary`'s `subscript(_:)`. – Hamish Oct 14 '18 at 22:31

1 Answers1

0

Well this issue is mainly cause because of the .append function that it has to create the location and then fill it up,

you can make that slightly faster if you have an idea of the size of the array by giving it a size and that will allocate the space for it, instead of creating allocating and then filling it up trying this code gave me a slightly faster result.

var arr = [Int].init(repeating: 0, count: 1_000_000)

for i in 0..<1_000_000 {
    arr[i] = i
}

This was using a playground, however playground are extremely slow performance compared to a command-line tool or an actual project, but however this code is slightly faster on playground.

Mohmmad S
  • 5,001
  • 4
  • 18
  • 50