7

When writing a Command Line Tool (CLT) in Swift, I want to process a lot of data. I've determined that my code is CPU bound and performance could benefit from using multiple cores. Thus I want to parallelize parts of the code. Say I want to achieve the following pseudo-code:

Fetch items from database
Divide items into X chunks
Process chunks in parallel
Wait for chunks to finish
Do some other processing (single-thread)
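As a minimal sketch of the "divide items into X chunks" step, assuming the fetched items are already in an array: the helper name `chunked(into:)` and the sample `items` are hypothetical and only stand in for the real data.

```swift
extension Array {
    /// Splits the array into at most `parts` contiguous chunks.
    /// `chunked(into:)` is a hypothetical helper, not a standard-library method.
    func chunked(into parts: Int) -> [[Element]] {
        guard parts > 0, !isEmpty else { return [] }
        // Ceiling division so that no more than `parts` chunks are produced.
        let size = (count + parts - 1) / parts
        return stride(from: 0, to: count, by: size).map { start in
            Array(self[start..<Swift.min(start + size, count)])
        }
    }
}

// Stand-in for the rows fetched from the database.
let items = Array(1...10)
let chunks = items.chunked(into: 4)
print(chunks)   // [[1, 2, 3], [4, 5, 6], [7, 8, 9], [10]]
```

The last chunk may be shorter than the others, and very small inputs can yield fewer than `parts` chunks.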

Now I've been using GCD, and a naive approach would look like this:

let group = dispatch_group_create()
let queue = dispatch_queue_create("", DISPATCH_QUEUE_CONCURRENT)
for chunk in chunks {
    dispatch_group_async(group, queue) {
        worker(chunk)
    }
}
dispatch_group_wait(group, DISPATCH_TIME_FOREVER)

However, GCD requires a run loop, so the code hangs because the group is never executed. The run loop can be started with dispatch_main(), but that never returns. It is also possible to run the NSRunLoop for just a few seconds, but that doesn't feel like a solid solution. Regardless of GCD, how can this be achieved in Swift?

Bouke
  • GCD does not require a run loop - but your code may submit blocks to the main thread, in which case you require either to call `dispatch_main` or use a run loop. – CouchDeveloper Feb 19 '15 at 19:00
  • @CouchDeveloper Without a run loop, the blocks submitted to the main thread would not run, right? A run loop is therefore required to run those; even `dispatch_main` creates a run loop under the hood. – Bouke Feb 19 '15 at 19:04
  • `dispatch_main` does not necessarily need to create a run loop, and I actually believe it does not. It is one way to execute blocks submitted to the main queue. And yes, it never returns, which probably does not make much sense in many applications. However, I believe that iff you do not dispatch blocks to the main thread, an application should run fine without `dispatch_main` and without a run loop (use dispatch groups to wait for completion). – CouchDeveloper Feb 19 '15 at 19:12
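To illustrate the distinction drawn in these comments, here is a hedged sketch in Swift 3+ syntax. The function `processAll` and its parameters are hypothetical names. It schedules the completion callback on a queue (the main queue by default) instead of blocking with a wait, which is the case where something has to service the main queue.

```swift
import Foundation

/// Runs `work` once per chunk on a concurrent queue, then calls `completion`
/// on `callbackQueue` after every block in the group has finished.
/// `processAll` is a hypothetical helper used only for this illustration.
func processAll(_ chunks: [[Int]],
                callbackQueue: DispatchQueue = .main,
                work: @escaping @Sendable ([Int]) -> Void,
                completion: @escaping @Sendable () -> Void) {
    let group = DispatchGroup()
    let queue = DispatchQueue(label: "example.workers", attributes: .concurrent)
    for chunk in chunks {
        queue.async(group: group) { work(chunk) }
    }
    // Unlike group.wait(), notify returns immediately and does not block the caller.
    group.notify(queue: callbackQueue) { completion() }
}

// Possible use in main.swift of a command-line tool:
//
//   processAll(chunks, work: { chunk in /* CPU-bound work */ }) {
//       exit(EXIT_SUCCESS)   // runs on the main queue
//   }
//   dispatchMain()           // services the main queue and never returns
```

Because `dispatchMain()` never returns, the process has to be ended explicitly, for example with `exit()` in the completion block. If nothing is dispatched to the main queue, blocking with `group.wait()` needs neither `dispatchMain()` nor a run loop.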

4 Answers

13

I mistook the blocked main thread for a hanging program. The work executes just fine without a run loop: the code in the question runs as intended, blocking the main thread until the whole group has finished.

So say chunks contains 4 items of workload, the following code spins up 4 concurrent workers, and then waits for all of the workers to finish:

let group = DispatchGroup()
let queue = DispatchQueue(label: "", attributes: .concurrent)

for chunk in chunks {
    // Each block is tracked by the group and runs on the concurrent queue.
    queue.async(group: group) {
        do_work(chunk)
    }
}

_ = group.wait(timeout: .distantFuture)
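
If the workers also need to hand results back for the single-threaded step, the writes from concurrent blocks must be synchronized. The following is only a sketch under assumptions: `doWork`, the `Results` class, and the sample `chunks` are hypothetical stand-ins, and a lock is just one of several ways to guard the shared storage.

```swift
import Foundation

// Hypothetical stand-in for the real CPU-bound work on one chunk.
func doWork(_ chunk: [Int]) -> Int {
    return chunk.reduce(0) { $0 + $1 * $1 }
}

// One result slot per chunk; a lock guards access from concurrent workers.
final class Results: @unchecked Sendable {
    private let lock = NSLock()
    private var values: [Int]

    init(count: Int) { values = Array(repeating: 0, count: count) }

    func set(_ value: Int, at index: Int) {
        lock.lock(); defer { lock.unlock() }
        values[index] = value
    }

    var all: [Int] {
        lock.lock(); defer { lock.unlock() }
        return values
    }
}

let chunks = [[1, 2, 3], [4, 5, 6], [7, 8, 9], [10]]
let group = DispatchGroup()
let queue = DispatchQueue(label: "example.workers", attributes: .concurrent)
let results = Results(count: chunks.count)

for (index, chunk) in chunks.enumerated() {
    queue.async(group: group) {
        results.set(doWork(chunk), at: index)
    }
}

group.wait()   // blocks the main thread until every chunk is done

// Single-threaded post-processing.
let total = results.all.reduce(0, +)
print(total)   // 385
```

Writing each result into its own slot keeps the output order independent of the order in which the workers finish.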
Tom Lokhorst
Bouke
6

Just like with an Objective-C CLI, you can make your own run loop using NSRunLoop.

Here's one possible implementation, modeled on this gist:

class MainProcess {
    var shouldExit = false

    func start () {
        // do your stuff here
        // set shouldExit to true when you're done
    }
}

println("Hello, World!")

var runLoop : NSRunLoop
var process : MainProcess

autoreleasepool {
    runLoop = NSRunLoop.currentRunLoop()
    process = MainProcess()

    process.start()

    while (!process.shouldExit && (runLoop.runMode(NSDefaultRunLoopMode, beforeDate: NSDate(timeIntervalSinceNow: 2)))) {
        // do nothing
    }
}

As Martin points out, you can use NSDate.distantFuture() as NSDate instead of NSDate(timeIntervalSinceNow: 2). (The cast is necessary because the distantFuture() method signature indicates it returns AnyObject.)

If you need to access CLI arguments see this answer. You can also return exit codes using exit().
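
As a rough sketch of both points in Swift 3+ syntax (where the arguments are available as `CommandLine.arguments`): the helper `chunkCount(from:)` and the usage string are hypothetical and only illustrate the pattern.

```swift
import Foundation

/// Hypothetical helper: reads a positive chunk count from the argument list
/// (program name first) and returns nil if it is missing or invalid.
func chunkCount(from arguments: [String]) -> Int? {
    guard arguments.count == 2,
          let value = Int(arguments[1]),
          value > 0 else { return nil }
    return value
}

// Possible use in main.swift:
//
//   guard let parts = chunkCount(from: CommandLine.arguments) else {
//       FileHandle.standardError.write(Data("usage: tool <chunks>\n".utf8))
//       exit(EXIT_FAILURE)    // non-zero exit code signals an error to the shell
//   }
//   // ... process the data in `parts` chunks ...
//   exit(EXIT_SUCCESS)
```

Keeping the parsing in a function that takes the argument array makes it easy to check without launching the tool.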

Aaron Brager
  • I can't help but think that this is very ugly / unreadable code. Why not manually juggle a few threads instead? – Bouke Feb 18 '15 at 18:37
  • You actually can use "distant future" instead of two seconds, compare http://stackoverflow.com/a/25126900/1187415. `runMode()` will always return if any dispatch sources were processed. – Martin R Feb 18 '15 at 18:37
  • @bouke You could use threads instead, but I don't want to rehash [threads vs. GCD](http://stackoverflow.com/a/13016973/1445366) here. – Aaron Brager Feb 18 '15 at 18:53
  • @AaronBrager I didn't mean to make this a threads vs GCD question. However, I'm looking for a nice way to achieve this use case. To establish a best practice for this use case, with a simple comprehensible code example. – Bouke Feb 18 '15 at 19:56
  • @bouke You could try your hand at [an object-oriented approach](https://github.com/trojanfoe/RunLoopController) but I think it's overkill. – Aaron Brager Feb 18 '15 at 22:11
  • @AaronBrager Thanks for your reply. Your answer helped me discover that the run loop is not required in the end. If locking the main thread is the goal, `dispatch_group_wait` is your friend. – Bouke Feb 19 '15 at 20:40
5

A minimal Swift 3 implementation of Aaron Brager's solution, which simply combines autoreleasepool and RunLoop.current.run(...) until you break the loop:

var shouldExit = false
doSomethingAsync() { _ in
    defer {
        shouldExit = true
    }
}
autoreleasepool {
    let runLoop = RunLoop.current
    while (!shouldExit && (runLoop.run(mode: .defaultRunLoopMode, before: Date.distantFuture))) {}
}
Cœur
2

I think CFRunLoop is much easier than NSRunLoop in this case:

func main() {
    /**** YOUR CODE START **/
    let group = dispatch_group_create()
    let queue = dispatch_queue_create("", DISPATCH_QUEUE_CONCURRENT)
    for chunk in chunks {
        dispatch_group_async(group, queue) {
            worker(chunk)
        }
    }
    dispatch_group_wait(group, DISPATCH_TIME_FOREVER)
    /**** END **/
}


let runloop = CFRunLoopGetCurrent()
CFRunLoopPerformBlock(runloop, kCFRunLoopDefaultMode) { () -> Void in
    dispatch_async(dispatch_queue_create("main", nil)) {
        main()
        CFRunLoopStop(runloop)
    }
}
CFRunLoopRun()
rintaro
  • Thanks for your answer; this code already looks better than using `NSRunLoop`. However, I just discovered that if blocking the main thread is fine (it is in my use case), `dispatch_group_wait` works just fine. – Bouke Feb 19 '15 at 20:41