32

I am trying to get started with using Operations in a side project rather than having closure-based callbacks littered throughout my networking code to help eliminate nested calls. So I was doing some reading on the subject, and I came across this implementation:

open class AsynchronousOperation: Operation {

    // MARK: - Properties

    private let stateQueue = DispatchQueue(label: "asynchronous.operation.state", attributes: .concurrent)

    private var rawState = OperationState.ready

    private dynamic var state: OperationState {
        get {
            return stateQueue.sync(execute: {
                rawState
            })
        }
        set {
            willChangeValue(forKey: "state")
            stateQueue.sync(flags: .barrier, execute: {
                rawState = newValue
            })
            didChangeValue(forKey: "state")
        }
    }

    public final override var isReady: Bool {
        return state == .ready && super.isReady
    }

    public final override var isExecuting: Bool {
        return state == .executing
    }

    public final override var isFinished: Bool {
        return state == .finished
    }

    public final override var isAsynchronous: Bool {
        return true
    }


    // MARK: - NSObject

    private dynamic class func keyPathsForValuesAffectingIsReady() -> Set<String> {
        return ["state"]
    }

    private dynamic class func keyPathsForValuesAffectingIsExecuting() -> Set<String> {
        return ["state"]
    }

    private dynamic class func keyPathsForValuesAffectingIsFinished() -> Set<String> {
        return ["state"]
    }


    // MARK: - Foundation.Operation

    public final override func start() {
        super.start()

        if isCancelled {
            finish()
            return
        }

        state = .executing
        execute()
    }


    // MARK: - Public

    /// Subclasses must implement this to perform their work and they must not call `super`. The default implementation of this function throws an exception.
    open func execute() {
        fatalError("Subclasses must implement `execute`.")
    }

    /// Call this function after any work is done or after a call to `cancel()` to move the operation into a completed state.
    public final func finish() {
        state = .finished
    }
}

@objc private enum OperationState: Int {

    case ready

    case executing

    case finished
}

There are some implementation details of this Operation subclass that I would like some help in understanding.

  1. What is the purpose of the stateQueue property? I see it being used by get and set of the state computed property, but I can't find any documentation that explains the sync:flags:execute and sync:execute methods that they use.

  2. What is the purpose of the three class methods in the NSObject section that return ["state"]? I don't see them being used anywhere. I found, in NSObject, class func keyPathsForValuesAffectingValue(forKey key: String) -> Set<String>, but that doesn't seem to help me understand why these methods are declared.

Rob
  • 415,655
  • 72
  • 787
  • 1,044
Nick Kohrn
  • 5,779
  • 3
  • 29
  • 49
  • 7
    FYI, this has a fairly serious mistake. The [`start` documentation](https://developer.apple.com/documentation/foundation/operation/1416837-start) says (emphasis added): "If you are implementing a concurrent operation, you must override this method and use it to initiate your operation. _Your custom implementation must not call `super` at any time."_ – Rob Jan 04 '18 at 21:31
  • 1
    @Rob The keyword in the quoted paragraph is "concurrent". This is relevant only for operations that _do not run in an operation queue. See _isAsynchronous_ property and this paragraph: https://developer.apple.com/documentation/foundation/operation#1661231 – svena Apr 26 '18 at 07:34
  • @Rob None taken, but I invite you to take a closer look. I've been using operations for years. Even Apple's sample code in the famous Advanced Operations WWDC2015 talk overrides and calls super in start(). I do not say that sample code is perfect, but. I have the impression the paragraph that says you should not call super is what could be easily misunderstood. – svena Apr 26 '18 at 08:31
  • Furthermore, what Apple calls asynchronous, is an operation that spawns a new thread from start(). This is not the same as running operations in a background queue. – svena Apr 26 '18 at 09:43
  • 4
    Lol. I know what "asynchronous operation" means. Bottom line, the documentation couldn't be clearer about `start` implementations and they repeat the warning several times: "At no time in your `start()` method should you ever call `super`" and "Your custom implementation [of `start`] must not call `super` at any time." And I wouldn't hold out "Advanced NSOperations" as the exemplar because, as you allude, its problems are infamous and manifold. But, hey, if you want to call `super`, knock yourself out. But future readers should be forewarned, as it can cause problems if you're not careful. – Rob Apr 26 '18 at 15:38

3 Answers3

63

You said:

  1. What is the purpose of the stateQueue property? I see it being used by get and set of the state computed property, but I can't find any documentation that explains the sync:flags:execute and sync:execute methods that they use.

This code "synchronizes" access to a property to make it thread safe. Regarding why you need to do that, see the Operation documentation, which advises:

Multicore Considerations

... When you subclass NSOperation, you must make sure that any overridden methods remain safe to call from multiple threads. If you implement custom methods in your subclass, such as custom data accessors, you must also make sure those methods are thread-safe. Thus, access to any data variables in the operation must be synchronized to prevent potential data corruption. For more information about synchronization, see Threading Programming Guide.

Regarding the exact use of this concurrent queue for synchronization, this is known as the "reader-writer" pattern. This basic concept of reader-writer pattern is that reads can happen concurrent with respect to each other (hence sync, with no barrier), but writes must never be performed concurrently with respect to any other access of that property (hence async with barrier).

For example, you might implement a reader-writer for thread-safety on an array like so:

class ThreadSafeArray<T> {
    private var values: [T]
    private let queue = DispatchQueue(label: "...", attributes: .concurrent)
    
    init(_ values: [T]) {
        self.values = values
    }
    
    func reader<U>(block: () throws -> U) rethrows -> U {
        return try queue.sync {
            try block()
        }
    }
    
    func writer(block: @escaping (inout [T]) -> Void) {
        queue.async(flags: .barrier) {
            block(&self.values)
        }
    }
    
    // e.g. you might use `reader` and `writer` like the following:
    
    subscript(_ index: Int) -> T {
        get { reader { values[index] } }
        set { writer { $0[index] = newValue } }
    }
    
    func append(_ value: T) {
        writer { $0.append(value) }
    }
    
    func remove(at index: Int) {
        writer { $0.remove(at: index)}
    }
}

Obviously, the use of reader-writer in this Operation subclass is even simpler, but the above illustrates the pattern.

You also asked:

  1. What is the purpose of the three class methods in the NSObject section that return ["state"]? I don't see them being used anywhere. I found, in NSObject, class func keyPathsForValuesAffectingValue(forKey key: String) -> Set<String>, but that doesn't seem to help me understand why these methods are declared.

These are just methods that ensure that changes to the state property trigger KVO notifications for properties isReady, isExecuting and isFinished. The KVO notifications of these three keys is critical for the correct functioning of asynchronous operations. Anyway, this syntax is outlined in the Key-Value Observing Programming Guide: Registering Dependent Keys.

The keyPathsForValuesAffectingValue method you found is related. You can either register dependent keys using that method, or have the individual methods as shown in your original code snippet.


BTW, here is a revised version of the AsynchronousOperation class you provided, namely:

  1. You must not call super.start(). As the start documentation says (emphasis added):

    If you are implementing a concurrent operation, you must override this method and use it to initiate your operation. Your custom implementation must not call super at any time.

  2. Add @objc required in Swift 4.

  3. Renamed execute to use main, which is the convention for Operation subclasses.

  4. It is inappropriate to declare isReady as a final property. Any subclass should have the right to further refine its isReady logic (though we admittedly rarely do so).

  5. Use #keyPath to make code a little more safe/robust.

  6. You don't need to do manual KVO notifications when using dynamic property. The manual calling of willChangeValue and didChangeValue is not needed in this example.

  7. Change finish so that it only moves to .finished state if not already finished.

Thus:

public class AsynchronousOperation: Operation {
    
    /// State for this operation.
    
    @objc private enum OperationState: Int {
        case ready
        case executing
        case finished
    }
    
    /// Concurrent queue for synchronizing access to `state`.
    
    private let stateQueue = DispatchQueue(label: Bundle.main.bundleIdentifier! + ".rw.state", attributes: .concurrent)
    
    /// Private backing stored property for `state`.
    
    private var _state: OperationState = .ready
    
    /// The state of the operation
    
    @objc private dynamic var state: OperationState {
        get { return stateQueue.sync { _state } }
        set { stateQueue.async(flags: .barrier) { self._state = newValue } }
    }
    
    // MARK: - Various `Operation` properties
    
    open         override var isReady:        Bool { return state == .ready && super.isReady }
    public final override var isExecuting:    Bool { return state == .executing }
    public final override var isFinished:     Bool { return state == .finished }
    public final override var isAsynchronous: Bool { return true }

    // KVN for dependent properties
    
    open override class func keyPathsForValuesAffectingValue(forKey key: String) -> Set<String> {
        if ["isReady", "isFinished", "isExecuting"].contains(key) {
            return [#keyPath(state)]
        }
        
        return super.keyPathsForValuesAffectingValue(forKey: key)
    }
    
    // Start
    
    public final override func start() {
        if isCancelled {
            state = .finished
            return
        }
        
        state = .executing
        
        main()
    }
    
    /// Subclasses must implement this to perform their work and they must not call `super`. The default implementation of this function throws an exception.
    
    open override func main() {
        fatalError("Subclasses must implement `main`.")
    }
    
    /// Call this function to finish an operation that is currently executing
    
    public final func finish() {
        if !isFinished { state = .finished }
    }
}
Community
  • 1
  • 1
Rob
  • 415,655
  • 72
  • 787
  • 1,044
  • Nice answer, except based on the documentation, for asynchronous operations one should override `start` instead of `main` (`main` is for synchronous operations). IMHO It would help to avoid confusion if you can remove the `main` override – user1046037 Jan 06 '18 at 14:58
  • 3
    I disagree. The [Concurrency Programming Guide](https://developer.apple.com/library/content/documentation/General/Conceptual/ConcurrencyProgrammingGuide/OperationObjects/OperationObjects.html#//apple_ref/doc/uid/TP40008091-CH101-SW16) encourages use of both `start` and `main` for asynchronous operations. In their discussion about `main` in conjunction with asynchronous operations they say, "Although you could perform the task in the `start` method, implementing the task using [`main`] can result in a cleaner separation of your setup and task code." – Rob Jan 06 '18 at 18:01
  • You are correct, thanks I was doing it in `start` for `asynchronous` operations but this is cleaner. – user1046037 Jan 07 '18 at 01:13
  • @Rob Thanks for sharing the *revised version* of the asynchronous operation. But as the programming guide states also that `main` is *optional* for concurrent operations in practice it makes no difference to put the task code into `main` or (custom) `execute`. The general workflow is the same. – vadian Jan 07 '18 at 09:48
  • 1
    @vadian - Agreed. As I said in my answer, I renamed `execute` to `main` merely because it was a matter of convention; I didn't do it because I thought one _had_ to. It just struck me as really strange to introduce a completely different method name when it's so common (and explicitly contemplated by Apple) to use `main` for this purpose. My comment above was just in reaction to user1046037's assumption that `main` was _only_ for synchronous operations. I wonder if a similar misconception drove calebd's decision to use `execute` in the first place... – Rob Jan 07 '18 at 10:36
  • @Rob I'm using that class in a few projects and personally I like the concept of `execute` and `finish` as a differentiation to (implicitly called) synchronous `main` and to reflect `isExecuting / isFinishing` and the corresponding states. – vadian Jan 07 '18 at 10:49
  • @vadian - Fine. I think that's a little weird given that Apple explicitly discusses using `main` for this purpose in the [_Concurrency Programming Guide: Configuring Operations for Concurrent Execution_](https://developer.apple.com/library/content/documentation/General/Conceptual/ConcurrencyProgrammingGuide/OperationObjects/OperationObjects.html#//apple_ref/doc/uid/TP40008091-CH101-SW8) and in [the documentation for `main`](https://developer.apple.com/documentation/foundation/operation/1407732-main), but to each his own. But it doesn't really matter. – Rob Jan 07 '18 at 10:54
  • @Rob Could you have a look at another Operation problem I am facing in https://stackoverflow.com/questions/48137896/operation-went-isfinished-yes-without-being-started-by-the-queue-it-is-in – user1046037 Jan 07 '18 at 14:08
  • @Rob: What does "rw" stands for in the label name of the queue? Read/Write? – Vin Gazoil Oct 12 '18 at 22:49
  • 2
    @VinGazoil - It stands for “reader-writer”, a synchronization pattern in which you use a concurrent queue, allowing concurrent reads, but using barrier for writes to ensure writes are properly synchronized. – Rob Oct 13 '18 at 20:41
  • @Rob - Thank you for this precision. – Vin Gazoil Oct 13 '18 at 22:49
  • Amazing answer! Thank you. – Thomás Pereira May 17 '19 at 22:31
  • @Rob Thanks for answering my other question. I'm still a bit confused about the `start()` method here. I subclassed the `AsynchronousOperation` class and created an [operation](https://pastebin.com/TaWyXSj3) for uploading files to S3. Do I manually have to call the `start()` method to begin the operation? Or does the `OperationQueue` take care of that? – Isuru Apr 11 '20 at 05:11
  • 1
    @Isuru - Perhaps you can post to somewhere like gist.github.com, where I can post comments/corrections? In answer to your question, the operation queue will call `start` for you. But your `main` implementation isn’t quite right... – Rob Apr 11 '20 at 05:26
  • @Rob Do you think it's possible to perform the thread safety with the new concurrency API of Swift 5.5? – vadian Dec 23 '21 at 08:43
  • @vadian We'd just make it an `actor` and you're done. It's an extremely elegant solution and very simple. I'm already adopting actors and async-await in my apps and love the elegance of the code. (Earlier I was worried about how slow it was, but that looks like a broader performance difference between `concurrentPerform` and `async`-`await` task groups, not the `actor`.) – Rob Dec 23 '21 at 09:32
  • @Rob Thanks, with using `actor`s you mean to replace the entire `Operation` subclass with an `actor` and `OperationQueue` with a `TaskGroup` or an `AsyncSequence` pattern. – vadian Dec 23 '21 at 09:47
  • As it relates to this question, where doing `async`-`await` (which I’d prefer to operations in most use-cases), ensuring thread-safety with actors is definitely the way to go. In those rare cases where I might still use `Operation`, I still stick with traditional synchronization mechanisms. (I personally do not use reader-writer, though. Other approaches are more performant and just as elegant.) – Rob Dec 23 '21 at 10:02
8

When using an updated code snippet from Rob's answer, one should be aware of possibility of a bug, caused by this change:

  1. Change finish so that it only moves to .finished state if isExecuting.

The above goes against Apple docs:

In addition to simply exiting when an operation is cancelled, it is also important that you move a cancelled operation to the appropriate final state. Specifically, if you manage the values for the finished and executing properties yourself (perhaps because you are implementing a concurrent operation), you must update those properties accordingly. Specifically, you must change the value returned by finished to YES and the value returned by executing to NO. You must make these changes even if the operation was cancelled before it started executing.

This will cause a bug in a few cases. For example, if Operation Queue with "maxConcurrentOperationCount = 1" gets 3 async operations A B and C, then if all operations are cancelled during A, C will not get executed and the queue will be stuck on operation B.

Roman Kozak
  • 101
  • 1
  • 4
3

About your first question: stateQueue lock your operation when writing a new value to you operation state by:

    return stateQueue.sync(execute: {
            rawState
    })

And

    stateQueue.sync(flags: .barrier, execute: {
        rawState = newValue
    })

as your operation is asynchronous so before read or write one state another state can be called. Like you want to write isExecution but in the mean time isFinished already called. So to avoid this scenario stateQueue lock the operation state to be read and write until it finished its previous call. Its work like Atomic. Rather use dispatch queue you can use an extension to NSLock to simplify executing critical code from Advanced NSOperations sample code in WWDC 2015 https://developer.apple.com/videos/play/wwdc2015/226/ from https://developer.apple.com/sample-code/wwdc/2015/downloads/Advanced-NSOperations.zip and you can implement like following:

private let stateLock = NSLock()

private dynamic var state: OperationState {
    get {
        return stateLock.withCriticalScope{ rawState } 
    }
    set {
        willChangeValue(forKey: "state")

        stateLock.withCriticalScope { 
            rawState = newValue
        }
        didChangeValue(forKey: "state")
    }
}

About your second question: Its a KVO notification for the read only property isReady, isExecuting, isFinished to manage the operation state. You can read this: http://nshipster.com/key-value-observing post till the end for better understanding about KVO.

Evana
  • 1,754
  • 13
  • 12