2

I'm learning Swift and I'm facing a problem in one of my model classes. What I'm trying to do is having a lazy-loaded property that can be "invalidated" when the data it is based on changes. (something like this: https://stackoverflow.com/a/25954243/2382892)

What I have now is something like this:

class DataSet {
    var entries: [Entry]

    var average: Double? {
        return self.entries.average
    }
}

Array<Entry>.average computes the average of a property of Entry, but returns nil if the array is empty.

Since this average could potentially be expensive to compute, I'd like to do that lazily and store a cached value of it and recalculate it only if necessary (when DataSet.entries is modified).

Now, following this answer, I should do something like this:

class DataSet {
    var entries: [Entry]
    var _average: Double?
    var average: Double? {
        if _average == nil {
            _average = self.entries.average
        }
        return _average
    }
}

However I have the problem of handling both the case that the average is to be recomputed, and when the array is empty and there's no meaningful average to return.

Since I know the average value will always be positive, I could use a default value (such as -1.0) to indicate the cache is no more valid, and nil to mean there are no entries, or vice-versa (nil means the average has to be computed again; -1.0 when there are no entries).

This however doesn't seem elegant, or the "Swift way" to achieve this behavior (or is it indeed?).

What should I do?

Community
  • 1
  • 1
Lorenzo Rossi
  • 613
  • 1
  • 8
  • 18
  • I'm afraid best practice questions aren't a good fit for Stack Overflow, as answers can only be based on opinion, not fact. – JAL Jun 30 '16 at 19:00
  • You can use a double optional – Alexander Jun 30 '16 at 19:06
  • By the way, in your example above, you declared `DataSet` computed property of `var average: Double { ... }`, But I assume that `entries.average` would return `nil` if there were no `Entry` objects. So I think that declaration should be `var average: Double? { ... }`. – Rob Jun 30 '16 at 20:02
  • From a practical point of view in almost all cases an average value 0.0 of an empty array is more suitable than `nil`. In the good old Objective-C days we got used to regard `nil` as *don't care* ;-) – vadian Jun 30 '16 at 20:04
  • 2
    @vadian - I disagree. `nil` means that there is no value and has the virtue that it can never be mistaken for a real calculated value. You're just using another magic value (and worse, one that is indistinguishable from a potentially valid value). Using `0` is no better than his initial `-1` approach. Besides, he explicitly tells us that "`Array.average` computes the average of a property of `Entry`, but returns `nil` if the array is empty" (which is the correct approach, IMHO). – Rob Jun 30 '16 at 20:22
  • @Rob When you're opening a bank account your starting balance is 0 not nil. For scalar types in most cases the absence of a value can be represented by zero - once again, from a practical rather than a programming semantics point of view ;-) – vadian Jun 30 '16 at 20:34
  • @vadian - Yeah, but if I walk into a Citibank and ask "hey, what's my balance on my credit card?", there's a big difference between "you have a balance of 0" and "you don't have a Citibank credit card". – Rob Jun 30 '16 at 20:43
  • @Rob Absolutely right, but there is an *Citibank credit card* represented by `entries`. I would understand the whole thing if `entries` could also be `nil`. – vadian Jun 30 '16 at 20:46
  • @Rob you are right, in the second example I forgot to make `average` optional. I edited my question. – Lorenzo Rossi Jul 01 '16 at 10:07
  • @vadian, In my case I need to distinguish the case in which the average is actually 0.0 (all entries are 0) from when there are no entries (or all entries have a weight equal to 0) as it should be displayed differently to the user. This is why I decided to use an optional. – Lorenzo Rossi Jul 01 '16 at 10:22

4 Answers4

2

You definitely shouldn't use magic values, like -1 to indicate some state. But I agree that you should not use nil to indicate both that "the cached value is invalidated and must be recalculated" as well as "the cached average has been calculated and is nil because there were zero Entry objects". The problem with the other solutions that suggest just setting the calculated value to nil is that it can't distinguish between the two states of "invalidated" and "calculated but nil" and may call entries.average even though you may have already done so. Admittedly, this is likely not computationally expensive, but it is conflating two very different states.

One Swift solution would be an enum:

class DataSet {
    private enum CachedValue {
        case Calculated(Double?)
        case Invalidated
    }

    var entries = [Entry]() {
        didSet {
            cachedAverage = .Invalidated
        }
    }

    private var cachedAverage: CachedValue = .Invalidated

    var average: Double? {
        switch cachedAverage {
        case .Calculated(let result):
            return result
        case .Invalidated:
            let result = entries.average
            cachedAverage = .Calculated(result)
            return result
        }
    }
}

This captures a distinct state between .Invalidated and .Calculated and recalculates lazily as needed.

Rob
  • 415,655
  • 72
  • 787
  • 1,044
  • 1
    All answers seem good, but @Rob's solution looks very simple and elegant, being also very similar to Swift Optional internal implementation (https://github.com/apple/swift/blob/master/stdlib/public/core/Optional.swift - http://stackoverflow.com/a/24549097/2382892) – Lorenzo Rossi Jul 01 '16 at 12:31
1

Use nil

First of all never use a specific value of your type domain to indicate the absence of value. In other words don't use a negative number to indicate no value. nil is the answer here.

So average should be declared as Double?.

Clearing the cache when entries does change

Next you need to clear your cache each time entries is mutated. You can use didSet for this.

class DataSet {
    private var entries: [Entry] = [] {
        didSet { cachedAverage = nil }
    }

    private var cachedAverage: Double?
    var average: Double? {
        if cachedAverage == nil {
            cachedAverage = self.entries.average
        }
        return cachedAverage
    }
}

When entries is empty

Finally if you believe that average for an empty array should be nil, then why don't you change accordingly the average computed property you added to SequenceType?

Luca Angeletti
  • 58,465
  • 13
  • 121
  • 148
  • 1
    You could also use `didSet` on the array to watch for mutations instead of making it private. – GriffeyDog Jun 30 '16 at 19:20
  • @GriffeyDog: You are right. I updated my answer, thanks! – Luca Angeletti Jun 30 '16 at 19:24
  • This looks the simplest and probably the fastest solution _in most cases_. However, as I said, `SequenceType.average` returns `nil` when the sequence is empty, and in that case `SequenceType.average` would be called every time `DataSet.average` is accessed. – Lorenzo Rossi Jul 01 '16 at 13:33
1

You can make use of didSet:

class DataSet {
    var entries: [Entry] {
        didSet {
            /// just mark the average as outdated, it will be
            /// recomputed when someone asks again for it
            averageOutOfDate = true
        }
    }

    /// tells us if we should recompute, by default true
    var averageOutOfDate = true
    /// cached value that avoid the expensive computation
    var cachedAverage: Double? = nil

    var average: Double? {
        if averageOutOfDate {
            cachedAverage = self.entries.average
            averageOutOfDate = false
        }
        return cachedAverage
    }
}

Basically whenever the entries property value changes, you mark the cached value as outdated, and you use this flag to know when to recompute it.

Cristik
  • 30,989
  • 25
  • 91
  • 127
1

appzYourLife's answer is a nice concrete solution. However, I suggest a differ approach all-together. Here's what I would do:

I would make a protocol that defines all the important bits that would need to be access externally.

I would then make 2 structs/classes conform to this protocol.

The first struct/class will be a caching layer.

The second struct/class will be the actual implementation, which will only ever be accessed by the caching layer.

The caching layer will have a private instance of the actual implementing struct/class, and will have a variable such as "isCacheValid", which will be set to false by all mutating operations that invalidate the underlying data (and by extension, computed values such as the average).

This design makes it so that the actual implementing struct/class is fairly simple, and completely agnostic to the caching.

The caching layer does all the caching duties, completely agnostic to details of how the cached values are computed (as their computation is just delegated to the implementing class.

Community
  • 1
  • 1
Alexander
  • 59,041
  • 12
  • 98
  • 151