
I have a collection of product objects (title, desc, price, quant, urlString, etc.) in a Firestore collection. There are currently around 1,000 items, but that could grow to 10k. On launch, my iOS app sets up a collection listener (db.collection("products").rx.listen()) which then saves changes to a local Realm database.

.subscribe(onNext: { querySnapshot in
    querySnapshot.documentChanges.forEach { docChange in
        autoreleasepool {
            let realm = try! Realm(configuration: Realm.Configuration.defaultConfiguration)
            let newData = docChange.document.data()
            if let item = itemFactory.createItem(using: newData) {
                if (docChange.type == .added) {
                    //realm.add(item)
                }
                if (docChange.type == .modified) {
                    //realm.update(item)
                }
                if (docChange.type == .removed) {
                    //realm.delete(item)
                }
            }
        }
    }
}, onError: { error in
    print("Error fetching snapshots: \(error)")
}).disposed(by: disposeBag)

I've read the Firestore docs in detail, but I'm not 100% confident this approach is reliable or performant.

Question: When the app launches, will Firestore download all 10k documents each time before describing the changes? Or will it cache all 10k the very first time and then only download changes on subsequent launches? I'm confident that once a change event has fired, all subsequent events will only pick up changes to the Firestore database. What I'm concerned about is that on first subscribing to the listener at app launch, it downloads all 10k items (which would be costly).

EDIT 9 Jan 2019:

I ended up implementing @zavtra's elegant answer, with code roughly like this:

// Newest "updatedAt" value seen so far (0 on the very first launch)
var newestUpdatedAt = UserDefaults.standard.double(forKey: kUDItemUpdatedAt)
//...
db.collection(kProducts)
    .whereField(kUpdatedAt, isGreaterThan: newestUpdatedAt)
    .rx.listen()
//...

querySnapshot.documentChanges.forEach { docChange in
    autoreleasepool {
        let realm = try! Realm(configuration: Realm.Configuration.defaultConfiguration)
        let newData = docChange.document.data()
        if let item = itemFactory.createItem(using: newData) {
            // Track the newest timestamp so the next launch can start from it
            if item.updatedAt > newestUpdatedAt {
                newestUpdatedAt = item.updatedAt
            }
            if (docChange.type == .added) {
                //realm.add(item)
            }
            if (docChange.type == .modified) {
                //realm.update(item)
            }
            if (docChange.type == .removed) {
                //realm.delete(item)
            }
        }
    }
}
UserDefaults.standard.set(newestUpdatedAt, forKey: kUDItemUpdatedAt)
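
The per-batch logic above can be checked in isolation. The following is only a minimal sketch: the Item struct, the isDeleted soft-delete flag, and the dictionary standing in for Realm are assumptions made for illustration, not the real model or the Realm API.

```swift
import Foundation

// Hypothetical local model; the real one would be a Realm object.
struct Item {
    let id: String
    let updatedAt: Double   // epoch seconds
    let isDeleted: Bool     // assumed soft-delete flag
}

// Applies one batch of changed items to a dictionary standing in for Realm
// and returns the advanced high-water mark.
func apply(_ batch: [Item], to store: inout [String: Item], cursor: Double) -> Double {
    var newest = cursor
    for item in batch {
        if item.isDeleted {
            store.removeValue(forKey: item.id)   // soft delete: drop the local copy
        } else {
            store[item.id] = item                // upsert covers both added and modified
        }
        newest = max(newest, item.updatedAt)
    }
    return newest
}

var store: [String: Item] = [:]
var cursor = 0.0
cursor = apply([Item(id: "a", updatedAt: 10, isDeleted: false),
                Item(id: "b", updatedAt: 20, isDeleted: false)],
               to: &store, cursor: cursor)
cursor = apply([Item(id: "a", updatedAt: 30, isDeleted: true)],
               to: &store, cursor: cursor)
// store now holds only "b"; cursor is 30
```

Treating added and modified as the same upsert keeps the handler idempotent, which helps if the same document is delivered twice across launches.
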
elprl

2 Answers


According to the docs:

docChanges returns an array of the document changes since the last snapshot. If this is the first snapshot, all documents will be in the list as "added" changes.

Every time you restart the app, this "first snapshot" behavior is triggered again. To get around it, you would have to:

  1. Retrieve the most recent document saved locally, along with its timestamp.
  2. Build a query that starts at that timestamp, i.e., one that only matches documents whose timestamp is at least the most recently saved timestamp.
  3. Subscribe to changes on that query on app entry.

To do this, you will have to add a timestamp field to each document and make sure the field is indexed, so that the query reads only the matching documents instead of the entire collection. Firestore indexes single fields automatically, so a simple range query on the timestamp should not need a manually created index.
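
To make the effect of the three steps concrete, here is a small simulation. It does not call Firestore: the Product struct and the documentsNewer helper are hypothetical stand-ins, with the helper playing the role of a server-side query such as whereField("updatedAt", isGreaterThan: cursor).

```swift
import Foundation

// Hypothetical stand-in for a Firestore product document.
struct Product {
    let id: String
    var updatedAt: Double   // epoch seconds, assumed to be set on every write
}

// Simulates the server side of a range query on "updatedAt":
// only documents newer than the cursor match and would be downloaded.
func documentsNewer(than cursor: Double, in remote: [Product]) -> [Product] {
    return remote.filter { $0.updatedAt > cursor }
}

// Step 1: the newest timestamp already stored locally (0 when nothing is stored).
func newestTimestamp(in local: [Product]) -> Double {
    return local.map { $0.updatedAt }.max() ?? 0
}

var remote = (1...5).map { Product(id: "p\($0)", updatedAt: Double($0) * 100) }
var local: [Product] = []

// First launch: the cursor is 0, so every document matches.
let firstSync = documentsNewer(than: newestTimestamp(in: local), in: remote)
local = firstSync

// One document changes on the server between launches.
remote[1].updatedAt = 900

// Second launch: only the changed document matches the query.
let secondSync = documentsNewer(than: newestTimestamp(in: local), in: remote)
```

Here firstSync contains all five documents, while secondSync contains only "p2". The saving depends on every write actually updating the timestamp field.
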

Gabriel Garrett
  • Can I assume you are referring to an Index Exemption on that timestamp field? – elprl Jan 08 '19 at 17:51
  • No, I mean creating an index. https://firebase.google.com/docs/firestore/query-data/indexing While Firestore is supposed to manage indexing automatically, I've found myself having to manually add a few indexOn rules. You could skip the process and see what the results are. If an index needs to be created, Firestore will log a message in your iOS app with a link you can follow to create it. – Gabriel Garrett Jan 08 '19 at 17:57
  • How do you know when an old document is deleted when the new listener is looking for changes from a certain date? Unless you use soft deletes. – elprl Jan 08 '19 at 18:30
  • You would have to add a field on the collection that acts as a flag for whether documents have been deleted since the last access, and a second field with an array of document IDs to delete locally. Then you could update the flag once those documents have been deleted locally. – Gabriel Garrett Jan 08 '19 at 18:55
  • Yes, I'm thinking a soft delete (isDeleted field) with perhaps logic to remove those documents with a cloud function scheduled every month. Then any app that hasn't been opened after a month just resyncs from scratch. – elprl Jan 08 '19 at 19:08
  • Thanks again zavtra for your answer. I've updated my question with what I implemented. I used an epoch number "updatedAt" field and an "isDeleted" boolean on my firestore objects. – elprl Jan 09 '19 at 12:43
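
The "resync from scratch after a month" idea from the comments could be decided on launch roughly as follows. This is only a sketch: SyncPlan, planSync, and the 30-day purge interval are hypothetical names and values, not part of Firestore or of the asker's code.

```swift
import Foundation

// Hypothetical outcome of the launch-time decision.
enum SyncPlan: Equatable {
    case incremental(since: Double)
    case fullResync
}

// purgeInterval: how long soft-deleted documents are assumed to stay on the
// server before a scheduled job removes them for good.
func planSync(lastSync: Double?, now: Double, purgeInterval: Double) -> SyncPlan {
    // No previous sync, or one older than the purge window: deletions may
    // have been missed, so the local store cannot be trusted.
    guard let lastSync = lastSync, now - lastSync < purgeInterval else {
        return .fullResync
    }
    return .incremental(since: lastSync)
}

let day = 86_400.0
let now = Date().timeIntervalSince1970
let recent = planSync(lastSync: now - 2 * day, now: now, purgeInterval: 30 * day)
let stale = planSync(lastSync: now - 45 * day, now: now, purgeInterval: 30 * day)
let never = planSync(lastSync: nil, now: now, purgeInterval: 30 * day)
```

With these inputs, recent is an incremental sync, while stale and never both fall back to a full resync. A full resync would also need to clear the local database first, since it cannot tell which local items were purged on the server.
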

When the app launches, will Firestore download all 10k documents each time, before describing the changes?

When you listen for realtime changes in Cloud Firestore using a Query's addSnapshotListener() method, the docs say it:

Starts listening to this query.

This basically means that the first time you attach the listener, you get all documents that match that particular query.

Or will it cache all 10k the very first time then only download changes on subsequent launches.

Because Firestore has offline persistence enabled by default, the results of a query are cached on the user's device once you perform it. Furthermore, every time a property within a document changes, you are notified of that change. Obviously, this happens only while the listener remains active and has not been removed. So if nothing in the database has changed, you get the entire data set from the cache.

As @zavtra also mentioned in his answer, you can add a Date property to each object in your collection and query the database from the client on this new property, for all documents that have changed since a previous time.

I also recommend reading Doug Stevenson's answer in this post for a better understanding.

Alex Mamo
  • Hi elprl! Is everything all right? Can I help you with any other information? – Alex Mamo Jan 09 '19 at 07:07
  • I implemented @zavtra's answer and it seems to be working great so far. I didn't have to create a Firestore rule. I'll update my question with some code. – elprl Jan 09 '19 at 12:25
  • Thanks for your answer Alex. I read those posts with interest. – elprl Jan 09 '19 at 12:44