
I am currently dealing with pagination with addSnapshotListener in Firebase's Firestore, and it appears there's no easy way to combine snapshot listeners with pagination.

Original premise: I started the implementation with addSnapshotListener as follows:

    db.collectionGroup("images")
        .order(by: "createdTime")
        .whereField("featured", isEqualTo: true)
        .addSnapshotListener { querySnapshot, _ in
            if let querySnapshot = querySnapshot {
                vm.featuredImages = querySnapshot.documents.compactMap { document in
                    do {
                        let image = try document.data(as: ImageModel.self)
                        print("DEBUG FEATURED IMAGE: \(String(describing: image))")
                        return image
                    } catch {
                        print("DEBUG FEATURED IMAGE ERROR \(String(describing: error))")
                    }
                    return nil
                }
            }
        }

And all goes well. The data is fetched into the ViewModel, and any new changes are pushed automatically by Firestore's library, so the local model gets updated.

Now add pagination: I've read the official documentation as well as all the Stack Overflow threads. It appears there is no easy way to maintain an addSnapshotListener across pages.

(A) One naive approach when a new page is requested would be to

  1. Keep track of the old listener and unregister it
  2. Register a new snapshotListener now with a new page (10 -> 20 elements)
  3. Repopulate the local model

This seems to work OK on the surface, but the one big problem is that you re-fetch the first 10 when you request page 2. The total number of reads grows quadratically as you add pages!
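To put numbers on the cost of approach (A), here is a minimal sketch. It assumes the worst case, where every re-registered listener re-reads its whole window and nothing is served from the local cache; the function names are hypothetical and no Firestore call is involved.

```swift
/// Total documents read after loading `pages` pages with approach (A),
/// assuming each new listener re-reads its entire window (no cache hits).
func readsWithGrowingListener(pages: Int, pageSize: Int) -> Int {
    guard pages > 0 else { return 0 }
    // Listener k covers k * pageSize documents: 10, 20, 30, ...
    return (1...pages).reduce(0) { total, page in total + page * pageSize }
}

/// Total documents read if each page were only ever fetched once.
func readsWithoutRefetch(pages: Int, pageSize: Int) -> Int {
    return max(pages, 0) * pageSize
}

for pages in [1, 2, 5, 10] {
    let growing = readsWithGrowingListener(pages: pages, pageSize: 10)
    let ideal = readsWithoutRefetch(pages: pages, pageSize: 10)
    print("pages: \(pages), approach A: \(growing) reads, no re-fetch: \(ideal) reads")
}
```

Under that assumption, 10 pages of 10 cost 550 reads instead of 100, so the total grows with the square of the page count.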

(B) Another solution, mentioned on Firebase's official YouTube channel, is

  1. Keep the old listener, and add a new listener for each new page
  2. On first fetch, it's easy: you just dump the new data into the old local model
  3. But when things update, it's a lot of manual work. You would have to either diff the new data against the old local model or somehow coordinate all the listeners and merge their results into a new model.
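One way to avoid hand-written diffing in approach (B) is to keep the latest result of each page listener separately and rebuild the visible list whenever any of them fires. The sketch below shows only that merging step. `ImageItem` and `PagedModel` are hypothetical names, and the snapshot callbacks that would call `replace(page:with:)` are assumed rather than shown.

```swift
import Foundation

struct ImageItem: Equatable {
    let id: String
    let createdTime: Date
}

/// Holds the latest contents of each page listener and rebuilds one list.
struct PagedModel {
    var pages: [Int: [ImageItem]] = [:]

    /// Call from the snapshot callback of the listener that owns `page`.
    mutating func replace(page: Int, with items: [ImageItem]) {
        pages[page] = items
    }

    /// All pages merged, de-duplicated by id and ordered by createdTime.
    var merged: [ImageItem] {
        var seen = Set<String>()
        var all: [ImageItem] = []
        for page in pages.keys.sorted() {
            for item in pages[page] ?? [] {
                // A document can show up in two pages when boundaries shift.
                if seen.insert(item.id).inserted {
                    all.append(item)
                }
            }
        }
        return all.sorted { $0.createdTime < $1.createdTime }
    }
}

let start = Date(timeIntervalSince1970: 0)
var model = PagedModel()
model.replace(page: 0, with: [
    ImageItem(id: "a", createdTime: start),
    ImageItem(id: "b", createdTime: start.addingTimeInterval(1)),
])
model.replace(page: 1, with: [
    ImageItem(id: "b", createdTime: start.addingTimeInterval(1)),
    ImageItem(id: "c", createdTime: start.addingTimeInterval(2)),
])
print(model.merged.map(\.id))
```

This trades some CPU for simplicity: the whole list is rebuilt on every callback. It does not solve the harder problem of page boundaries drifting when documents are inserted or deleted.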

I imagine querySnapshot is the standard way of keeping data in sync with apps. I imagine every app is going to need pagination. What is the correct solution?

jnpdx
erotsppa
  • (1/2) How simple or complex snapshot-listener pagination is depends on the architecture of the data. The inherent problem with snapshot-listener pagination is that the listener only listens for changes to the first page of results, which means that if the user has loaded any additional pages and there is a change to a document in those pages, the UI will never update. The workaround here is using a hidden trigger document, which may or may not be very straightforward given the physical read/write limitations of Firestore. This also depends on your data architecture. – trndjc Mar 13 '23 at 17:24
  • (2/2) However, if any change to this collection of documents will always change the first page then things are much simpler. In your use case, when there are any changes to the collection, will there always be a change to the first page or not? This is the starting point. – trndjc Mar 13 '23 at 17:24
  • @trndjc no, it doesn't necessarily change. For example: a listener for the first 20, then pages of 20 follow. Now someone deletes the 25th element; the first 20 don't change. So imagine a chat page. A moderator could delete an older message. How is this use case not supported? Firestore is a 10 year old product. – erotsppa Mar 13 '23 at 20:39
  • Then consider using a trigger document that is always on the first page (that is obviously not rendered in the UI). And whenever an operation is taken on the collection that doesn't guarantee a change of the first page, modify this trigger document to invoke the snapshot listener. – trndjc Mar 13 '23 at 20:53
  • I'm not sure how that helps. Can you post an answer? Requirements: (1) Only 20 are fetched on page load. (2) Every page thereafter is fetched in increments of 20. (3) Every single rendered element should stay in sync with the server at all times (that includes addition/deletion/modification). – erotsppa Mar 13 '23 at 21:00
  • How are the documents sorted? And are they always sorted in the same way? – trndjc Mar 13 '23 at 21:01
  • Yes, sorted always by createdTime, just like a chat screen. @trndjc – erotsppa Mar 13 '23 at 21:49
  • Then consider creating a document in this collection with a timestamp so far in the future that it will always be the first result and thus always in the scope of the snapshot listener. And so when the collection is modified in a way that doesn't guarantee a snapshot change, such as deleting a document, perform that delete atomically in a batch write alongside a write that modifies this trigger document. You don't need to modify this document when new chats appear since those will always be in the scope of the listener but for operations like delete, you would need to. This is one approach. – trndjc Mar 13 '23 at 22:44
  • I don't know what this trigger document has to do with anything. When you fetch the new 20, they won't be synced. It's really as simple as that. – erotsppa Mar 13 '23 at 22:50
  • You aren't thinking about this the right way. You have to understand what triggers the snapshot listener. The listener is always triggered when a new message enters the chat, no matter what page it's on and no matter how many pages are rendered in the UI. Therefore, you only need to account for document deletions. And if you modify the trigger document on the first page atomically with the delete, you can delete any document on any page and it will invoke the listener. – trndjc Mar 13 '23 at 22:59
  • Let us [continue this discussion in chat](https://chat.stackoverflow.com/rooms/252496/discussion-between-erotsppa-and-trndjc). – erotsppa Mar 13 '23 at 23:14

2 Answers

1

A. The first part is handling new documents and displaying documents for the first time during pagination:

  1. Use a snapshot listener to pull the latest documents, say the last 10.
  2. When you detect that the user has seen all the docs you have already loaded, load the next set of documents (say the next 20) with a new get() query, separate from the listener.

The difficult part here is handling the case where your app goes offline:

  1. When back online, you have to detect whether more documents have been added than your listener covers.
  2. If not, just add the new ones to the start of the list.
  3. If so, you have to perform a new query to get the missing ones (and paginate it if, say, that is 1000 docs). So you can end up with many "holes" in your local list of documents from the periods when your app was offline.
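The reconnect check above can be sketched as a pure function over document ids. This is an illustration under assumptions, not the answer author's code: `Reconnect` and `reconcile` are hypothetical names, and the listener window is assumed to arrive newest first.

```swift
enum Reconnect: Equatable {
    /// New ids that can be added straight to the start of the local list.
    case append([String])
    /// The window no longer overlaps the local list; a backfill query is needed.
    case gap([String])
}

/// `window` is the listener's current result, newest first.
/// `knownNewestID` is the newest document id held locally before going offline.
func reconcile(window: [String], knownNewestID: String?) -> Reconnect {
    guard let known = knownNewestID else { return .append(window) }
    if let index = window.firstIndex(of: known) {
        // Everything in front of the known document arrived while offline.
        return .append(Array(window[..<index]))
    }
    return .gap(window)
}

print(reconcile(window: ["e", "d", "c"], knownNewestID: "d"))
print(reconcile(window: ["z", "y", "x"], knownNewestID: "d"))
```

One caveat of this simplification: if the previously newest document was itself removed while the app was offline, the function reports a gap even when nothing is missing, so the backfill query has to tolerate an empty result.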

B. Secondly, you have to handle updates to documents that are not covered by the snapshot listener (all the ones after the first 10).

  1. In my app, I make a get query to the backend every time such a document is scrolled to, checking whether it has been updated since the last time it was loaded. I have optimized this get query as follows:
  • whenever a document is updated (including deletions), a field time_updated is set

  • when a document is fetched, my app stores the document's time_updated locally as last_pulled_updated

  • when a user scrolls back to a document, my app fetches it as well as the next 9 with the following query:

    .order(by: "time_updated", descending: true)
    .whereField("time_updated", isGreaterThan: min_last_pulled_updated) // Minimum of last_pulled_updated over the 10 docs
    .whereField("featured", isEqualTo: true)
    .whereField("docid", in: [docid1,docid2,docid3,docid4,docid5,docid6,docid7,docid8,docid9,docid10])
    
  • this costs at most 10 reads and may cost only 1 read if one or no document is returned

  2. If the user views a given document for some time, I create a snapshot listener for it (and for the previous 4 and next 5, as in 1, for optimisation purposes) so that likes and comments on it update live (it is a social media app)
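The bookkeeping behind B1 can be sketched without the Firestore SDK: group the locally cached documents into batches of 10 and compute, for each batch, the ids and the minimum last_pulled_updated that the query in B1 expects. `CachedDoc`, `RefreshBatch` and `refreshBatches` are hypothetical names, and the default batch size of 10 simply mirrors the query in this answer.

```swift
import Foundation

struct CachedDoc {
    let id: String
    let lastPulledUpdated: Date
}

struct RefreshBatch {
    let ids: [String]
    let minLastPulledUpdated: Date
}

/// Splits cached documents into batches and derives the query inputs for each.
func refreshBatches(for docs: [CachedDoc], batchSize: Int = 10) -> [RefreshBatch] {
    precondition(batchSize > 0, "batchSize must be positive")
    return stride(from: 0, to: docs.count, by: batchSize).compactMap { start -> RefreshBatch? in
        let slice = docs[start..<min(start + batchSize, docs.count)]
        // Oldest pull time in the batch: anything updated after it may be stale locally.
        guard let oldest = slice.map(\.lastPulledUpdated).min() else { return nil }
        return RefreshBatch(ids: slice.map(\.id), minLastPulledUpdated: oldest)
    }
}

let cached = (0..<12).map { index in
    CachedDoc(id: "doc\(index)", lastPulledUpdated: Date(timeIntervalSince1970: Double(100 - index)))
}
for batch in refreshBatches(for: cached) {
    print(batch.ids.count, batch.minLastPulledUpdated.timeIntervalSince1970)
}
```

Each `RefreshBatch` would feed one query: `ids` goes into the `in` filter and `minLastPulledUpdated` into the `isGreaterThan` filter.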

C. Deletions. By deletion I mean rendered inaccessible to the end user, not deleted from Firestore. So a "deleted" document could simply have a field deleted set to true, and the listener and the pagination query would both contain .whereField("deleted", isEqualTo: false). You can use TTL to have it actually deleted later on.

  1. Beyond the first 10, this is taken care of by B1.
  2. Within the listener you can deduce when a doc has been deleted: if the 4th doc disappears, it can only be because it was deleted. The 10th doc is a special case: when a new doc is created and the 10th doc is deleted at the same time, you cannot tell from the listener alone whether it was deleted or simply pushed out of the listener's window. For that purpose, in my app, every 10 new documents I run an additional query like the one in B1 for the 11th to 20th documents (this case is very rare in my app).
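The deduction in C2 can be sketched as a comparison of two consecutive listener windows. This is one possible reading of the rule, not the answer author's code: `Disappearance` and `classifyMissing` are hypothetical names, windows are lists of ids with the newest first, and a document that vanished is treated as ambiguous only if enough new documents arrived at the front to have pushed it out.

```swift
enum Disappearance: Equatable {
    case deleted    // vanished from the middle of the window
    case ambiguous  // may have been deleted or pushed out by new arrivals
}

/// Compares two consecutive listener windows (ids, newest first).
func classifyMissing(previous: [String], current: [String]) -> [String: Disappearance] {
    let previousIDs = Set(previous)
    let currentIDs = Set(current)
    // New documents at the front can each push one old document off the tail.
    let arrivals = current.prefix(while: { !previousIDs.contains($0) }).count
    var result: [String: Disappearance] = [:]
    for (index, id) in previous.enumerated() {
        if currentIDs.contains(id) { continue }
        result[id] = index >= previous.count - arrivals ? .ambiguous : .deleted
    }
    return result
}

// "c" vanished from the middle and "e" slid in at the tail: a deletion.
print(classifyMissing(previous: ["a", "b", "c", "d"], current: ["a", "b", "d", "e"]))
// "n" arrived at the front and "d" vanished from the tail: cannot tell.
print(classifyMissing(previous: ["a", "b", "c", "d"], current: ["n", "a", "b", "c"]))
```

Documents classified as ambiguous are the ones that would need the follow-up query described in C2.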
l1b3rty
  • Not only the problem you mentioned, you also don't have sync for the next 20 you fetched. – erotsppa Mar 12 '23 at 17:40
  • Do you mean if some document in the next 20 has been modified? – l1b3rty Mar 14 '23 at 14:38
  • I have added what I am doing in my app for updated documents – l1b3rty Mar 14 '23 at 14:49
  • How would you know if a document outside your first 10 is even updated? Or are you saying you call the get whenever it is scrolled to regardless? That’s a horrible design and costs a ton. – erotsppa Mar 14 '23 at 19:50
  • It costs you at most 1 read every time you see a document outside of the snapshot – l1b3rty Mar 14 '23 at 20:10
  • How? If you fetched 20 you have to fetch 20 again to know what’s changed – erotsppa Mar 14 '23 at 20:42
  • I added the details in B1 of my answer – l1b3rty Mar 15 '23 at 13:17
  • OK, that part makes sense, but it won't do deletion beyond the first 10. So it's not ideal. Also, it's still not a solution that would even keep the first 10 in sync. See my explanation at the end of the chat here https://chat.stackoverflow.com/rooms/252496/discussion-between-erotsppa-and-trndjc So basically you have to do a TON of work for a half-complete solution. Why is Firestore like this? Who uses it, and why can't it do the most basic use case of keeping a front-end list in sync? There are JS libraries out there that do this out of the box! Firestore is a 10 year old product! – erotsppa Mar 15 '23 at 14:00
  • `time_updated` is set for deletions too, I missed that. I have added a section for deletions – l1b3rty Mar 15 '23 at 14:14
  • It cannot be deduced if the doc deleted is the last doc of the 10 – erotsppa Mar 15 '23 at 14:48
  • How does B1 take care of deletion? If the doc is deleted, it wont be returned by the query and you wont know if it's because its deleted or if it just didnt update – erotsppa Mar 15 '23 at 14:51
  • Ok, sorry for the misunderstanding. I added some more details on what I mean by "deleted" – l1b3rty Mar 15 '23 at 19:51
  • And I added more details for doc number 10. Had to go back to the code of my app to see how this was done. And this is getting complex indeed... – l1b3rty Mar 15 '23 at 20:03
  • Wow so you do soft delete. I mean at this point, is this really the solution for this kind of thing? The amount of work is insane, it would be faster to write a javascript JIT compiler and use an existing JS library to sync than to use firestore. – erotsppa Mar 15 '23 at 20:38
  • It may not be the right solution. And this is without even mentioning Firestore's query limitations! But I choose to stick with it to avoid a re-write and as the rest of my app fits better with Firestore. – l1b3rty Mar 16 '23 at 08:03
-1

I have always done this with solution (A) that you mentioned.

This seems to work OK on the surface, but the one big problem is that you re-fetch the first 10 when you request page 2. The total number of reads grows quadratically as you add pages!

This only happens when you use .get. If you attach a listener instead, you only get charged for the documents that haven't been fetched yet. See this question,

flutroid