This is a question about the latest Firebase Cloud Firestore. The documentation says this:

It also allows for expressive queries. Queries scale with the size of your result set, not the size of your data set, so you'll get the same performance fetching 1 result from a set of 100, or 100,000,000.

This statement is not clear to me. Can you explain a little more about this use case?

Doug Stevenson
Sampath

2 Answers

firebaser here

In most databases (including Firebase's own realtime database), query performance depends on a combination of the number of items you request and the size of the collection you request the items from.

So on most databases:

  1. If you request 10 items out of 1 million items, that will be faster than if you request 1000 items out of 1 million items.
  2. If you request 10 items out of 1 million items, that will be faster than if you request 10 items out of 100 million items.

The performance difference for #1 is expected; the data transfer alone makes it hard to overlook. Since #2 depends on server-side processing, developers sometimes forget about it. Many relational DBMSs optimize this very nicely, so the difference is often only logarithmic. But with a sufficiently large collection, even log(n) performance is going to be noticeable.
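To make rule #2 concrete, here is a toy Python model of a query that has no index to lean on and simply scans the collection until it has enough matches. This is only an illustration, not how any particular database is implemented; `scan_query`, `make_collection`, and the `city` field are hypothetical names made up for the sketch.

```python
def scan_query(docs, field, value, limit):
    """Unindexed query: examine documents in order until `limit` matches are found."""
    results = []
    examined = 0
    for doc in docs:
        examined += 1
        if doc.get(field) == value:
            results.append(doc)
            if len(results) == limit:
                break
    return results, examined


def make_collection(size, matches):
    # Hypothetical data: `matches` documents in "Lisbon", spread evenly
    # through a collection of `size` documents (size is a multiple of matches).
    step = size // matches
    return [
        {"id": i, "city": "Lisbon" if (i + 1) % step == 0 else "Other"}
        for i in range(size)
    ]


# Same query, same 10 results, but the work grows with the collection.
for size in (1_000, 100_000):
    results, examined = scan_query(make_collection(size, 10), "city", "Lisbon", 10)
    print(f"{size} docs: {len(results)} results, {examined} documents examined")
```

The result set is identical in both runs; only the number of documents the scan has to look at changes, and that is the cost rule #2 describes.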

Cloud Firestore scales horizontally, which means that rule #2 from above doesn't apply:

  • If you request 10 items out of 1 million items, it will take the same time as requesting 10 items out of 100 million items.

This is because of the way Firestore's query system is designed. While you may not be able to translate every query directly from a relational data model to the Firestore data model, if you can define your use cases in terms of a Firestore query, it is guaranteed to execute in a time proportional only to the number of results you request. (I'm paraphrasing Gil's comment here.)
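As a rough mental model of that guarantee, here is a small Python sketch of a query served entirely from an index. It is a toy under stated assumptions, not Firestore's actual implementation: `ToyIndex`, `build_docs`, and the `city` field are hypothetical, and the index is an in-memory dict that is built up front (standing in for an index that is maintained as documents are written).

```python
from collections import defaultdict


class ToyIndex:
    """Maps a field value to the ids of the documents that have that value."""

    def __init__(self, docs, field):
        # Built once, at "write time"; queries never touch `docs` again.
        self.entries = defaultdict(list)
        for doc in docs:
            self.entries[doc[field]].append(doc["id"])

    def query(self, value, limit):
        # Jump straight to the matching entries, then read at most `limit` of them.
        ids = self.entries.get(value, [])[:limit]
        return ids, len(ids)  # (result ids, index entries read)


def build_docs(size, step):
    # Hypothetical data: every `step`-th document is in "Lisbon".
    return [
        {"id": i, "city": "Lisbon" if i % step == 0 else "Other"}
        for i in range(size)
    ]


small = ToyIndex(build_docs(1_000, 100), "city")
large = ToyIndex(build_docs(100_000, 100), "city")

# Requesting 10 results reads 10 index entries, whatever the collection size.
print(small.query("Lisbon", 10)[1], large.query("Lisbon", 10)[1])
```

In this model the only thing that changes the amount of work per query is `limit`: asking for 50 results reads 50 entries, while growing the collection from 1,000 to 100,000 documents changes nothing on the read path.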

Frank van Puffelen
  • Another way of saying this is that Firestore queries are designed to avoid excess work on the server. Other database systems, especially SQL-based ones, allow much more complex queries but come with the possibility that a query will blow up and performance will tank. This statement is a kind of promise: if you can express the query, Firestore will only do work proportional to the size of the result set. – Gil Gilbert Oct 06 '17 at 15:42
  • Good one Gil. I added a paragraph to my answer to explain. – Frank van Puffelen Oct 06 '17 at 16:53

The wording may be a bit confusing. It is not a use case in the classic sense; it's just a statement about the performance of Firestore.

It basically says that it does not matter whether you request 1 item out of 100 or 1 item out of 100,000,000: it will be equally fast. Here 1 is your result set and 100 or 100,000,000 is your data set. As a consequence, requesting 1 item out of 100,000,000 will be faster than requesting 50 items out of 100.

I hope this makes it a bit clearer!

David
  • That part is OK. But what about this `Queries scale with the size of your result set, not the size of your data set`? – Sampath Oct 06 '17 at 12:44
  • That's what my last sentence should explain. Your queries get more expensive if you request more items (the result set), but not if there is a bigger data set available. As long as you always request the same number of items, the query will always be equally performant, no matter how big your data set is. – David Oct 06 '17 at 13:14