I am trying to figure out how to reduce reads on a relatively static Firestore collection.

The basics of my data structure:

  • users/{userId}
  • organisations/{organisationId}/employees/{employeeId}

Each user will belong to one organisation. Users and employees are not referentially linked, but as more users join an organisation, the number of employee documents can be assumed to be roughly equal to the number of users in that organisation.

The collection of employees will not change often, but on exceptional days it may receive hundreds of writes.

When a user opens the app, we will fetch the collection of employees associated with their organisation.

The problem I am facing is that as the number of users (and hence employees) grows, the number of reads grows as N². This is obviously problematic, as an organisation with just 1,000 users will incur 1,000,000 reads if each user opens the app only once. Users open the app about a dozen times a day, so the actual number can vary a fair bit.
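To make the arithmetic concrete, here is a minimal sketch of the read count. It assumes every app open fetches the whole employees collection and that the number of employee documents equals the number of users; the function name `dailyReads` is made up for illustration.

```typescript
// Assumption: each app open reads every employee document once,
// and employees ≈ users, as described in the question.
function dailyReads(users: number, opensPerUserPerDay: number): number {
  const employees = users; // assumed roughly equal
  return users * opensPerUserPerDay * employees;
}

console.log(dailyReads(1000, 1));  // 1000000
console.log(dailyReads(1000, 12)); // 12000000
```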

My initial thinking was that I could fetch the collection of employees in a function and leverage caching on the CDN. This is problematic, though, as I need to share cached results between a large number of users, and there doesn't seem to be a particularly secure way to do this without risking leaking employee collections. I need to vary the result by a user's organisationId while also verifying their auth token.

I have considered caching the results on the client, but this only cuts the reads down to 1 × N², since each client still needs to fetch the collection at least once.

Other options include Redis, or using something like Algolia to search through the results as needed. But both of these solutions seem to get expensive quite quickly.

Thanks in advance.

Shamsun
    You've already discussed the two options you have available: a CDN (or some other shared cache) and a client-only cache. There aren't really any other options besides simply reducing the number of documents to read. If you can't reduce the total number of documents, then all you can do is offload the queries/reads to other software components, as you've discussed. – Doug Stevenson Sep 28 '20 at 01:03

1 Answer

When a user opens the app, we will fetch the collection of employees associated with their organisation.

This is the first thing I'd reconsider. Has the user indicated they want to see the employees of the organisation? If that runs into the thousands of documents, are they really going to look at all of them?

More likely: you have a "home page" for the organisation, and on that home page you want to show the top (however you define "top") employees and other important information.

Instead of having each user read all employees, and all the other relevant documents, I'd create a single document with all the content for this home page. That means each user reads just a single document for this information, significantly reducing the number of reads needed.
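A minimal sketch of how such a home-page document could be assembled, as plain TypeScript with no Firestore calls. The `Employee` shape, the `rank` field used to define "top", and the function name `buildHomeSummary` are all assumptions for illustration; the question does not say what an employee document contains.

```typescript
// Hypothetical employee shape; the real fields are not given in the question.
interface Employee {
  id: string;
  name: string;
  rank: number; // assumed ordering key: lower rank = shown first
}

// Hypothetical shape of the single document each user would read.
interface OrgHomeSummary {
  employeeCount: number;
  topEmployees: { id: string; name: string }[];
}

// Build the summary from the full employee list. This would run on the
// server whenever employees change, not on every app open.
function buildHomeSummary(employees: Employee[], topN = 10): OrgHomeSummary {
  const topEmployees = [...employees] // copy so the input is not mutated
    .sort((a, b) => a.rank - b.rank)
    .slice(0, topN)
    .map(({ id, name }) => ({ id, name }));
  return { employeeCount: employees.length, topEmployees };
}

const summary = buildHomeSummary(
  [
    { id: "e3", name: "Cara", rank: 3 },
    { id: "e1", name: "Ann", rank: 1 },
    { id: "e2", name: "Bo", rank: 2 },
  ],
  2
);
console.log(summary.employeeCount, summary.topEmployees.map((e) => e.id));
```

The result would then be stored as one document per organisation. Keep in mind that a Firestore document is limited to 1 MiB, so the summary should hold only what the home page actually shows.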

Of course this comes at the cost of having to do more writes. So you should do some back-of-the-napkin calculations to see if it pays off in your case, but my educated guess is that it does.
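Such a calculation could look roughly like the sketch below. It only counts operations and leaves pricing out. It also assumes a deliberately pessimistic maintenance strategy for the summary document: every employee write triggers a full rebuild that reads all employees and rewrites the summary once. All names here are made up for illustration.

```typescript
interface Scenario {
  users: number;
  employees: number;
  opensPerUserPerDay: number;
  employeeWritesPerDay: number;
}

interface OpCounts {
  reads: number;
  writes: number;
}

// Current approach: every app open reads the whole employees collection.
function fetchAllCounts(s: Scenario): OpCounts {
  return {
    reads: s.users * s.opensPerUserPerDay * s.employees,
    writes: s.employeeWritesPerDay,
  };
}

// Summary-document approach, worst case: each employee write also
// re-reads all employees and rewrites the summary document once.
function summaryDocCounts(s: Scenario): OpCounts {
  const clientReads = s.users * s.opensPerUserPerDay; // one document per open
  const rebuildReads = s.employeeWritesPerDay * s.employees;
  return {
    reads: clientReads + rebuildReads,
    writes: s.employeeWritesPerDay * 2, // employee doc + summary doc
  };
}

// An "exceptional day" from the question: 1,000 users, 300 employee writes.
const busyDay: Scenario = {
  users: 1000,
  employees: 1000,
  opensPerUserPerDay: 12,
  employeeWritesPerDay: 300,
};
console.log(fetchAllCounts(busyDay));   // { reads: 12000000, writes: 300 }
console.log(summaryDocCounts(busyDay)); // { reads: 312000, writes: 600 }
```

Even with the pessimistic rebuild on every write, the summary approach needs far fewer reads in this scenario; updating the summary incrementally would lower the rebuild reads further.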

Frank van Puffelen