I'm using the following code to retrieve a random document from a Firestore collection:

    collection = db.collection('Items')
    total_docs = collection.count().get()[0][0].value

    random_offset = random.randint(0, total_docs - 1)
    random_doc = collection.limit(1).offset(random_offset).get()[0]

I noticed that this code produces a lot of read usage and discovered that the entire collection is read when counting the documents.

Hence my question: How can I retrieve the count of documents in a collection without reading the entire collection?

And if this isn't possible: How can I randomly retrieve a document from a collection without specifying how many documents the collection contains?

Many thanks!

Comfort Eagle
  • `collection.count()` doesn't read every document in the collection. See: https://stackoverflow.com/questions/46554091/cloud-firestore-collection-count – Doug Stevenson Jun 22 '23 at 20:43
  • It's not the `count()` statement that causes those reads, but the `offset(random_offset)` that you use. Currently `offset()` reads all documents that you ask it to skip, it just doesn't return them, and you get charged for those reads. – Frank van Puffelen Jun 23 '23 at 06:11
  • When it comes to counting, I think that this [resource](https://medium.com/firebase-tips-tricks/how-to-count-the-number-of-documents-in-a-firestore-collection-3bd0c719978f) will help. – Alex Mamo Jun 26 '23 at 13:50

1 Answer

One approach you can take is to maintain a single index doc that summarizes your collection. Essentially, you create a new doc within Items called _index or something of the sort.

This can have a field called count which you increment/decrement with FieldValue.increment (firestore.Increment in the Python client) whenever you add or remove a document in your collection. (Alternatively, you could omit this field and use len(keys).)

You can have another field called keys, holding the doc IDs, which you update with FieldValue.arrayUnion/FieldValue.arrayRemove (firestore.ArrayUnion/firestore.ArrayRemove in the Python client) whenever a document is created or removed.

While this adds another write operation for each new doc, it also gives you a much simpler process for reading in a large list of doc IDs.

From here, you can read this index doc, pick a random ID from keys, and fetch that one document, which costs two reads in total.

    /Items
      /_index
        - count: 2
        - keys: ["docId123", "docId456"]
      /docId123
        - foo: "bar"
      /docId456
        - foo: "baz"
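
With a layout like the one above, the read side could look roughly like this in Python. This is only a sketch: the helper name `pick_random_doc` and its `index_id` parameter are made up for illustration, and it assumes the index doc has a `keys` array that is kept in sync with the collection.

```python
import random


def pick_random_doc(collection, index_id="_index"):
    """Return a random document snapshot from `collection`, or None if empty.

    `collection` is expected to behave like a Firestore CollectionReference,
    e.g. db.collection('Items'). The hypothetical `_index` doc is assumed to
    hold a `keys` array with the IDs of all other docs in the collection.
    """
    # First read: the index doc that lists every doc ID.
    index = collection.document(index_id).get()
    # to_dict() returns None when the index doc does not exist yet.
    keys = (index.to_dict() or {}).get("keys", [])
    if not keys:
        return None
    # Second read: only the randomly chosen doc is fetched.
    return collection.document(random.choice(keys)).get()
```

You would call it as `random_doc = pick_random_doc(db.collection('Items'))`, with `db` being the client from the question. Since `_index` lives in the same collection, any query over Items would have to skip it.
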
Nick Felker