13

I have 1000 documents in a single collection in Cloud Firestore, is it possible to fetch random documents?

Say for example: Students is a collection in Firestore and I have 1000 students in that collection, my requirement is to pick 10 students randomnly on each call.

Alex Mamo
CLIFFORD P Y

5 Answers

1

Yes, it is possible. To achieve this, you can use the following code:

FirebaseFirestore rootRef = FirebaseFirestore.getInstance();
CollectionReference studentsCollectionReference = rootRef.collection("students");
studentsCollectionReference.get().addOnCompleteListener(new OnCompleteListener<QuerySnapshot>() {
    @Override
    public void onComplete(@NonNull Task<QuerySnapshot> task) {
        if (task.isSuccessful()) {
            List<Student> studentList = new ArrayList<>();
            for (DocumentSnapshot document : task.getResult()) {
                Student student = document.toObject(Student.class);
                studentList.add(student);
            }

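            int studentListSize = studentList.size();
            List<Student> randomStudentList = new ArrayList<>();
            Random random = new Random(); // one generator, reused for every draw
            for(int i = 0; i < studentListSize; i++) {
                // Draw a random index on every pass; a repeated draw is skipped below.
                Student randomStudent = studentList.get(random.nextInt(studentListSize));
                if(!randomStudentList.contains(randomStudent)) {
                    randomStudentList.add(randomStudent);
                    // Stop as soon as 10 unique students have been collected.
                    if(randomStudentList.size() == 10) {
                        break;
                    }
                }
            }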
        } else {
            Log.d(TAG, "Error getting documents: ", task.getException());
        }
    }
});

This is the classic solution, and it works for collections that contain only a few records, because it reads every document in the collection (1000 reads for 1000 students). If you are concerned about the number of reads, I recommend a second approach. It involves a small change to your database: add a new document that holds an array with all student ids. Reading that document takes a single get() call, which is a single read operation. Once you have the array, you can use the same algorithm to pick 10 random ids, then fetch the corresponding documents and add them to a list. That costs only 10 more reads, so the total is 11 document reads.
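
The id-picking step of this second approach can be sketched without any Firestore calls. This is only a minimal illustration: `RandomIdPicker` and `pickRandomIds` are hypothetical names, the list of ids stands in for the array read from the extra document, and a shuffle replaces the draw-and-check loop from the code above.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

// Hypothetical helper, not part of the Firestore SDK.
class RandomIdPicker {

    /** Returns up to k ids chosen at random from ids, without repeating a list position. */
    static List<String> pickRandomIds(List<String> ids, int k, Random random) {
        // Shuffle a copy so the caller's list (the array read from Firestore) stays untouched.
        List<String> copy = new ArrayList<>(ids);
        Collections.shuffle(copy, random);
        // The first k entries of a shuffled list form a random sample of distinct positions.
        return new ArrayList<>(copy.subList(0, Math.min(k, copy.size())));
    }

    public static void main(String[] args) {
        // Stand-in for the id array stored in the single extra document (1 read).
        List<String> allIds = new ArrayList<>();
        for (int i = 0; i < 1000; i++) {
            allIds.add("student_" + i);
        }
        List<String> chosen = pickRandomIds(allIds, 10, new Random());
        // Each chosen id would then be fetched with its own get() call (10 more reads).
        System.out.println(chosen);
    }
}
```

Shuffling a copy always finishes after one pass, whereas a draw-and-check loop may need extra draws when it hits an id it already picked.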

This practice is called denormalization (duplicating data) and is common when working with Firebase. If you're new to NoSQL databases, I recommend watching the video "Denormalization is normal with the Firebase Database" for a better understanding. It covers the Firebase Realtime Database, but the same principles apply to Cloud Firestore.

But remember: just as you add student ids to this newly created document, you also need to remove them when they are no longer needed.

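To add a student id to the array, pass it to FieldValue.arrayUnion() inside an update() call. Here idsDocumentReference (the document that holds the array) and studentId are placeholder names:

idsDocumentReference.update("yourArrayProperty", FieldValue.arrayUnion(studentId));

And to remove a student id, use:

idsDocumentReference.update("yourArrayProperty", FieldValue.arrayRemove(studentId));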

To get all 10 random students at once, you can use List<Task<DocumentSnapshot>> and then call Tasks.whenAllSuccess(tasks), as explained in my answer from this post:

Alex Mamo
1

Alex's answer gave me a hint on how to get random records from the Firebase Firestore database (especially for a small amount of data).

I found some problems in his answer:

  • It returns the same record every time, because randomNumber is not updated.
  • The final list may contain duplicate records even if we update randomNumber every time.
  • It may include records that are already being displayed.

I have updated the answer as follows:

    FirebaseFirestore database = FirebaseFirestore.getInstance();
    CollectionReference collection = database.collection(VIDEO_PATH);
    collection.get().addOnCompleteListener(new OnCompleteListener<QuerySnapshot>() {
        @Override
        public void onComplete(@NonNull Task<QuerySnapshot> task) {
            if (task.isSuccessful()) {
                List<VideoModel> videoModelList = new ArrayList<>();
                for (DocumentSnapshot document : Objects.requireNonNull(task.getResult())) {
                    VideoModel student = document.toObject(VideoModel.class);
                    videoModelList.add(student);
                }

                /* Get Size of Total Items */
                int size = videoModelList.size();
                /* Random Array List */
                ArrayList<VideoModel> randomVideoModels = new ArrayList<>();
                /* for-loop: It will loop all the data if you want 
                 * RANDOM + UNIQUE data.
                 * */
                for (int i = 0; i < size; i++) {
                    // Getting random number (inside loop just because every time we'll generate new number)
                    int randomNumber = new Random().nextInt(size);

                    VideoModel model = videoModelList.get(randomNumber);

                    // Check with current items whether its same or not
                    // It will helpful when you want to show related items excepting current item
                    if (!model.getTitle().equals(mTitle)) {
                        // Check whether current list is contains same item.
                        // May random number get similar again then its happens
                        if (!randomVideoModels.contains(model))
                            randomVideoModels.add(model);

                        // How many random items you want 
                        // I want 6 items so It will break loop if size will be 6.
                        if (randomVideoModels.size() == 6) break;
                    }
                }

                // Bind adapter
                if (randomVideoModels.size() > 0) {
                    adapter = new RelatedVideoAdapter(VideoPlayerActivity.this, randomVideoModels, VideoPlayerActivity.this);
                    binding.recyclerView.setAdapter(adapter);
                }
            } else {
                Log.d("TAG", "Error getting documents: ", task.getException());
            }
        }
    });

I hope this logic helps anyone who has a small amount of data; I don't think it will cause any problems for 1000 to 5000 records.

Thank you.

Pratik Butani
0

I faced a similar problem (I only needed to get one random document every 24 hours, or when users refresh the page manually, but you can apply this solution to your case as well), and what worked for me was the following:

Technique

  1. Read a small list of documents for the first time, let's say from 1 to 10 documents (10 to 30 or 50 in your case).
  2. Select random document(s) based on a randomly generated number(s) within the range of the list of documents.
  3. Save the last id of the document you selected locally on the client device (maybe in shared preferences like I did).
  4. If you want new random document(s), use the saved document id to start the process again (steps 1 to 3) after the saved document id, which excludes all documents that appeared before.
  5. Repeat the process until there are no more documents after the saved document id, then start over from the beginning as if this were the first run of the algorithm (set the saved document id to null and run steps 1 to 4 again).
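
The five steps above can be sketched on an in-memory list, which makes the wrap-around behaviour easy to follow. This is a simplified model rather than Firestore code: `RandomWindowCursor` is a hypothetical name, the sorted list stands in for the collection ordered by document id, and the `lastId` field stands in for the value saved in shared preferences.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

// In-memory stand-in for the paging steps; all names here are hypothetical.
class RandomWindowCursor {
    private final List<String> sortedIds; // plays the role of the collection ordered by document id
    private final Random random;
    private String lastId;                // would be persisted on the device; null on the first run

    RandomWindowCursor(List<String> sortedIds, Random random) {
        this.sortedIds = new ArrayList<>(sortedIds);
        this.random = random;
    }

    /** Reads a window after the saved id, picks one entry, saves it, and wraps around at the end. */
    String next(int maxWindow) {
        if (sortedIds.isEmpty()) {
            return null; // nothing to pick; also avoids wrapping around forever
        }
        // Step 4: continue after the saved id (an unknown or null id restarts at the beginning).
        int start = (lastId == null) ? 0 : sortedIds.indexOf(lastId) + 1;
        if (start >= sortedIds.size()) {
            start = 0; // step 5: no documents left after the saved id, so start over
        }
        // Step 1: a window of 1..maxWindow documents, comparable to startAfter(...).limit(n).
        int limit = random.nextInt(maxWindow) + 1;
        int end = Math.min(start + limit, sortedIds.size());
        // Step 2: pick a random document inside the window.
        String picked = sortedIds.get(start + random.nextInt(end - start));
        // Step 3: remember it so the next call continues after it.
        lastId = picked;
        return picked;
    }
}
```

Calling `next(10)` repeatedly walks forward through the ids in random jumps and restarts from the beginning once the end is reached, so a document can show up again only after a wrap-around.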

Technique Pros and Cons

Pros:

  1. You can determine the jump size each time you get a new random document(s).
  2. No need to modify the original model class of your object.
  3. No need to modify the database that you already have or designed.
  4. No need to add a document in the collection and handle adding random id for each document when adding a new document to the collection like solution mentioned here.
  5. No need to load a big list of documents to just get one document or a small-sized list of documents.
  6. Works well if you are using the auto-generated id by Firestore (because the documents inside the collection are already slightly randomized).
  7. Works well if you want one random document or a small-sized random list of documents.
  8. Works on all platforms (including iOS, Android, Web).

Cons

  1. Handle saving the id of the document to use in the next request of getting random document(s) (which is better than handling a new field in each document or handling adding ids for each document in the collection to a new document in the main collection)
  2. May get some documents more than one time if the list is not large enough (in my case it wasn't a problem) and I didn't find any solution that is avoiding this case completely.

Implementation (Kotlin on Android):

val documentId: String? = null // TODO: read the saved document id from shared preferences (null if not set before)
getRandomDocument(documentId)

fun getRandomDocument(documentId: String?) {
    if (documentId == null) {
        val query = FirebaseFirestore.getInstance()
                .collection(COLLECTION_NAME)
                .limit(getLimitSize())
        loadDataWithQuery(query)
    } else {
        val docRef = FirebaseFirestore.getInstance()
                .collection(COLLECTION_NAME).document(documentId)
        docRef.get().addOnSuccessListener { documentSnapshot ->
            val query = FirebaseFirestore.getInstance()
                    .collection(COLLECTION_NAME)
                    .startAfter(documentSnapshot)
                    .limit(getLimitSize())
            loadDataWithQuery(query)
        }.addOnFailureListener { e ->
            // handle on failure
        }
    }
}

fun loadDataWithQuery(query: Query) {
    query.get().addOnSuccessListener { queryDocumentSnapshots ->
        val documents = queryDocumentSnapshots.documents
        if (documents.isNotEmpty() && documents[documents.size - 1].exists()) {
            //select one document from the loaded list (I selected the last document in the list)
            val snapshot = documents[documents.size - 1]
            var documentId = snapshot.id
            //SAVE the document id in shared preferences here
            //handle the random document here
        } else {
            //handle in case you reach to the end of the list of documents
            //so we start over again as this is the first time we get a random document
            //by calling getRandomDocument() with a null as a documentId
            getRandomDocument(null)
        }
    }
}

fun getLimitSize(): Long {
    val random = Random()
    val listLimit = 10
    return (random.nextInt(listLimit) + 1).toLong()
}

Rofaeil Ashaiaa
0

Based on @ajzbc's answer, I wrote this for Unity3D and it's working for me.

FirebaseFirestore db;

    void Start()
    {
        db = FirebaseFirestore.DefaultInstance;
    }

    public void GetRandomDocument()
    {

       // Generate one random auto-id and use it as the pivot for both queries.
       string randomId = db.Collection("Sports").Document().Id;
       Query query1 = db.Collection("Sports").WhereGreaterThanOrEqualTo(FieldPath.DocumentId, randomId).Limit(1);
       Query query2 = db.Collection("Sports").WhereLessThan(FieldPath.DocumentId, randomId).Limit(1);

        query1.GetSnapshotAsync().ContinueWithOnMainThread((querySnapshotTask1) =>
        {

             if(querySnapshotTask1.Result.Count > 0)
             {
                 foreach (DocumentSnapshot documentSnapshot in querySnapshotTask1.Result.Documents)
                 {
                     Debug.Log("Random ID: "+documentSnapshot.Id);
                 }
             } else
             {
                query2.GetSnapshotAsync().ContinueWithOnMainThread((querySnapshotTask2) =>
                {

                    foreach (DocumentSnapshot documentSnapshot in querySnapshotTask2.Result.Documents)
                    {
                        Debug.Log("Random ID: " + documentSnapshot.Id);
                    }

                });
             }
        });
    }
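
The idea behind the two queries (generate a random key, take the first document at or after it, and fall back to the start of the collection if nothing is found) can be illustrated in plain Java with a sorted set. This is only a sketch under assumptions: `RandomKeyLookup` is a hypothetical name, the 20-character alphanumeric key merely imitates the shape of an auto-generated id, and the fallback assumes the "less than" query returns results in ascending id order.

```java
import java.util.NavigableSet;
import java.util.Random;

// Hypothetical in-memory illustration; a sorted set stands in for the collection.
class RandomKeyLookup {
    private static final String ALPHABET =
            "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789";

    /** Builds a random 20-character key, imitating the shape of an auto-generated id. */
    static String randomKey(Random random) {
        StringBuilder sb = new StringBuilder(20);
        for (int i = 0; i < 20; i++) {
            sb.append(ALPHABET.charAt(random.nextInt(ALPHABET.length())));
        }
        return sb.toString();
    }

    /** Returns the first id >= key, otherwise the smallest id, or null for an empty set. */
    static String pickNear(NavigableSet<String> ids, String key) {
        // Mirrors the "greater than or equal" query limited to one result.
        String hit = ids.ceiling(key);
        if (hit != null) {
            return hit;
        }
        // Mirrors the fallback: every id is below the key, so take the first one in ascending order.
        return ids.isEmpty() ? null : ids.first();
    }
}
```

Note that the pick is only as evenly spread as the ids themselves: a document that follows a large gap in the key space is chosen more often than one that sits right after its neighbour.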
Jamshaid Alam
-1

The second approach described by Alex Mamo would look similar to this:

  1. Get the array list with the stored document ids.
  2. Get a number of strings (I stored the doc ids as strings) from that list.

In the code below, you get 3 random, unique strings from the array and store them in a list; you can then use those strings to query the actual documents. I am using this code in a fragment:

    @Nullable
    @Override
    public View onCreateView(@NonNull LayoutInflater inflater, @Nullable ViewGroup container, @Nullable Bundle savedInstanceState) {

        View view = inflater.inflate(R.layout.fragment_category_selection, container, false);

        btnNavFragCat1 = view.findViewById(R.id.btn_category_1);

        btnNavFragCat1.setOnClickListener(new View.OnClickListener() {
            @Override
            public void onClick(View view) {

                questionKeyRef.document(tvCat1).get().addOnCompleteListener(new OnCompleteListener<DocumentSnapshot>() {
                    @Override
                    public void onComplete(@NonNull Task<DocumentSnapshot> task) {
                        if (task.isSuccessful()) {

                            DocumentSnapshot document = task.getResult();
                            List<String> questions = (List<String>) document.get("questions"); // This gets the array list from Firestore

                            List<String> randomList = getRandomElement(questions, 0);

                            removeDuplicates(randomList);

                            ...
                        }
                    }
                });

            }
        });

        ...

        return view;
    }

    private List<String> getRandomElement(List<String> list, int totalItems) {
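        int PICK_RANDOM_STRING = 3;
        Random rand = new Random();
        List<String> newList = new ArrayList<>();
        // Never ask for more unique strings than the list contains, otherwise the loop would not end.
        int target = Math.min(PICK_RANDOM_STRING, new HashSet<>(list).size());
        while (newList.size() < target) {

            int randomIndex = rand.nextInt(list.size());
            String currentValue = list.get(randomIndex);
            // Skip values that were already picked so the result stays unique.
            if (!newList.contains(currentValue)) {
                newList.add(currentValue);
            }
        }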

        return newList;
    }

    private void removeDuplicates(List<String> list) {
        try {
            Log.e("One", list.get(0));
            Log.e("Two", list.get(1));
            Log.e("Three", list.get(2));

            query1 = list.get(0); // These variables hold the picked ids; use them in a normal Firestore query to get the actual documents
            query2 = list.get(1);
            query3 = list.get(2);
        } catch (Exception e) {
            e.printStackTrace();
        }

    }

Here is the array that I get from Firestore (the screenshot is not reproduced here).

Kaiser
  • That's the whole point... Fetching all document from a collection is very expensive and not efficient! – genericUser Dec 13 '20 at 12:38
  • Making 1000s roundtrips to remote db instead of pre-caching 1000 records (and "Student" entity can be about 1000 bytes only); especially because Firestore is designed specifically to support offline access. You cannot use "offline access" if your queries are randomly generated; so you need to use different algos outside of "query API". Caching is the best solution, especially if user wants to (for example) exercise random quizzes offline. Pre-cache an array of IDs, randomly map it to random numbers, pick "top 10" from this array instead of unnecessarily hitting Firestore (and paying money). – Fuad Efendi Jan 16 '23 at 01:35
  • "That's the whole point... Fetching all document from a collection is very expensive and not efficient!" Fetching 1000s of "static" text records, 1000 characters each, at application startup, will take about 100ms of compressed data, and Firestore is designed specifically for this use case (offline access on mobile devices). It will save you money. The exception to this rule is dynamic data, such as chat rooms, stock market data, etc.; and you don't need "randomness" for chronologically ordered data, do you? Let's be pragmatic, solution by @Kaiser is the best for "random student" use case. – Fuad Efendi Jan 16 '23 at 01:44
  • As a performance optimization, you can have a collection of document IDs only, load it at startup (and pre-cache it for offline access), keep a history of which IDs you have already shown on screen, create random mappings, re-sort, pick the top 10, and so on. Then generate explicit queries using these locally pre-calculated IDs. – Fuad Efendi Jan 16 '23 at 01:59
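
The caching idea from the comments above (keep the ids locally, remember which ones were already shown, map each remaining id to a random number, sort, and take the top 10) might look roughly like this. It is a sketch only: `CachedIdSampler` and `nextBatch` are hypothetical names, and loading the ids from Firestore or persisting the history is left out.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.Collection;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Random;
import java.util.Set;

// Hypothetical helper that works on a locally cached list of document ids.
class CachedIdSampler {
    private final List<String> cachedIds;
    private final Set<String> shown = new HashSet<>(); // history of ids already handed out
    private final Random random;

    CachedIdSampler(Collection<String> ids, Random random) {
        // Drop duplicate ids while keeping the original order.
        this.cachedIds = new ArrayList<>(new LinkedHashSet<>(ids));
        this.random = random;
    }

    /** Returns up to k ids that were not returned before, chosen by random rank. */
    List<String> nextBatch(int k) {
        // Map every not-yet-shown id to a random number.
        List<Map.Entry<Double, String>> ranked = new ArrayList<>();
        for (String id : cachedIds) {
            if (!shown.contains(id)) {
                ranked.add(new AbstractMap.SimpleEntry<Double, String>(random.nextDouble(), id));
            }
        }
        // Sort by the random number and keep the top k.
        ranked.sort((a, b) -> Double.compare(a.getKey(), b.getKey()));
        List<String> batch = new ArrayList<>();
        for (int i = 0; i < Math.min(k, ranked.size()); i++) {
            batch.add(ranked.get(i).getValue());
        }
        shown.addAll(batch);
        return batch;
    }
}
```

Each call to `nextBatch(10)` yields ids that have not been returned before, and an empty list signals that every cached id has been shown; the returned ids would then be used to fetch the actual documents.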