How can I model a many-to-many relationship in Firestore without exceeding the document size limit?

Question

Below is my database schema, which stores a many-to-many relationship between a task model and a tag model. Google states that the maximum size of a document stored in Firestore is 1 MiB. If I keep adding tags to a task, the document would eventually exceed that limit.

Firestore-root
    |
    --- tasks (collection)
    |     |
    |     --- taskID (document)
    |           |
    |           --- title: "Go for a cycle"
    |           |
    |           --- completed: false
    |           |
    |           --- userID: "zaEh95kXJKapyVUqrPws58dyRIC3"
    |           |
    |           --- tagIDs: ["rWqTxB01TK9w8KRo2GHD"]
    |           |
    |           --- // Other task properties
    |
    --- tags (collection)
          |
          --- tagID (document)
                |
                --- title: "Health"
                |
                --- userID: "zaEh95kXJKapyVUqrPws58dyRIC3"
                |
                --- colour: "red"
                |
                --- // Other tag properties

A solution that I have found to work is to create a junction table. However, every time I navigate to the detail view of a task I have to query the database to find those relationships, which in turn drives up billing costs. I can't help but feel as though I am caught between a rock and a hard place.

Related / follow-up Q&As

@AlexMamo My dilemma still exists. What I don't like about the junction table approach is that when I tap to see the details of a task, a query is sent to Firestore to retrieve the tags associated with it. The user can then dismiss the detail view and tap the exact same task, and the query is sent again to retrieve the same tags, which creates a sense of needless billing costs. – Jul 16 '22 at 05:58

score 2 · Accepted Answer · answered Jul 16 '22 at 06:55

when I tap to see the details of a task a query is sent to Firestore to retrieve the tags associated with it.

Since you store the data in two different collections, yes, two different queries are needed. One to get the tasks and the second one to get the corresponding tags data. But that's not bad.
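As a rough illustration of the two-step read described above, here is a minimal sketch. It uses plain Python dicts as stand-ins for the two collections rather than a real Firestore client, the key "task1" and the helper name load_task_detail are made up for the example, and it assumes each document fetched counts as one read.

```python
# In-memory stand-ins for the "tasks" and "tags" collections (hypothetical,
# not a Firestore client). Each dict lookup models one document read.
tasks = {
    "task1": {
        "title": "Go for a cycle",
        "completed": False,
        "tagIDs": ["rWqTxB01TK9w8KRo2GHD"],
    },
}
tags = {
    "rWqTxB01TK9w8KRo2GHD": {"title": "Health", "colour": "red"},
}


def load_task_detail(task_id):
    """Return the task, its tag documents, and the number of reads modelled."""
    reads = 0
    task = tasks[task_id]  # first step: read the task document
    reads += 1
    task_tags = []
    for tag_id in task["tagIDs"]:  # second step: one read per referenced tag
        task_tags.append(tags[tag_id])
        reads += 1
    return task, task_tags, reads


task, task_tags, reads = load_task_detail("task1")
print(task["title"], [t["title"] for t in task_tags], reads)
```

Under these assumptions the cost of opening a task grows with the number of tags attached to it, which is the trade-off being weighed here.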

However, there are some other options that you can take into consideration. The first one would be, instead of saving the IDs of the tags in an array, to save the actual data, meaning the entire "Tag" object, or at least the important fields of the tag. This practice is called denormalization. If you're new to NoSQL databases, please note that this practice is quite common when it comes to Firebase. Also bear in mind that duplicated data has to be maintained in the same way it was added. In other words, if you want to update or delete an item, you need to do it in every place where it exists.
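To make the maintenance cost of denormalization concrete, here is a minimal sketch, again with plain Python dicts standing in for task documents. The second and third tasks, the "Home" tag with ID "tag2", and the helper update_tag_everywhere are invented for illustration; a real app would issue one write per affected task document.

```python
# Each task embeds a copy of its tags instead of storing only tag IDs.
# The data below is hypothetical example data, not a Firestore client.
tasks = {
    "task1": {
        "title": "Go for a cycle",
        "tags": [{"id": "rWqTxB01TK9w8KRo2GHD", "title": "Health", "colour": "red"}],
    },
    "task2": {
        "title": "Cook dinner",
        "tags": [
            {"id": "rWqTxB01TK9w8KRo2GHD", "title": "Health", "colour": "red"},
            {"id": "tag2", "title": "Home", "colour": "blue"},
        ],
    },
    "task3": {"title": "File taxes", "tags": []},
}


def update_tag_everywhere(tasks, tag_id, changes):
    """Apply changes to every embedded copy of a tag.

    Returns the IDs of the tasks that had to be rewritten.
    """
    touched = []
    for task_id, task in tasks.items():
        changed = False
        for tag in task["tags"]:
            if tag["id"] == tag_id:
                tag.update(changes)
                changed = True
        if changed:
            touched.append(task_id)
    return touched


touched = update_tag_everywhere(tasks, "rWqTxB01TK9w8KRo2GHD", {"colour": "green"})
print(touched)
```

Reading a task now needs no extra tag lookups, but changing one tag's colour rewrites every task that carries a copy of it.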

When using the above solution, note that there are some limits when it comes to how much data you can put into a document. According to the official documentation regarding usage and limits:

Maximum size for a document: 1 MiB (1,048,576 bytes)

As you can see, you are limited to 1 MiB of data in total in a single document. When we are talking about storing text (tag IDs), you can store quite a lot. I doubt you'll reach the limit, but as your arrays get bigger, be careful about this constraint. A workaround would be to store the tags in a separate document, adding further documents as each one fills up. But also note that, besides the number of reads, you are also charged for the bandwidth needed to download the documents.
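For a rough sense of scale, here is a back-of-the-envelope sketch. It assumes, based on my reading of Firestore's storage size rules, that a string value costs its UTF-8 byte length plus 1 byte and that an array costs the sum of its elements. The reserved_bytes parameter is a hypothetical allowance for the document name and the task's other fields.

```python
MAX_DOCUMENT_BYTES = 1_048_576  # 1 MiB, the limit quoted above


def string_size(value):
    # Assumption: a string is billed as its UTF-8 byte length plus 1 byte.
    return len(value.encode("utf-8")) + 1


def max_tag_ids(sample_id, reserved_bytes=0):
    """Estimate how many IDs the size of sample_id fit in one document.

    This considers size only; other Firestore limits may apply sooner.
    """
    return (MAX_DOCUMENT_BYTES - reserved_bytes) // string_size(sample_id)


print(max_tag_ids("rWqTxB01TK9w8KRo2GHD"))
print(max_tag_ids("rWqTxB01TK9w8KRo2GHD", reserved_bytes=1024))
```

With 20-character IDs like the one in the question, the estimate comes out in the tens of thousands per task. Embedding whole tag objects instead of IDs lowers that figure in proportion to the size of each object.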

So it's up to you to decide which solution works best for your application.

Alex Mamo

At first I had the tags in my application as plain strings stored alongside the rest of the task properties. I then wanted to add the ability to attach colours to a specific tag. If a tag is reused I would expect the same colour to appear, which led me to separate tags and tasks into their own collections. The issue I have raised appears to be very awkward for NoSQL platforms like Firebase, and the general consensus online appears to recommend using a SQL database. I think I will have to revert to what I had and follow the process of denormalization you mentioned. – Jul 16 '22 at 09:13
Storing tags as strings might also work. So you can also go ahead with that. – Alex Mamo Jul 16 '22 at 09:32
If, instead of storing a string, I decide to store the Tag object, it starts to seem impractical when a tag has properties that explicitly define what an individual tag looks like. I would have to, as you say, update every document where this particular tag exists. I would like to steer away from this solution and focus on others that don't appear to have much documentation, like this document for storing tags. Are you able to elaborate more on this? – Jul 16 '22 at 09:54
I'm not sure what more to elaborate on. As long as you decide to use the entire object, you'll always have all the data that you need. Btw, in general, tags are not things that change very often. So I think it won't be a problem to update a single tag once in a while. – Alex Mamo Jul 16 '22 at 09:56
So can I help you with other information? – Alex Mamo Jul 16 '22 at 09:57
I should be good to go. Thank you for all the help you have provided. I actually happened to stumble across an answer you gave someone regarding denormalization, which I have linked in the related section. I have decided to follow your first solution. – Jul 16 '22 at 10:36
Good to hear that. Good luck. – Alex Mamo Jul 16 '22 at 10:55
