1

I noticed that the NoSQL solutions provided by giants such as Google NoSQL, Firestore, and DynamoDB don't have join queries (I'm not including MongoDB and RethinkDB in this).

I like the idea of using those for the simplicity and cost benefits though.

I love JSON style document databases and don't want to go back to the dark days of VARCHAR etc. I'm wondering therefore how viable building 99% of databases for the web are with them.

Take this scenario:

User can be a member of many projects, they can see their projects on a Project List page, and drill down to a Project Details page.

Relationally this would be simple:

user: { ..., projects: [projectId1, ...] }
project: { ..., users: [userId1, ...] }

You could then make a join to show some basic details on Project List etc.

But what about with non-relational NoSQL? How would we denormalise this?

Would you simply have:

user: { no project IDs },
project: { users: [userId1, ...] }

And then for either the Project List page or Project Details page you simply run a filter on every project and return the ones the user is a member of? Is this what NoSQL thrives at in terms of efficiency? I am not dealing with huge datasets here I expect projects to be <10k and not be large docs. Thanks.

Dominic
  • 62,658
  • 20
  • 139
  • 163
  • So you are interested in a Firestore schema or an Amazon DynamoDB schema? – Alex Mamo Jan 05 '19 at 12:19
  • Correct me if I'm wrong but I feel this is a general problem with join-less NoSQL data models, I would accept either answer I haven't quite decided who to choose yet – Dominic Jan 05 '19 at 12:26
  • I removed DynamoDB tag as it's more likely to be firestore – Dominic Jan 05 '19 at 12:28
  • 1
    Yes, that's correct but there are workarounds that can solve this issue. I'm asking you which one you want to use because while both are NoSQL databases, Firestore is a little different. It uses collections and documents to store data. Firebase realtime database is more apropiate to DynamoDB. So would you like me to provide a schema for a Cloud Firestore database? – Alex Mamo Jan 05 '19 at 12:31
  • Yeah that would be great @AlexMamo – Dominic Jan 05 '19 at 12:31
  • Ok, I'll write you an answer for that. – Alex Mamo Jan 05 '19 at 12:33

1 Answers1

2

A possible database schema for your app use-case might be:

Firestore-root
   |
   --- users (collection)
   |    |
   |    --- userId (document)
   |         |
   |         --- userName: "John"
   |         |
   |         --- emailAddress: "john@gmail.com"
   |         |
   |         --- uid: "uidOfTheUser"
   |         |
   |         --- //other user data
   |
   --- projects (collection)
         |
         --- projectId (document)
               |
               --- projectName: "ProjectName"
               |
               --- projectId: "projectId"
               |
               --- users: ["uidOfTheUser", "uidOfTheUser"]

Using this database schema you can simply:

  • Get all projects of all users using just a CollectionReference:

    FirebaseFirestore rootRef = FirebaseFirestore.getInstance();
    CollectionReference projectsRef = rootRef.collection("projects");
    projectsRef.get().addOnCompleteListener(/* ... */);
    
  • Get all projects that correpond to a single user:

    FirebaseFirestore rootRef = FirebaseFirestore.getInstance();
    Query query = rootRef.collection("projects").whereArrayContains("users", uid);
    query.get().addOnCompleteListener(/* ... */);
    

How would we denormalise this?

In this case, there is no need to denormalise data. However, this denormalization practice is a common practice when it comes to Firebase. If you are new to NoQSL databases, I recommend you see this video, Denormalization is normal with the Firebase Database for a better understanding. It is for Firebase realtime database but same rules apply to Cloud Firestore.

Also, when you are duplicating data, there is one thing that need to keep in mind. In the same way you are adding data, you need to maintain it. With other words, if you want to update/detele an item, you need to do it in every place that it exists.

Please also see:

As @FrankvanPuffelen mentioned in his comment, please also see the following answers:

Alex Mamo
  • 130,605
  • 17
  • 163
  • 193
  • Thanks and the links are helpful too, this is basically what I was thinking was the approach to take but good to see it confirmed – Dominic Jan 05 '19 at 12:57
  • 2
    Great answer Alex. I'd always also include the list of projects for each user, since you're bound to need that at some point. Some further recommended reading, which, while for Firebase Realtime Database, is equally relevant here: https://stackoverflow.com/questions/41527058/many-to-many-relationship-in-firebase, https://stackoverflow.com/questions/16638660/firebase-data-structure-and-url/16651115#16651115 – Frank van Puffelen Jan 05 '19 at 14:41
  • @FrankvanPuffelen Yey!! The [second](https://stackoverflow.com/questions/16638660/firebase-data-structure-and-url/16651115#16651115) answer is the one where you got the congratulation from Andrew Lee back in 2013, waw! Just updated my answer with those two links. Thanks puf! :) – Alex Mamo Jan 06 '19 at 09:44