Assume that I want to write a blogging app. Which of the following two options should I prefer? I would like to keep a "single source of truth" as much as possible, but I am not sure whether that preference simply comes from my SQL background.

Option 1 (Denormalization):

Posts: {
  post_1: {
    title: "hello",
    body: "hi there!",
    uid: "user_1",
    comments: {
      comment_1: {
        body: "hi I commented",
        uid: "user_2",
      },
      comment_2: {
        body: "bye I commented",
        uid: "user_2",
      },
    }
  }
}

Users: {
  user_1: {
    uid: "user_1",
    post_1: {
      title: "hello",
      body: "hi there!",
      uid: "user_1",
      comments: {
        comment_1: {
          body: "hi I commented",
          uid: "user_2",
        },
        comment_2: {
          body: "bye I commented",
          uid: "user_2",
        },
      }
    }
  }
}

Option 2 (Indexing):

Posts: {
  post_1: {
    title: "hello",
    body: "hi there!",
    uid: "user_1",
    authorName: "Richard",
    comments: {
      comment_1: true,
      comment_2: true
    }
  }
}

Users: {
  user_1: {
    uid: "user_1",
    displayName: "Richard",
    email: "richard@gmail.com",
    posts: {
      post_1: true
    },
    comments: {
      comment_1: true,
      comment_2: true
    }
  }
}

Comments: {
  comment_1: {
    body: "hi I commented",
    uid: "user_1",
  },
  comment_2: {
    body: "bye I commented",
    uid: "user_1",
  },
}
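
To make the read side of option 2 concrete, here is a minimal sketch of resolving a post's comment index into full comment objects. It works on a plain in-memory dictionary that mirrors the structure above; `load_post_with_comments` is a hypothetical helper name, and against a real database each indexed key would presumably cost one extra read.

```python
# In-memory copy of the option 2 tree (an assumption for illustration;
# a real app would fetch these nodes from the database).
db = {
    "Posts": {
        "post_1": {
            "title": "hello",
            "body": "hi there!",
            "uid": "user_1",
            "authorName": "Richard",
            "comments": {"comment_1": True, "comment_2": True},
        }
    },
    "Comments": {
        "comment_1": {"body": "hi I commented", "uid": "user_1"},
        "comment_2": {"body": "bye I commented", "uid": "user_1"},
    },
}


def load_post_with_comments(tree, post_id):
    """Return a copy of the post whose comment index is replaced by full comments."""
    post = dict(tree["Posts"][post_id])  # shallow copy, the stored post is untouched
    index = post.get("comments", {})
    # One lookup per indexed key; keys missing from Comments are skipped.
    post["comments"] = {
        cid: tree["Comments"][cid] for cid in index if cid in tree["Comments"]
    }
    return post


post = load_post_with_comments(db, "post_1")
print(post["comments"]["comment_2"]["body"])
```

The trade-off this shows: each comment is stored once, but displaying a post takes one lookup for the post plus one per comment.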

I think I should prefer option 2.

The main problem I see with option 1 is that the same data lives in too many places. Say I want to extend the app so that each post belongs to a category or tag. Then I will have to write a post object under /categories/category_id in addition to /posts and /users/uid. When the post is updated, I have to remember to modify the post object in three different places. With option 2 I don't have this problem, because each post and comment is stored in only one place.
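
For illustration, the "three different places" update under option 1 can be sketched as building one map from path to new value. As far as I understand, Firebase's `update()` accepts such a path-keyed map to write several locations in one call, but the code below only builds the map and does not touch Firebase. The function name, the `category_1` key, and the exact paths are assumptions based on the layout described above.

```python
def build_fanout_update(post_id, uid, category_id, changes):
    """Map every path that holds a copy of the post to its new field values."""
    # Hypothetical locations of the duplicated post under option 1.
    locations = [
        f"posts/{post_id}",
        f"users/{uid}/{post_id}",
        f"categories/{category_id}/{post_id}",
    ]
    return {
        f"{location}/{field}": value
        for location in locations
        for field, value in changes.items()
    }


updates = build_fanout_update(
    "post_1", "user_1", "category_1", {"title": "hello again"}
)
for path in sorted(updates):
    print(path, "=", updates[path])
```

Keeping the list of locations in one function at least means there is a single piece of code to extend when a new copy of the post is introduced.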

Am I missing anything?

References:

  1. Firebase data structure and url
  2. https://firebase.google.com/docs/database/web/structure-data
Maximus S
  • If you like the second option then why do NoSQL at all? Just go with SQL... – obe Aug 07 '16 at 19:05
  • I want to try Firebase for my backend. I also want to learn whether I should prefer option 1 or option 2 in NoSQL. This (https://firebase.google.com/docs/database/web/structure-data) says option 2 is better, but a lot of posts also suggest option 1. – Maximus S Aug 07 '16 at 19:09
  • Well, disclaimer-wise, I don't have hands-on experience with using NoSQL as the primary back-end. I may be old-fashioned, but I like having normalized data when possible. It's OK to bend the rules and have some duplication for substantial performance or simplicity gains, but for the most part I think that for complex relational data the best choice is a relational database (possibly with NoSQL as a cache tier). I'm aware that I'm not answering your question, but that's why I'm writing a comment instead of an answer :) – obe Aug 07 '16 at 19:15
  • The best solution is the one that works for all your use-cases. Since we can't know all your use-cases (and likely you don't know them yet either), nobody has any chance of recommending the best solution. In general though it is more common to have duplicate data in NoSQL databases than in SQL (where it's also not out of the question). To get more comfortable with the topic, I highly recommend reading this article on [NoSQL data modeling](https://highlyscalable.wordpress.com/2012/03/01/nosql-data-modeling-techniques/). – Frank van Puffelen Aug 07 '16 at 23:26
  • I'd also recommend reading my [classic answer investigating a similar data structure in Firebase](http://stackoverflow.com/questions/16638660/firebase-data-structure-and-url/16651115#16651115). Both the article and my answer are not very afraid of duplicating data. When you come from a SQL background that is one of the most common knee-jerk responses you can/have to let go: data duplication is normal in NoSQL and often needed to achieve the required scalability that NoSQL solutions are known for. – Frank van Puffelen Aug 07 '16 at 23:29
