9

I am trying to model a DB in Postgresql after a social media platform such as twitter and Instagram.

I have the following requirements:

  1. A user can create a Post
  2. A User can Like a Post
  3. A user can comment on a Post
  4. A user can comment on another users comment (Reply to a comment)
  5. A user can like a comment

Now I am aware that we can have deeply nested comments if users keep replying to other users in the form of comments. I came up with a a few self-referencing tables that all inherit from a common Entity table. Here is my implemntation so far:

enter image description here

^^ I am aware that it is possible to "like" another persons "Like" or comment on a "Like" which is not a requirement. To prevent this I am thinking I can enforce these constraints at the application code level. Its always better to leave options open incase we might want to implement them in the future right?

My Question is, is this a good DB implementation? Are there use cases and pitfalls I might run into that I am not seeing? Does the design fit the use cases?

john
  • 1,057
  • 1
  • 17
  • 28

2 Answers2

5

Your basic schema structure is probably usable for the basic use cases you were mentioning. I am just missing a connection between comment and post (which comment belongs to which post). You could also argue that comment and post are the same type of object, just distinguished by the relation to another post they have (or not have). Also - more importantly:

Nowadays you should consider using a graph database for modeling the social media domain. As you can see in your schema most of the data are links between the tables and relational databases are actually not the best at highly linked data. This is due to the fact that SQL queries will likely end up including a lot of joins which can become a performance problem once your graph reaches a certain size and depth.

It is also possible to combine the use of a relational database (or nosqldb) with a graph database, in which you only model the link network inside the graph and more table oriented data in the regular database.

For a popular example of a graph database see this.

For more information on why graph databases are better than relational databases at modeling graphs, read this and this.

Oswin Noetzelmann
  • 9,166
  • 1
  • 33
  • 46
  • I use nearly the same type of design now. There are things that a user can react and comment on (comments are reactable either). But instead of putting the "can has reactions" and "can has comments" features to a single entity table, I've divided them. For every feature I create a supertype, e.g. "Reactable", "Commentable" etc. Thus, I can not make the subtype PK's to reference the supertype PK's because there may be more than one supertype for a subtype. Instead, I hold another column for each feature, e.g. "commentableId", "reactableId". Would you have any recommendations on this? – Onur Önder Feb 22 '20 at 22:16
  • 2
    @Onur Onder: My first advice is to keep things as simple as possible in DB design- Don't apply complex object oriented thinking to it. I usually design tables for a specific functionality need, as small as possible and agnostic to the rest of the system. And by using UUIDs for PKs and references I don't have to be restricted to referencing just a one specific type of target type/table in a column. – Oswin Noetzelmann Feb 23 '20 at 02:50
  • I'm simply trying to avoid too much repetition, but not trying to create some too complex system while doing it. Both code duplication or complexity ends up as an unmaintainable structure. Using UUIDs is a great idea. I already follow that practive and it gives a lot of freedom. I will try to keep things more clear and simple. It is a hard thing to balance sometimes. But will try it. Thanks for your advice by the way! – Onur Önder Feb 23 '20 at 03:13
0

Current implementation is good enough. Points to consider:

  • Separation of post's likes and comment's likes may lead to more clarity in development.
  • Consider deeply nested comments for performance issues in relational database (it may require recursion and or multiple joins). You may consider to limit the comments to depth of 2 (example of Facebook and Youtube).
  • More attributes: user password, entity's creation date, etc.
josephnw
  • 1
  • 1
  • Could you elaborate on your 1st point? do you mean to seperate it out so developers will have a clearer understanding fo what they are doing? – john Jan 11 '18 at 00:51
  • Yes, separate "PostLike" and "CommentLike" will make it clear e.g. if we need to analyze likes by post / comment the query will be simpler – josephnw Jan 11 '18 at 01:58