-1

I've been trying to learn more about how to design something like a follower/following/like data struct for a service like instagram. I'm having trouble understanding a few basic things:

  • when people say, tell me about the high-level datamodel for instagram, are they asking for ERD or Db schema (besides APIs)? is there a difference between Entity-relationship diagram and mysql schema?

  • Conceptually a user has multiple followers/followees. For followers, a User entity in ERD, can have a "list of follower Ids" as a field. So the "user" entity will have a many-to-one relationship with itself, i.e. a many-to-one arrow would point from the User's followers field to User itself? Is something like this even possible in ERD?

  • How does one specify a "List"/"Array" of follower IDs in actual DBs (as "list/vector/array" seems to be more like C/C++ terminology)?

Joe Black
  • 625
  • 6
  • 19
  • Re bullet 1: You ask about 2 things. If you don't know what they are, why are you asking for the difference between them?--Find out what each is & apply them to your goal. If you think you know what they are, why don't you know "the difference"--and what does that even mean? Either way ask a question where stuck. Re 2: Find a published information modeling & DB design textbook, follow it, and ask a question where stuck. Right now you are just asking us to write one & give a bespoke tutorial. Re 3: See 2 but also this is an easily found faq. PS Please ask one question per post. – philipxy Sep 02 '19 at 07:45
  • In the relational & ER models relation(ship)s/associations are represented by DB tables. ER arrows represent entity participations & are represented by DB FKs. Pseudo-ER models (wrongly) call participations/FKs "relationships". FKs fall out of designs, they don't drive them. PS You mention a relation(ship)/association like "user UID has follower list FL" although in a relational DB we would prefer "user UID has follower FID". [Is storing a delimited list in a database column really that bad?](https://stackoverflow.com/q/3653462/3404097) – philipxy Sep 02 '19 at 07:46
  • @philipxy you can know they're different but may not know exactly how, hence the question. i'm only asking for succint and understandable difference not full tutorial. – Joe Black Sep 02 '19 at 16:39
  • "What is the difference" is just a rhetorical device that sounds at first impression like it isn't asking for what both are, but is. Find out what both are & ask a question where you are stuck in expressing whatever you think "the difference" comes to--although it is extremely unlikely that such a "difference" is relevant to anything. – philipxy Sep 03 '19 at 00:53
  • PS We can't tell you what someone means by some imprecise phrase like "the high-level datamodel for", let alone with a vague "tell me about" added. If you don't understand what someone means, like do they want an ER diagram or whatever, then ask them. If you don't know about the topics they are asking about & couldn't answer if you did know what kind of summary they wanted, research the topics, don't ask us to do your work. If you want an opinion on your own summary, give it & ask that, but ask in a way that is on topic here. See [ask], other [help] links & the voting arrow mouseover texts. – philipxy Sep 03 '19 at 01:10

1 Answers1

0

There is a difference between an ER diagram and a DB schema, but you can convert one into the other. A schema contains database specific information, like where tables are located (schema), how indexes are organized and lot of that stuff. A ERD is at the conceptional layer. And there is no single ER Diagram notation, there are several specified and the tools mix them.

Something about database and recursion

The first recursion you should know is a tree. It is used for directed graphs with a single root. Like a company and its sub companies. This is a so called "fishhook" in an ER Diagram (to be exact: It's a fishhook when it has a direction and an existance depends on the root object).

+----------+
| company  | <---+
+---+------+  *  |
    |1           |
    +------------+
     owns

But what you need is not a tree. A follows relation is a many to many relation with no direction. So we have to change the model:

+----------+
| user     | ----+
+---+------+ *   |
    |*           |
    +------------+
     follows

To archive this, you need a table to store the relations. Let us call the table follower. It will contain the id of the follower and the followed user. For each unique combination there is an entry in the database. A script to create it, (in postgres dialect):

CREATE TABLE public.user (
    id bigint NOT NULL,
    name character varying(100) NOT NULL
); 

CREATE TABLE public.follower(
    id bigint NOT NULL,
    follower_id bigint NOT NULL
);

-- pk user
ALTER TABLE ONLY public.user
    ADD CONSTRAINT user_pkey PRIMARY KEY (id);

-- pk follower
ALTER TABLE ONLY public.follower
    ADD CONSTRAINT follower_pkey PRIMARY KEY (id, follower_id);

-- foreign keys

ALTER TABLE ONLY public.follower
    ADD CONSTRAINT follower_fk1 FOREIGN KEY (id) REFERENCES public.user(id);

ALTER TABLE ONLY public.follower
    ADD CONSTRAINT follower_fk2 FOREIGN KEY (follower_id) REFERENCES public.user(id);

Some impacts of this model

  • If you have a graph like that: bob follows judy, judy follows bob you may end in an endless loop.

  • If you want to find followers of followers ... you have to query several times. There are some vendor specific extensions for recursive queries, but plain ANSI SQL doesn't support this (and so your OR Mapper will no support it out of the box).

My point of view: It is a model, where a relational database doesn't fit well. A graph DB, like neo4j is a better choise for that case.

  • "where a relational database doesn't fit well" That reflects a fundamental lack of understanding of the relational model. "What you need to model is a tree" No, and your graph isn't a tree either. – philipxy Sep 02 '19 at 07:50
  • If you want to traverse the tree, to see the followers of followers and so on... a relational model doesn't fit well. and it is a tree from the view of a single user. – Bernd Böllert Sep 02 '19 at 07:58
  • How is one-to-many relationship to the same entity (fishhook as you mentioned) specified in a relational db like sql? – Joe Black Sep 02 '19 at 16:49