0

I am trying to figure out what the most appropriate table structure would be for related videos. I have a table videos which contains details about all the videos. I am planning on a new feature of video recommendations which I will be creating manually for now.

I was thinking about storing it in a table column inside the videos table videos.relatedVideos and store JSON data for related videos ID. However, I want it to be backward compatible as well.

So say if a video with ID = 5 is added as a related video for video with ID = 10, the table structure would be

   | id | videoId | ..... | relatedVideos|

   | 1  | 10      | ..... | [5]          |
   | 2  | 4       | ..... | [3,6,7]      |

I also want the video with ID = 10 to act as a related video for video with ID = 5. One way of doing it would be to create a new entry for #5 on the videos table when I am creating an entry for #10

   | id | videoId | ..... | relatedVideos|

   | 1  | 10      | ..... | [5]          |
   | 2  | 4       | ..... | [3,6,7]      |
   | 3  | 5       | ..... | [10]         |

But I am looking for something more proper, where I can draw many-to-many relationship between the videos.

One option would be to create a table relatedVideos and create new entry for each and every related video with relation with the master video, but it might result into duplicate records when adding/editing related videos for any particular videos.

|id | videoID | relatedVideoID|
|1  | 5       | 10            |
|2  | 5       | 15            |
|3  | 5       | 13            |
|4  | 13      | 19            |

Right now, what I am doing is, if I am browsing video page #13 I am executing two queries (Select relatedVideoID from relatedVideos where videoID = 13) and (Select videoID from relatedVideos where relatedVideoID = 13). I believe I can do a similar thing with SQL joins or unions, I am yet to explore those options.

I am then merging the two array results to get [19, 5] (I am only keeping the distinct ID's). This is getting the work done, however, I am really looking forward to implement a more proper solution.

Just looking for some recommendation on what a good approach would be.

prakashchhetri
  • 1,806
  • 4
  • 36
  • 61
  • Read "Database Normalization" like here : https://www.guru99.com/database-normalization.html – FanoFN Jan 08 '20 at 00:43
  • Which MySQL version?. I would use a [many to many](https://stackoverflow.com/questions/22505938/many-to-many-relationship) table to store the related references (PRIMARY KEY: `video_id`, `related_id`). Instead of duplicating the data use a query to perform a reverse relationship lookup .In MySQL 8.0 you can use a [CTE](https://en.wikipedia.org/wiki/Hierarchical_and_recursive_queries_in_SQL#Common_table_expression) to retrieve all relations from a single video, in earlier versions you would need to use a [transitive closure](https://en.wikipedia.org/wiki/Transitive_closure) schema. – Will B. Jan 08 '20 at 01:04

1 Answers1

0

I would recommend going for the last listed option, that is to set up a bridge table to store the N-M relationship between videos.

Generally speaking, relational databases are not good at managing JSON or CSV data (although some of them have nice vendor features). Basic SQL queries quickly become much harder to write that they should, and performance also suffers (here is a famous SO post about storing CSV in database columns).

The structure of the additional data that you need to store is simple and well-defined, and corresponds to a basic database normalization pattern, so there really is no reason to follow the JSON or CSV paths.

Side note: to maintain data integrity and avoid duplicates, you need to ensure that you are not storing pairs like (1, 2), (2, 1). If you are running MySQL 8.0, you can use a check constraint for this. The DDL of your table could look like:

create table related_videos (
    video_id1 int,
    video_id2 int,
    check(video_id1 < video_id2),
    primary key(video_id1, video_id2),
    constraint fk_related_video1 foreign key (video_id1) references video(id),
    constraint fk_related_video2 foreign key (video_id2) references video(id)
);
GMB
  • 216,147
  • 25
  • 84
  • 135