To preface, I'm no DBA or SQL expert. But I've taken on a personal project that requires me to wear all hats in making a social network. (No, I'm not trying to reinvent Facebook. I'm targeting a niche audience.) And yes, I've heard of frameworks such as http://activitystrea.ms/, but I feel like data serialization should be a last resort for my needs.
Anyway, "How to implement the activity stream in a social network" helped me get the ball rolling, but I still have some unanswered questions.
Below is my database schema (some rows have been omitted for simplification):
Action table:
id name
-------------
1 post
2 like
3 follow
4 favorite
5 tag
6 share
Activity table:
id (int)
user_id (int)
action_id (tinyint)
target_id (int)
object_id (tinyint)
date_created (datetime)
The object_id refers to which object type the target_id is. The idea here is to represent (User + Action + Target Object):
- User Post(s) Media
- User Favorite(s) Scene
- User Follow(s) User
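To make the (User + Action + Target Object) idea concrete, here is a minimal sketch of how I picture one activity row being read. It's Python rather than PHP, just to keep it short and self-contained. The lookup values are copied from the Action table above and the Object (type) table further down; the sample rows and the describe helper are made up for illustration.

```python
# Lookup values copied from the Action and Object (type) tables in this post.
ACTIONS = {1: "post", 2: "like", 3: "follow", 4: "favorite", 5: "tag", 6: "share"}
OBJECT_TYPES = {1: "media", 2: "scene", 3: "brand", 4: "event", 5: "user"}


def describe(activity):
    """Render one activity row as 'user <id> <action> <object type> <target id>'."""
    action = ACTIONS[activity["action_id"]]
    object_type = OBJECT_TYPES[activity["object_id"]]
    return f"user {activity['user_id']} {action} {object_type} {activity['target_id']}"


# Hypothetical sample rows (all ids are invented for this example).
rows = [
    {"id": 1, "user_id": 7, "action_id": 1, "target_id": 42, "object_id": 1},
    {"id": 2, "user_id": 7, "action_id": 4, "target_id": 3, "object_id": 2},
    {"id": 3, "user_id": 7, "action_id": 3, "target_id": 9, "object_id": 5},
]
descriptions = [describe(r) for r in rows]
```

So target_id on its own is ambiguous (media 42 and scene 42 could both exist); it only means something together with object_id.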
Object (type) table:
id name
-------------
1 media
2 scene
3 brand
4 event
5 user
The problem here is that each object has its own separate table. For example:
Media table:
id (int)
type (int)
thumbnail (varchar)
source (varchar)
description (varchar)
Event table:
id (int)
user_id (int)
name (varchar)
city (int)
address (varchar)
starts (time)
ends (time)
about (varchar)
User table:
id (int)
username (varchar)
profile_picture (varchar)
location (int)
What, then, would be the best (i.e., most efficient) way of querying this database?
Obviously I could perform a SELECT statement on the activity table, and then, based on the object_id, use conditional logic in PHP to make a separate query to the appropriate object's table (e.g., media).
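Here is a rough sketch of that first approach, under some loud assumptions: it uses Python's built-in sqlite3 module instead of PHP and MySQL purely so it runs on its own, the tables are cut down to a couple of columns, and the sample data is invented. It also batches the lookups (one query per object type, using IN) rather than one query per activity row, which is a variation on what I described, not exactly the same thing.

```python
import sqlite3
from collections import defaultdict

conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row
conn.executescript("""
CREATE TABLE activity (id INTEGER PRIMARY KEY, user_id INT, action_id INT,
                       target_id INT, object_id INT, date_created TEXT);
CREATE TABLE media (id INTEGER PRIMARY KEY, source TEXT);
CREATE TABLE event (id INTEGER PRIMARY KEY, address TEXT);
-- Invented sample data: two media posts and one favorited event.
INSERT INTO media VALUES (10, 'a.jpg'), (11, 'b.jpg');
INSERT INTO event VALUES (20, '1 Main St');
INSERT INTO activity VALUES
  (1, 7, 1, 10, 1, '2013-01-01 10:00:00'),
  (2, 7, 1, 11, 1, '2013-01-02 10:00:00'),
  (3, 7, 4, 20, 4, '2013-01-03 10:00:00');
""")

# object_id -> table name. A fixed whitelist, never built from user input.
OBJECT_TABLES = {1: "media", 4: "event"}

# Step 1: fetch the activity rows themselves.
activities = conn.execute(
    "SELECT * FROM activity ORDER BY date_created DESC LIMIT 20"
).fetchall()

# Step 2: group target ids by object type so each object table is hit once.
ids_by_type = defaultdict(list)
for a in activities:
    ids_by_type[a["object_id"]].append(a["target_id"])

# Step 3: one IN (...) query per object type, keyed by (object_id, id).
objects = {}
for object_id, ids in ids_by_type.items():
    table = OBJECT_TABLES[object_id]
    placeholders = ",".join("?" * len(ids))
    query = f"SELECT * FROM {table} WHERE id IN ({placeholders})"
    for row in conn.execute(query, ids):
        objects[(object_id, row["id"])] = dict(row)

# Step 4: stitch each activity back together with its target object.
feed = [(a["id"], objects[(a["object_id"], a["target_id"])]) for a in activities]
```

With this shape the number of queries is bounded by the number of object types (at most 1 + 5 here), not by the number of activity rows.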
Or would it be smarter (and more efficient) to implement some sort of LEFT or INNER JOIN on all 5 object tables, as suggested here: "MySQL if statement conditional join"? I'm not entirely familiar with how JOINs work, or whether SQL is smart enough to scan only the appropriate object table for each activity row, rather than ALL the joined tables.
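As far as I understand it, the join version would look roughly like the sketch below, with the object type check placed inside each ON clause. Same assumptions as before: Python's sqlite3 stands in for MySQL so the example is self-contained, only two of the five object tables are shown, the tables are trimmed to one data column each, and the sample data is invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row
conn.executescript("""
CREATE TABLE activity (id INTEGER PRIMARY KEY, user_id INT, action_id INT,
                       target_id INT, object_id INT, date_created TEXT);
CREATE TABLE media (id INTEGER PRIMARY KEY, source TEXT);
CREATE TABLE event (id INTEGER PRIMARY KEY, address TEXT);
-- Invented sample data: two media posts and one favorited event.
INSERT INTO media VALUES (10, 'a.jpg'), (11, 'b.jpg');
INSERT INTO event VALUES (20, '1 Main St');
INSERT INTO activity VALUES
  (1, 7, 1, 10, 1, '2013-01-01 10:00:00'),
  (2, 7, 1, 11, 1, '2013-01-02 10:00:00'),
  (3, 7, 4, 20, 4, '2013-01-03 10:00:00');
""")

# Each LEFT JOIN can only match when the activity row has that object type
# (1 = media, 4 = event), so the columns of the other tables come back NULL.
rows = conn.execute("""
    SELECT a.id, a.object_id,
           m.source  AS media_source,
           e.address AS event_address
    FROM activity AS a
    LEFT JOIN media AS m ON a.object_id = 1 AND m.id = a.target_id
    LEFT JOIN event AS e ON a.object_id = 4 AND e.id = a.target_id
    ORDER BY a.date_created DESC
""").fetchall()

result = [tuple(r) for r in rows]
```

Every row comes back with a column for every object table, most of them NULL, so the PHP side would still need to check object_id to know which columns to read. Whether MySQL actually skips the lookup in the non-matching tables is the part I can't tell from this; I assume running EXPLAIN on the real query would show it.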
Of course the first solution means MANY more calls to the database, which is less desirable. However, I'm not sure how else I could retrieve all the relevant columns (e.g., media "source", event "address") in just one query without implementing some conditional logic.