1

I trying to architect a non repeating stream of content at scale. E.g. When a profile is rated on Hot or Not by a user, that profile is to never appear again to the same user.

In MySQL I you would consider something like a NOT IN (..):

SELECT * FROM profiles WHERE id NOT IN (1,2,3,4,5,6,7) LIMIT 1

Each time a user interacts with a profile, it is 'marked as read'. Thus the above query attempts fetch a profile that is hasn't already read. This doesn't scale as the NOT IN will be infinite.

Next I could join (my SQL might be off, but I hope it gets the idea across):

SELECT profiles.* FROM profiles, read WHERE read.user_id = 1 AND profile.id != read.profile_id LIMIT 1

I feel that this solution isn't ideal either, as the number of read rows for user_id will be forever going.

Is this where I need to consider NoSQL with Map Reduce?

Community
  • 1
  • 1
sime
  • 61
  • 1
  • 7
  • How many profiles the most active user can rate? (regarding sql - if "profiles" and "read" might be big for one user I would recommend neither JOIN nor NOT IN) – Multisync Oct 18 '14 at 11:35
  • An unlimited amount of profiles. I would say a really active user would get into the hundreds. Over time it would get into the thousands. – sime Oct 18 '14 at 12:29
  • 1
    With SQL I would use NOT EXISTS then (see here: http://stackoverflow.com/questions/173041/not-in-vs-not-exists). If you only want to select 1 unread profile for a user I would keep track of one unread profile for each user. If user reads a profile the system switches to the next unread profile for this user (if necessary)... Can't say anything about NoSQL. – Multisync Oct 18 '14 at 12:47
  • Cool, can't recall the last time I have used NOT EXISTS. – sime Oct 18 '14 at 16:12

0 Answers0