I trying to architect a non repeating stream of content at scale. E.g. When a profile is rated on Hot or Not by a user, that profile is to never appear again to the same user.
In MySQL I you would consider something like a NOT IN (..)
:
SELECT * FROM profiles WHERE id NOT IN (1,2,3,4,5,6,7) LIMIT 1
Each time a user interacts with a profile, it is 'marked as read'. Thus the above query attempts fetch a profile that is hasn't already read. This doesn't scale as the NOT IN
will be infinite.
Next I could join (my SQL might be off, but I hope it gets the idea across):
SELECT profiles.* FROM profiles, read WHERE read.user_id = 1 AND profile.id != read.profile_id LIMIT 1
I feel that this solution isn't ideal either, as the number of read
rows for user_id
will be forever going.
Is this where I need to consider NoSQL with Map Reduce?