I need to build an activity feed (or, more accurately, a "lifestream") for a system that closely resembles many popular social networking platforms. I initially tried an RDBMS but quickly dropped the idea because of the vast number of JOINs needed. Searching for other (and better-suited) approaches, I stumbled upon the following post:
How do social networking websites compute friend updates?
Taking the advice to use a message queue, I have spent some time studying RabbitMQ and the PubSubHubbub protocol, and I came up with the following approach:
1) Each user has a "topic"
2) Other users subscribe to the topic
3) When the user performs some action, a message is published, which is then related (references resolved), formatted (human-friendly language, links, etc.) and aggregated (e.g. "X, Y and Z have commented on post P") by a PHP script.
However, I would still have to go through each message and process it (unless my approach is completely wrong). So, what would the difference be between storing everything in an RDBMS and using a message queue (other than the implementation of the PubSubHubbub protocol)?
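To make the three steps above concrete, here is a minimal in-memory sketch, written in Python rather than PHP purely for brevity. It is not RabbitMQ's API: the `Broker` class, the `Activity` record and the `aggregate` helper are all hypothetical names made up for illustration, and a real broker would deliver messages asynchronously rather than append to a list.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Activity:
    actor: str    # user who performed the action
    verb: str     # e.g. "commented on"
    target: str   # e.g. "post P"


class Broker:
    """Toy in-memory stand-in for a message broker (hypothetical, not a real API)."""

    def __init__(self):
        self.subscribers = defaultdict(set)   # topic owner -> followers
        self.inboxes = defaultdict(list)      # follower -> delivered activities

    def subscribe(self, follower, topic_owner):
        # Steps 1 and 2: each user is a topic that other users subscribe to.
        self.subscribers[topic_owner].add(follower)

    def publish(self, activity):
        # Step 3: fan the message out to everyone subscribed to the actor's topic.
        for follower in self.subscribers[activity.actor]:
            self.inboxes[follower].append(activity)


def aggregate(activities):
    """Group activities by (verb, target) and format one line per group."""
    groups = defaultdict(list)
    for a in activities:
        actors = groups[(a.verb, a.target)]
        if a.actor not in actors:
            actors.append(a.actor)
    lines = []
    for (verb, target), actors in groups.items():
        if len(actors) == 1:
            names, have = actors[0], "has"
        else:
            names, have = ", ".join(actors[:-1]) + " and " + actors[-1], "have"
        lines.append(f"{names} {have} {verb} {target}")
    return lines


broker = Broker()
for user in ("X", "Y", "Z"):
    broker.subscribe("me", user)
    broker.publish(Activity(user, "commented on", "post P"))

feed = aggregate(broker.inboxes["me"])
print(feed)  # ['X, Y and Z have commented on post P']
```

Note that the aggregation step still walks every delivered message, which is exactly the per-message processing cost described in this question.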
Are there more efficient ways to build such a system? (If so, please specify)
Comments / Suggestions / Criticisms are welcome. :)
Thank you in advance!
P.S.: There is an interesting article on how FriendFeed implements it ( http://bret.appspot.com/entry/how-friendfeed-uses-mysql ). However, I feel the "hackery" pushes MySQL out of its comfortable domain (which is simply relational data; what would be the point of using an RDBMS without relational data?)
P.P.S.: Another issue I see with using a message queue (perhaps because I am new to this technology) is that once a message is fetched by the "Consumer", it is removed from the queue. I want it to persist for an arbitrary amount of time.
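To illustrate the persistence concern, here is one possible arrangement, sketched under assumptions rather than taken from any particular system: the consumer writes each message to a store as it takes it off the queue, so the feed outlives the queue entry. Python's `queue.Queue` stands in for the broker, an in-memory SQLite database stands in for the store, and `RETENTION_SECONDS`, `consume_one` and `read_feed` are hypothetical names.

```python
import queue
import sqlite3
import time

RETENTION_SECONDS = 7 * 24 * 3600  # hypothetical retention window (one week)

db = sqlite3.connect(":memory:")   # a file path instead would survive restarts
db.execute("CREATE TABLE feed (recipient TEXT, body TEXT, created_at REAL)")

q = queue.Queue()  # stand-in for the broker's queue


def consume_one(now=None):
    """Take one message off the queue and store it so it is not lost."""
    recipient, body = q.get()  # the message leaves the queue here
    now = time.time() if now is None else now
    db.execute("INSERT INTO feed VALUES (?, ?, ?)", (recipient, body, now))
    db.commit()
    q.task_done()


def read_feed(recipient, now=None):
    """Return stored entries still inside the retention window, newest first."""
    now = time.time() if now is None else now
    rows = db.execute(
        "SELECT body FROM feed WHERE recipient = ? AND created_at >= ? "
        "ORDER BY created_at DESC",
        (recipient, now - RETENTION_SECONDS),
    )
    return [body for (body,) in rows]


q.put(("me", "X has commented on post P"))
consume_one(now=1000.0)

stored = read_feed("me", now=1000.0)                           # still visible
expired = read_feed("me", now=1000.0 + RETENTION_SECONDS + 1)  # aged out
```

In this arrangement the queue only transports messages and the store decides how long they live, which brings a database back into the picture, although as a simple per-recipient log rather than a JOIN-heavy schema.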