Where I'm At
I have a simple Node.js Twitter stream consumer that tracks various hashtags. These are often trending hashtags, which means a high volume of tweet JSON streams into my consumer. I don't do any processing of the tweet JSON in the consumer.
What I Want
I want to store the tweet JSON objects in RethinkDB.
Assumptions
Because the volume of tweets is high and unpredictable, I should avoid inserting the tweet JSON objects into RethinkDB as they are consumed: tweets might enter the consumer faster than RethinkDB can write them.
Redis should be fast enough to handle writes of the tweet JSON objects as they are consumed, so I can push them directly to Redis and have a separate process pull them out and insert them into RethinkDB.
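To make the proposed pipeline concrete, here is a rough sketch of the shape I have in mind. Nothing in it talks to a real server: `TweetBuffer` is an in-memory stand-in for a Redis list (I assume the real version would LPUSH in the consumer and RPOP in the worker), and `insertBatch` is a hypothetical placeholder for a batched RethinkDB insert. All names are made up for illustration.

```javascript
// In-memory stand-in for a Redis list. This is an assumption for
// illustration only; the real buffer would live in Redis.
class TweetBuffer {
  constructor() {
    this.items = [];
  }

  // Consumer side: append the raw tweet JSON string without processing it.
  push(tweetJson) {
    this.items.push(tweetJson);
  }

  // Worker side: remove and return up to `max` of the oldest entries.
  popBatch(max) {
    return this.items.splice(0, max);
  }

  get length() {
    return this.items.length;
  }
}

// Drain the buffer in fixed-size batches until it is empty.
// `insertBatch` is a hypothetical callback standing in for the database write.
function drain(buffer, insertBatch, batchSize) {
  let inserted = 0;
  for (;;) {
    const batch = buffer.popBatch(batchSize);
    if (batch.length === 0) break;
    insertBatch(batch.map((raw) => JSON.parse(raw)));
    inserted += batch.length;
  }
  return inserted;
}

// Simulate the consumer buffering five tweets, then the worker draining them.
const buffer = new TweetBuffer();
for (let i = 0; i < 5; i++) {
  buffer.push(JSON.stringify({ id: i, text: `tweet ${i}` }));
}
const stored = [];
const count = drain(buffer, (docs) => stored.push(...docs), 2);
console.log(count, stored.length, buffer.length); // 5 5 0
```

The idea is that the consumer only ever calls `push`, so its speed is decoupled from the database, while the worker inserts in batches rather than one tweet at a time.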
What I Hope To Learn
- Are my assumptions correct?
- Does this architecture make sense? If not, can you suggest a better alternative?
- If my assumptions are correct and this architecture makes sense:
  a. What is the best way of using Redis as a buffer for the tweets?
  b. What is the best way of reading from (and updating/clearing) the Redis buffer in order to perform the inserts into RethinkDB?