How can I buffer the twitter stream with redis before inserting into rethinkdb?

Question

Where I'm At

I have a simple node.js twitter stream consumer that tracks various hashtags. Oftentimes these are trending hashtags, which means a high volume of twitter json is streaming into my consumer. I don't do any processing of the twitter json in the consumer.

What I Want

I want to store the tweet json objects in rethinkdb.

Assumptions

Due to the volume (and unpredictability of said volume) of tweets, I should avoid inserting the tweet json objects into rethinkdb as they are consumed (since the rate at which the tweets enter the consumer might be faster than the rate at which rethinkdb can write those tweets).

Since Redis is definitely fast enough to handle the writes of the tweet json objects as they are consumed, I can push the tweet json objects directly to redis and have another process pull those tweets out and insert them into rethinkdb.

What I Hope To Learn

1. Are my assumptions correct?
2. Does this architecture make sense? If not, can you suggest a better alternative?
3. If my assumptions are correct and this architecture makes sense,

a. What is the best way of using redis as a buffer for the tweets?

b. What is the best way of reading from (and updating/clearing) the redis buffer in order to perform the inserts into rethinkdb?

score 2 · Accepted Answer · edited May 23 '17 at 12:22

We use this kind of architecture in production. If the amount of data you are going to handle doesn't exceed the max memory limit of redis, you can proceed this way. You also need to account for downtime.

What is the best way of using redis as a buffer for the tweets?

You can use a redis list as a queue: your producer keeps pushing onto the head, and your consumer pops from the tail and writes to your db.

http://redis.io/commands#list
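A minimal sketch of that queue in node.js, assuming a connected node-redis v4 client is passed in as `client` (its `lPush` and `rPop` methods map to the LPUSH and RPOP commands). The key name `tweets:buffer` and the function names are made up for illustration.

```javascript
// Hypothetical key name for the buffer list; any string works.
const QUEUE_KEY = 'tweets:buffer';

// Producer side: push each raw tweet onto the head of the list (LPUSH).
// Resolves to the list length after the push.
async function enqueueTweet(client, tweet) {
  return client.lPush(QUEUE_KEY, JSON.stringify(tweet));
}

// Consumer side: pop the oldest tweet from the tail of the list (RPOP).
// Resolves to null when the buffer is empty.
async function dequeueTweet(client) {
  const raw = await client.rPop(QUEUE_KEY);
  return raw === null ? null : JSON.parse(raw);
}

// Assumed wiring, not part of the sketch above:
//   const { createClient } = require('redis');
//   const client = createClient();
//   await client.connect();
```

Because the producer pushes on one end and the consumer pops from the other, tweets come out in the order they arrived.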

You can use the solution from "Redis Pop list item By numbers of items", since you have a similar requirement (the producer is heavy, and the consumer needs to drain the list faster than popping items one by one).
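One way to pop several items at once is to wrap LRANGE and LTRIM in a MULTI transaction, so the read and the removal happen atomically. The sketch below assumes a node-redis v4 client (`multi()`, `lRange`, `lTrim`, `exec`) and a producer that pushes with LPUSH, so the oldest items sit at the tail. The key name and `popBatch` are hypothetical.

```javascript
// Hypothetical key name for the buffer list; any string works.
const BUFFER_KEY = 'tweets:buffer';

// Atomically remove up to `count` of the oldest tweets from the tail.
// Resolves to the parsed tweets, oldest first (empty array if none).
async function popBatch(client, count) {
  const [rawItems] = await client
    .multi()
    .lRange(BUFFER_KEY, -count, -1)     // read the last `count` entries
    .lTrim(BUFFER_KEY, 0, -count - 1)   // keep everything before them
    .exec();
  // LRANGE returns head-to-tail order, so reverse to get oldest first.
  return rawItems.reverse().map((raw) => JSON.parse(raw));
}
```

The returned array can then go to rethinkdb in a single insert rather than one write per tweet. If that insert fails, the batch has already left redis, so the consumer would need to push it back or persist it elsewhere.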
