
I am playing around with building a chat application using PHP and CodeIgniter.

For this, I am implementing a cache 'buffer' with memcached to hold the most recent chat messages in memory, reducing load on the database. What I want to do is this:

  1. When a message arrives, I save it in memcached using the current minute (YYYY-MM-DD-HH-MM) as the key. No database I/O is involved. The idea is that all messages from the same minute are collected under the same key.
  2. Users receive new chat messages also fetched from memcached (for now I'm using long-polling, but this will move to WebSockets under Node.js for obvious performance reasons). Again, no database I/O involved.
  3. An automated server script (cronjob) will run once every 5 minutes, collecting the memcached data from the last 5 minutes and inserting the messages into the database.
  4. The memcached objects are set to expire after 6 minutes, so we never need to keep more than 6 minutes' worth of message data in memory.

This makes for a total of one database write operation every 5 minutes and zero database read operations.
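The key scheme in steps 1 and 3 can be sketched roughly as follows. This is only an illustration under assumptions: a plain `Map` stands in for memcached, and the names `minuteKey`, `storeMessage`, `keysForLastMinutes` and `collectForFlush` are hypothetical, not part of any memcached client. Keys are built in UTC, and the 6-minute expiry from step 4 is not modelled.

```javascript
// Stand-in for memcached: a plain Map (assumption, for illustration only).
const cache = new Map();

// Build the per-minute key in UTC, in the form YYYY-MM-DD-HH-MM.
function minuteKey(date) {
  const pad = (n) => String(n).padStart(2, '0');
  return [
    date.getUTCFullYear(),
    pad(date.getUTCMonth() + 1),
    pad(date.getUTCDate()),
    pad(date.getUTCHours()),
    pad(date.getUTCMinutes()),
  ].join('-');
}

// Step 1: collect a message under the key for the minute it arrived in.
function storeMessage(message, now = new Date()) {
  const key = minuteKey(now);
  const bucket = cache.get(key) || [];
  bucket.push(message);
  cache.set(key, bucket);
}

// Step 3: the keys a cron job would read for the previous `minutes`
// complete minutes (the minute still in progress is left out).
function keysForLastMinutes(now, minutes) {
  const keys = [];
  for (let i = minutes; i >= 1; i--) {
    keys.push(minuteKey(new Date(now.getTime() - i * 60000)));
  }
  return keys;
}

// Gather every buffered message from those keys, oldest minute first.
function collectForFlush(now, minutes = 5) {
  return keysForLastMinutes(now, minutes).flatMap((k) => cache.get(k) || []);
}
```

With a real memcached server, the read-modify-write in `storeMessage` would not be atomic across concurrent requests, so this sketch only shows the key layout, not a safe write path.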

Does this sound feasible? Is there a better (maybe even built-in?) way to use memcached for this purpose?


Update: I have been experimenting a little, and I have an idea for a shortcut (read: hack). I can 'buffer' the messages temporarily in the Node.js server script until I'm ready to store them. A JavaScript object/array of messages in the Node.js server is basically a memory cache, kind of.

So: Every N messages/seconds, I can pass the buffered messages (the contents of the JS array) to my database, using whatever method I want, since it won't be called very often.
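A minimal sketch of that "every N messages or seconds" buffer might look like this. The class name `MessageBuffer`, its options, and the `flush` callback are all hypothetical; `flush` stands in for whatever database write is eventually used.

```javascript
class MessageBuffer {
  // `flush` is a hypothetical callback that receives one batch (an array)
  // and writes it to the database by whatever method is chosen.
  constructor(flush, { maxMessages = 100, maxAgeMs = 5000 } = {}) {
    this.flush = flush;
    this.maxMessages = maxMessages;
    this.maxAgeMs = maxAgeMs;
    this.messages = [];
    this.timer = null;
  }

  add(message) {
    this.messages.push(message);
    if (this.messages.length >= this.maxMessages) {
      // Size threshold reached: write the batch out right away.
      this.drain();
    } else if (this.timer === null) {
      // First message of a new batch: start the age timer.
      this.timer = setTimeout(() => this.drain(), this.maxAgeMs);
    }
  }

  drain() {
    if (this.timer !== null) {
      clearTimeout(this.timer);
      this.timer = null;
    }
    if (this.messages.length === 0) return;
    // Swap in a fresh array so messages arriving during the write are kept.
    const batch = this.messages;
    this.messages = [];
    this.flush(batch);
  }
}
```

Usage would be along the lines of `const buffer = new MessageBuffer(saveBatch, { maxMessages: 200, maxAgeMs: 10000 })` followed by `buffer.add(message)` for each incoming message, where `saveBatch` is again an assumed function. Anything still in the buffer is lost if the process exits before `drain()` runs.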

However, I'm worried this might cripple the Node.js server process, since it probably won't enjoy carrying around that 200 KB array.

Any thoughts on this strategy? Is it completely crazy?

Jens Roland
  • Technologies to look into: Comet and Erlang/Jabber (Facebook uses these), or HTML5 WebSockets (with a Flash socket fallback; search for these) – Kieran Allen Jul 14 '11 at 19:52

2 Answers


Have you looked into HTML5 socket connections? With a socket server, you do not need to store anything. The server receives a message from one subscriber and immediately sends it back out to the correct subscribers. I have not done this myself using HTML5, but I know the functionality now exists. I have done this before using Flash, which also supports socket connections.
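The relay idea can be illustrated without any real sockets. In this sketch, assumed for illustration only, each subscriber is represented by a hypothetical `send` callback that would wrap a socket write in a real server; `ChatRoom`, `subscribe` and `publish` are made-up names, not a WebSocket API.

```javascript
class ChatRoom {
  constructor() {
    // Each entry is a `send(message)` callback for one connected client.
    this.subscribers = new Set();
  }

  // Register a client; returns a function that removes it again.
  subscribe(send) {
    this.subscribers.add(send);
    return () => this.subscribers.delete(send);
  }

  // Relay a message to everyone except the sender. Nothing is stored.
  publish(sender, message) {
    for (const send of this.subscribers) {
      if (send !== sender) send(message);
    }
  }
}
```

Because nothing is kept after `publish` returns, clients that connect later see no history; persisting messages would still need a separate step.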

dqhendricks
  • WebSockets are the goal, sure, but they are not available in most browsers yet, so for now I'll be using Socket.IO (which uses WebSockets when available but has several fallback mechanisms) – Jens Roland Jun 14 '11 at 21:31

Why not use INSERT DELAYED? It offers almost the same functionality you are trying to achieve, without the need for memcached.

Anyway, your solution looks good too.

dynamic
  • Interesting idea. I don't really like the idea of it, since it seems like force feeding the SQL server hoping it will remember to chew properly (scaled to 100 inserts per second), but it might actually work – Jens Roland Jun 14 '11 at 21:41
  • @Jens: that's not a hope. INSERT DELAYED is made specifically for this case. – dynamic Jun 14 '11 at 21:42
  • But if my node.js instance receives 100 messages/sec and it needs to issue an INSERT DELAYED statement to the SQL server for each one, won't it have to establish 100 connections to the database server? Unless I can do connection pooling on top of that using node.dbslayer.js. – Jens Roland Jun 14 '11 at 22:10
  • @Jens: well, connections are inevitable, but you need to connect to the memcached server too. – dynamic Jun 14 '11 at 22:11
  • True, but if I'm not mistaken, my Node.js instance can keep an open connection to the memcached server/process. – Jens Roland Jun 15 '11 at 18:07
  • @Jens: can it keep one open for MySQL too? – dynamic Jun 15 '11 at 18:08
  • Node and MySQL are not best friends like that.. at least not yet. The DB libraries for Node aren't fully mature yet, so we have to use whatever we can find, and node.dbslayer.js is apparently the best available – Jens Roland Jun 15 '11 at 18:22