280

I have read some posts about this topic and the answers are comet, reverse ajax, http streaming, server push, etc.

How does incoming mail notification on Gmail work?

How is GMail Chat able to make AJAX requests without client interaction?

I would like to know if there are any code references that I can follow to write a very simple example. Many posts or websites just talk about the technology. It is hard to find complete sample code. Also, it seems many methods can be used to implement Comet, e.g. a hidden IFrame or XMLHttpRequest. In my opinion, using XMLHttpRequest is the better choice. What do you think of the pros and cons of the different methods? Which one does Gmail use?

I know it has to be done on both the server side and the client side. Is there any PHP and JavaScript sample code?

Billy

5 Answers

441

The way Facebook does this is pretty interesting.

A common method of doing such notifications is to poll a script on the server (using AJAX) on a given interval (perhaps every few seconds), to check if something has happened. However, this can be pretty network intensive, and you often make pointless requests, because nothing has happened.

Facebook uses the Comet approach instead: rather than polling on an interval, the client issues a new poll as soon as the previous one completes. However, each request to the script on the server has an extremely long timeout, and the server only responds once something has happened. You can see this if you open Firebug's Console tab while on Facebook: requests to a script can take minutes. It is quite ingenious really, since it immediately cuts down both the number of requests and how often you have to send them. You effectively now have an event framework that allows the server to 'fire' events.
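To make the saving concrete, here is a rough back-of-the-envelope sketch in JavaScript. It is not anything Facebook publishes: the function names, the one-hour window, the 60-second timeout and the event times are all made-up assumptions, and network latency is ignored.

```javascript
// Requests made by fixed-interval polling over a time window.
function intervalPollingRequests(durationSec, intervalSec) {
  return Math.floor(durationSec / intervalSec);
}

// Requests made by long polling: exactly one request is pending at any time,
// and a new one starts when the previous one is answered (an event arrived)
// or hits its timeout. Event times are seconds from the start of the window.
function longPollingRequests(durationSec, eventTimesSec, timeoutSec) {
  const events = [...eventTimesSec].sort((a, b) => a - b);
  let requests = 0;
  let start = 0; // start time of the currently pending request
  let i = 0;     // index of the next undelivered event

  while (start < durationSec) {
    requests += 1;
    // Events at or before `start` were delivered by an earlier response.
    while (i < events.length && events[i] <= start) i += 1;
    const nextEvent = i < events.length ? events[i] : Infinity;
    // The request ends at the next event or at its timeout, whichever is first.
    start = Math.min(nextEvent, start + timeoutSec);
  }
  return requests;
}

// One hour, polling every 5 seconds:
console.log(intervalPollingRequests(3600, 5)); // 720
// One hour, three events, 60-second long-poll timeout:
console.log(longPollingRequests(3600, [600, 1800, 1810], 60)); // 61
```

With only three events in the hour, almost all of the 720 interval polls come back empty, while most of the long-poll requests are simply timeouts being renewed.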

Behind this, in terms of the actual content returned from those polls, it's a JSON response, with what appears to be a list of events, and info about them. It's minified though, so is a bit hard to read.

In terms of the actual technology, AJAX is the way to go here, because you can control request timeouts and many other things. I'd recommend (Stack Overflow cliché here) using jQuery to do the AJAX; it'll take a lot of the cross-browser compatibility problems away. In terms of PHP, you could simply poll an event log database table in your PHP script, and only return to the client when something happens. There are, I expect, many ways of implementing this.

Implementing:

Server Side:

There appear to be a few implementations of comet libraries in PHP, but to be honest, it really is very simple, something perhaps like the following pseudocode:

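set_time_limit(0); // lift PHP's max execution time so the long poll isn't cut short

// has_event_happened() and get_events() are placeholders for your own checks
while (!has_event_happened()) {
   sleep(5); // wait before checking again
}

header('Content-Type: application/json');
echo json_encode(get_events());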
  • The has_event_happened function would just check if anything had happened in an events table or something, and then the get_events function would return a list of the new rows in the table? Depends on the context of the problem really.

  • Don't forget to change your PHP max execution time, otherwise it will timeout early!
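For comparison, here is a minimal sketch of the same idea on an event-driven server, where a pending request is parked until something happens instead of sleeping in a loop. It swaps PHP for Node.js, and `EventHub`, `waitForEvent`, `createLongPollServer` and the 30-second timeout are hypothetical names and values chosen for illustration, not part of any library. The `{ events: [...] }` response shape is an assumption as well.

```javascript
const http = require('http');

// Hypothetical in-memory hub that parks long-poll requests until an event
// is published or the wait times out.
class EventHub {
  constructor() {
    this.waiters = new Set();
  }

  // Resolves with [event] as soon as one is published,
  // or with [] once timeoutMs has elapsed.
  waitForEvent(timeoutMs) {
    return new Promise((resolve) => {
      const waiter = (event) => {
        clearTimeout(timer);
        this.waiters.delete(waiter);
        resolve(event === undefined ? [] : [event]);
      };
      // On timeout the waiter is called without an event.
      const timer = setTimeout(waiter, timeoutMs);
      this.waiters.add(waiter);
    });
  }

  // Wakes every parked request with the new event.
  publish(event) {
    for (const waiter of [...this.waiters]) waiter(event);
  }
}

// Builds (but does not start) an HTTP server that answers each request
// only when an event arrives or the timeout elapses.
function createLongPollServer(hub, timeoutMs = 30000) {
  return http.createServer(async (req, res) => {
    const events = await hub.waitForEvent(timeoutMs);
    res.writeHead(200, { 'Content-Type': 'application/json' });
    res.end(JSON.stringify({ events }));
  });
}
```

You would start it with something like `createLongPollServer(hub).listen(8080)` and call `hub.publish(...)` from wherever events originate. Note that this sketch drops events published while no request is parked; a real implementation would need to queue them per client.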

Client Side:

There is a jQuery plugin for doing Comet interaction that you could take a look at.

That said, the plugin seems to add a fair bit of complexity; it really is very simple on the client. With jQuery, perhaps something like:

function doPoll() {
   $.get("events.php", {}, function(result) {
      $.each(result.events, function(index, event) { // iterate over the events; jQuery passes (index, value)
          //do something with your event
      });
      doPoll(); 
      //this effectively causes the poll to run again as
      //soon as the response comes back
   }, 'json'); 
}

$(document).ready(function() {
    $.ajaxSetup({
       timeout: 1000*60//set a global AJAX timeout of a minute
    });
    doPoll(); // do the first poll
});

The whole thing depends a lot on how your existing architecture is put together.

Cœur
Alistair Evans
  • 2
    It's a very nice and detailed explanation. Thank you. Do you have any sample code for one of the many ways to implement that? – Billy Jul 06 '09 at 15:05
  • Take a look at the edits just added, some sample source code, may help. – Alistair Evans Jul 06 '09 at 16:22
  • Good explanation. But I don't believe PHP is the right choice on the server side. A framework and language that support asynchrony should be used for good scalability. – Morgan Cheng Mar 19 '10 at 08:52
  • 45
    I think labelling PHP as a language/platform that does not scale well is not necessarily true. It can be used to develop extremely large scale systems. Look at facebook. If the developer does it right, then it will scale, if not, then it won't. Using a specific web platform isn't a guarantee of scalability. Oh, and also, the question did ask for PHP. – Alistair Evans Mar 19 '10 at 10:30
  • 5
    @Kazar: "Facebook uses PHP" is a bit misleading -- last i heard, they developed HipHop for the express purpose of converting PHP to C++, as PHP wasn't performing well enough. – cHao Oct 16 '11 at 07:14
  • 14
    @cHao: It's a fair point; however, this answer was written in 2009, before Facebook started using HipHop. At the time Facebook was still a very large-scale system using PHP on its own. – Alistair Evans Oct 16 '11 at 09:35
  • @Kazar: Except when they announced HipHop around Feb 2010 (read: before your "Look at facebook" comment above, which is what i was actually replying to), [they'd already been working on it for two years](http://developers.facebook.com/blog/post/358/). Even back then they realized PHP on its own wouldn't cut it. – cHao Oct 16 '11 at 09:46
  • This technique is called "Long Polling". Also, on the server side I have found the Jetty servlet server to be really stable for this kind of polling, since it supports Java NIO, which supports non-blocking IO. Tomcat seems to be a good solution as well. There's also a book about Comet called "comet and reverse ajax the next-generation ajax 2.0" – Wifi Cordon Dec 11 '12 at 19:28
  • I'd also checkout www.meteor.com – rickyduck Apr 23 '13 at 15:55
  • 7
    So the technique is to keep a connection constantly open which will keep a server in a constant stress. A typical amount of concurrent connections for an average web-server is about 200, but the number of Facebook users which are online simultaneously is a way bigger. How do they do that? – Paul Apr 07 '14 at 12:04
  • 2
    @Paul - The first thing to consider is that actually keeping a connection open doesn't necessarily keep a server under constant stress. I imagine the Facebook servers park the pending request from the client somewhere, then forget about it until a thread wakes up, grabs the pending request and completes it. So, high memory consumption, but not necessarily high CPU. The only real processing overhead would be in any TCP keep-alives needed for such a connection, which would be handled at a network level. – Alistair Evans Apr 07 '14 at 13:54
  • 1
    The second thing to consider is that facebook has a lot of servers to spread this across; and I mean a lot, an estimate I just googled put the count at 180,000. – Alistair Evans Apr 07 '14 at 13:55
  • Any knowledge on how this approach could be implemented in Django? Since it handles a predefined number of requests at a time, this approach may block the server from sending responses to other services. Any help or thoughts really appreciated – rordulu Jul 02 '14 at 14:32
  • 1
    Not directly related to the question, but I notice that your client-side example code uses recursion. With each new poll recursing deeper, will Javascript freak out at some point? Say, if someone stays logged in for a very long time and receives thousands of notifications? – alexw Apr 26 '15 at 15:22
  • 2
    @alexw - The reason it won't freak out is that it isn't actually recursion (in the normal sense) - the subsequent calls to doPoll are in a callback function that will be run at some point later after the server replies with a response (assuming the ajax settings are not set to synchronous, which they should never be). Another way of describing it is that if you put a breakpoint on the doPoll function in the callback, you should never see another doPoll entry above it. – Alistair Evans Apr 27 '15 at 10:12
  • it isn't scary recursion because the depth of recursion never exceeds 1. – Raja Nadar Jan 12 '17 at 23:55
  • But I'm guessing the server will now have to keep polling for the event, right? The code `while(!event()) {sleep(5);}` is still going to either sleep for a long time, defeating the whole purpose of "instant" notification, or keep looping a lot, eating up the processing power of the server. – arunwithasmile Mar 23 '22 at 16:38
48

Update

As I continue to receive upvotes on this, I think it is reasonable to point out that this answer is 4 years old. The web has grown at a really fast pace, so please keep that in mind when reading this answer.


I had the same issue recently and researched the subject.

The solution given is called long polling. To use it correctly, make sure your AJAX request has a "large" timeout, and always issue the next request as soon as the current one ends (whether by timeout, error, or success).

Long Polling - Client

Here, to keep code short, I will use jQuery:

function pollTask() { 

    $.ajax({

        url: '/api/Polling',
        async: true,            // by default, it's async, but...
        dataType: 'json',       // or the dataType you are working with
        timeout: 10000,          // IMPORTANT! this is a 10 seconds timeout
        cache: false

    }).done(function (eventList) {  

       // Handle your data here
       var data;
       for (var eventName in eventList) {

            data = eventList[eventName];
            dispatcher.handle(eventName, data); // handle the `eventName` with `data`

       }

    }).always(pollTask);

}

It is important to remember that (from jQuery docs):

In jQuery 1.4.x and below, the XMLHttpRequest object will be in an invalid state if the request times out; accessing any object members may throw an exception. In Firefox 3.0+ only, script and JSONP requests cannot be cancelled by a timeout; the script will run even if it arrives after the timeout period.

Long Polling - Server

It is not in any specific language, but it would be something like this:

function handleRequest () {  

     while (!anythingHappened() && !hasTimedOut()) { sleep(2); }

     return events();

} 

Here, hasTimedOut makes sure your code does not wait forever, and anythingHappened checks whether any event has happened. The sleep releases your thread to do other work while nothing happens. events returns a dictionary of events (or any other data structure you prefer) in JSON format (or any other format you prefer).

It surely solves the problem, but if you are concerned about scalability and performance, as I was when researching, you might consider another solution I found.
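As a rough illustration of the loop above, including one way hasTimedOut could be implemented, here is a sketch in JavaScript (Node.js). The deadline-based check, the option names `timeoutMs` and `sleepMs`, their default values, and returning an empty object on timeout are all assumptions made for the example; `anythingHappened` and `events` are supplied by the caller.

```javascript
// Promise-based sleep, so waiting yields to other work instead of blocking.
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// anythingHappened: () => boolean, events: () => object (both caller-supplied).
async function handleRequest(anythingHappened, events, { timeoutMs = 8000, sleepMs = 2000 } = {}) {
  // Record the deadline before entering the loop; hasTimedOut compares against it.
  const deadline = Date.now() + timeoutMs;
  const hasTimedOut = () => Date.now() >= deadline;

  // Keep waiting only while nothing has happened AND the deadline has not passed.
  while (!anythingHappened() && !hasTimedOut()) {
    await sleep(sleepMs);
  }

  // On timeout, answer with an empty result so the client simply polls again.
  return anythingHappened() ? events() : {};
}
```

The default `timeoutMs` is deliberately shorter than the 10-second client timeout used above, so that the server answers with an empty result before the client gives up on the request.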

Solution

Use sockets!

On the client side, to avoid any compatibility issues, use socket.io. It tries to use sockets directly, and falls back to other solutions when sockets are not available.

On the server side, create a server using NodeJS (example here). The client subscribes to a channel (observer) created by the server. Whenever a notification has to be sent, it is published on this channel and the subscriber (client) gets notified.
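The channel (observer) idea itself can be sketched without any network transport. The `Channel` class below is a hypothetical in-process illustration of subscribe and publish; it is not socket.io's API, which additionally handles the connection between browser and server.

```javascript
// Minimal publish/subscribe channel: subscribers register a callback and
// every published notification is delivered to all current subscribers.
class Channel {
  constructor() {
    this.subscribers = new Set();
  }

  // Registers a callback and returns a function that unsubscribes it.
  subscribe(callback) {
    this.subscribers.add(callback);
    return () => this.subscribers.delete(callback);
  }

  // Delivers the notification to every current subscriber.
  publish(notification) {
    for (const callback of this.subscribers) callback(notification);
  }
}

const notifications = new Channel();
const unsubscribe = notifications.subscribe((n) => console.log('got notification:', n.text));

notifications.publish({ text: 'You have new mail' }); // logs: got notification: You have new mail
unsubscribe();
notifications.publish({ text: 'nobody is listening' }); // no output
```

In a socket-based setup, each connected client would play the role of a subscriber, and the server would publish to the channel whenever a notification is created.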

If you don't like this solution, try APE (Ajax Push Engine).

Hope I helped.

Sachin Joseph
Walter Macambira
  • do you think 1 is a replacement for the other or is there a need for both technologies on the same project? – t q May 07 '13 at 15:39
  • If you mean APE and NodeJS, you can choose one of them. if you mean periodic AJAX requests and the one i suggested, my solution may fallback to the ajax one when lacks socket support (refer to socket.io docs). In both cases, you need only one solution. – Walter Macambira May 08 '13 at 01:09
  • Hey Walter, I would like to use your suggestion on one of my sites. Do you know where I can get a Sockets server? Thanks! – Progo Apr 28 '14 at 01:11
  • 1
    You can implement it. Node makes it really simple. – Walter Macambira Apr 28 '14 at 15:37
  • How to detect `hasTimedOut()` ? – Mobasher Fasihy Jul 06 '15 at 07:22
  • @MobasherFasihy as you have a thread sleep right there, just start a timer before entering the loop, and check whether your timeout value has elapsed. – Walter Macambira Jul 06 '15 at 12:41
  • @Walter Socket solution seems to make more sense. Would have been great if a in-action example was given. For example what happens when a person opens(initiates) a chatbox with a friend? How facebook tune on to this specific conversation and pushes messages to both end? (just a guess: I can only imagine that the application program opens a socket and bind both client addresses and then just keep listening and writing whenever message is written in the box) – edam Jul 23 '17 at 19:02
  • @edam Like I said in the edit, this is an old answer and probably not the most appropriate anymore. – Walter Macambira Jul 24 '17 at 12:54
  • So, you mean Facebook is using long polling now? I saw this article, i wonder is this the current implementation by FB. https://medium.com/javarevisited/building-scalable-facebook-like-notification-using-server-sent-event-and-redis-9d0944dee618 – Neo Jan 21 '21 at 09:32
  • I have no idea how they implement it in 2021. My answer is 8 years old. The OP only needed to understand the mechanics of "real-time" updates, and 8 years ago it was a reasonable solution. Today, for one-direction communication, I would use SSE. – Walter Macambira Jan 21 '21 at 11:54
19

According to a slideshow about Facebook's messaging system, Facebook uses Comet technology to "push" messages to web browsers. Facebook's Comet server is built on the open-source Erlang web server mochiweb.

In the picture below, the phrase "channel clusters" means "comet servers".

[Figure: System overview]

Many other big websites build their own Comet server, because every company's needs differ. But building your own Comet server on top of an open-source one is a good approach.

You can try icomet, a C1000K C++ Comet server built with libevent. icomet also provides a JavaScript library that is easy to use; it is as simple as:

var comet = new iComet({
    sign_url: 'http://' + app_host + '/sign?obj=' + obj,
    sub_url: 'http://' + icomet_host + '/sub',
    callback: function(msg){
        // on server push
        alert(msg.content);
    }
});

icomet supports a wide range of browsers and OSes, including Safari (iOS, Mac), IE (Windows), Firefox, Chrome, etc.

k0pernikus
ideawu
  • This image describes the scenario very well. Would have been great if a in-action example was given. For example what happens when a person opens(initiates) a chatbox with a friend? How facebook tune on to this specific conversation and pushes messages to both end? (just a guess: I can only imagine that the application program opens a socket and bind both client addresses and then just keep listening and writing whenever message is written in the box) – edam Jul 23 '17 at 19:01
5

Facebook uses MQTT instead of HTTP. Push is better than polling: over HTTP the client has to poll the server continuously, whereas with MQTT the server pushes messages to clients.

Comparison between MQTT and HTTP: http://www.youtube.com/watch?v=-KNPXPmx88E

Note: my answer best fits mobile devices.

abhi
  • 3
    Additionally, google uses GCM service for android, it can be used by developers for implementing push message service. http://developer.android.com/google/gcm/index.html Please accept if you find the answer useful. – abhi Jul 20 '13 at 15:53
5

One important issue with long polling is error handling. There are two types of errors:

  1. The request might time out, in which case the client should reestablish the connection immediately. This is a normal event in long polling when no messages have arrived.

  2. A network error or an execution error. This is an actual error, which the client should accept gracefully while it waits for the server to come back online.

The main issue is that if your error handler also reestablishes the connection immediately for a type 2 error, the clients would DoS the server.

Both answers with code samples miss this.

function longPoll() { 
        var shouldDelay = false;

        $.ajax({
            url: 'poll.php',
            async: true,            // by default, it's async, but...
            dataType: 'json',       // or the dataType you are working with
            timeout: 10000,          // IMPORTANT! this is a 10 seconds timeout
            cache: false

        }).done(function (data, textStatus, jqXHR) {
             // do something with data...

        }).fail(function (jqXHR, textStatus, errorThrown ) {
            shouldDelay = textStatus !== "timeout";

        }).always(function() {
            // On a network or server error, throttle; otherwise we DoS our own server. A timeout is normal operation, so poll again immediately.
            var delay = shouldDelay ? 10000: 0;
            window.setTimeout(longPoll, delay);
        });
}
longPoll(); //fire first handler
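If a fixed 10-second delay feels too blunt, one possible refinement is to grow the delay with each consecutive failure. The helper below is only a sketch of that idea: `nextPollDelay`, `baseMs`, `maxMs` and the doubling schedule are assumptions for illustration, and `textStatus` is the status string jQuery passes to its callbacks.

```javascript
// Returns how long to wait (in ms) before the next long-poll request.
// textStatus: jQuery status string such as "success", "timeout" or "error".
// consecutiveFailures: number of real errors in a row, including this one.
function nextPollDelay(textStatus, consecutiveFailures, { baseMs = 1000, maxMs = 30000 } = {}) {
  // Success and timeout are normal long-polling outcomes: poll again at once.
  if (textStatus === 'success' || textStatus === 'timeout') return 0;

  // Real errors: double the delay for each consecutive failure, up to maxMs.
  const exponent = Math.max(consecutiveFailures - 1, 0);
  return Math.min(baseMs * 2 ** exponent, maxMs);
}

console.log(nextPollDelay('timeout', 3)); // 0
console.log(nextPollDelay('error', 1));   // 1000
console.log(nextPollDelay('error', 3));   // 4000
console.log(nextPollDelay('error', 10));  // 30000
```

To use it, you would keep a failure counter next to `shouldDelay`, reset it in the done handler, and pass the result to `window.setTimeout` in the always handler.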
Ronenz