I love seeing that there's a cross-platform standard for TCP/IP sockets emerging for C++ in boost. And so far I've been able to find help for all topics I've run into. But now I'm stuck on an odd behavior. I'm developing using Xcode 7.3.1 on a late-2013 iMac.

I'm developing a simple web server for a special purpose. The code below is a pared down version that demonstrates the bad behavior:

#include <boost/asio.hpp>
#include <boost/bind.hpp>
#include <istream>
#include <string>

using namespace std;
using namespace boost;
using namespace boost::asio;
using namespace boost::asio::ip;

int main(int argc, const char * argv[]) {

    static asio::io_service ioService;
    static tcp::acceptor tcpAcceptor(ioService, tcp::endpoint(tcp::v4(), 2080));

    while (true) {

        // creates a socket
        tcp::socket* socket = new tcp::socket(ioService);

        // wait and listen
        tcpAcceptor.accept(*socket);

        asio::streambuf inBuffer;
        istream headerLineStream(&inBuffer);

        char buffer[1];
        asio::read(*socket, asio::buffer(buffer, 1));  // <--- Yuck!

        asio::write(*socket, asio::buffer((string) "HTTP/1.1 200 OK\r\n\r\nYup!"));

        socket->shutdown(asio::ip::tcp::socket::shutdown_both);
        socket->close();
        delete socket;

    }

    return 0;
}

When I access this service, under a certain set of conditions, the browser will choke for upwards of 20 seconds. If I pause the program running in debug mode, I can see that the asio::read() call is blocking. It's literally waiting for even a single character to appear from the browser. Why is this?

Let me clarify, because what I have to do to reproduce this on my machine is strange. Once I start the program (for debugging), I open the "page" from Chrome (as http://localhost:2080/). I can hit Refresh many times and it works just fine. But then I use Firefox (or Safari) and it hangs for maybe 20 seconds, after which the page shows up as expected. Now get this. If, during that delay in Firefox, I hit Refresh in Chrome, the Firefox page shows up immediately, too. In another experiment, I hit Refresh in Chrome (works fine) and then hit Refresh in both Firefox and Safari. Both of them hang. I hit Refresh in Chrome and all 3 show up immediately.

In a change to this experiment, as soon as I start this program, I hit Refresh in either Firefox or Safari and they work just fine. No matter how many times I refresh. And going back and forth between them. I'm literally holding down CMD-R to rapid-fire refresh these browsers. But as soon as I refresh Chrome on the same page and then try refreshing the other two browsers, they hang again.

Having done web programming since around 1993, I know the HTTP standard well. The most basic workflow is that the browser initiates a TCP connection. As soon as the web server accepts the connection, the client sends an HTTP header. Something like "GET / HTTP/1.1\r\n" for the root page ("/"), followed by header lines such as "Host: localhost:2080\r\n" and a final blank line ("\r\n"). The server typically reads the header lines and stops when it reaches the first blank line, which signals the end of the headers and the beginning of the uploaded content (e.g., POSTed form content), which the web application is free to consume or ignore. The server responds when it is ready with its own HTTP headers, starting typically with "HTTP/1.1 200 OK\r\n", followed by the actual page content (or binary file contents, etc).
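
The "read until the first blank line" step can be illustrated without any networking. The following is a minimal sketch using only the standard library; `ParsedRequest` and `split_request` are hypothetical names invented for this illustration (they are not part of Boost.Asio), and it assumes the raw bytes have already been collected into a string.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper type: the pieces of a raw HTTP request.
struct ParsedRequest {
    bool complete = false;     // true once the blank line has been seen
    std::string request_line;  // e.g. "GET / HTTP/1.1"
    std::string headers;       // header lines after the request line
    std::string body;          // bytes received after the blank line
};

// Splits raw bytes at the first "\r\n\r\n", which ends the header block.
ParsedRequest split_request(const std::string& raw) {
    ParsedRequest result;
    const std::string terminator = "\r\n\r\n";
    const std::size_t end = raw.find(terminator);
    if (end == std::string::npos) {
        return result;  // headers not finished yet; the caller keeps reading
    }
    // find() only succeeds if the whole terminator fits inside raw.
    assert(end + terminator.size() <= raw.size());

    result.complete = true;
    const std::string head = raw.substr(0, end);
    const std::size_t first_eol = head.find("\r\n");
    if (first_eol == std::string::npos) {
        result.request_line = head;  // request line only, no header lines
    } else {
        result.request_line = head.substr(0, first_eol);
        result.headers = head.substr(first_eol + 2);
    }
    result.body = raw.substr(end + terminator.size());
    return result;
}
```

A connection that never sends a single byte leaves `complete` false forever, which is the situation a blocking read cannot distinguish from a slow client.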

In my app, I'm actually using asio::read_until(*socket, inBuffer, "\r\n\r\n") to read the entire HTTP header. Since that was hanging, I thought maybe those other browsers were sending corrupt headers or something. Hence my trimming down of the sample to just reading a single character (should be the "G" in "GET /"). One single character. Nope.

As a side note, I know I'm doing this synchronously, but I really wanted a simple, linear demo to show this bad behavior. I'm assuming that's not what's causing this problem, but I know it's possible.

Any thoughts here? In my use case, this is sufferable, since the server does eventually respond, but I'd really rather understand and eliminate this bad behavior.

Jim Carnicelli
  • Not able to reproduce it on my mac – Arunmu Aug 17 '16 at 16:53
  • Studying more, I find that Chrome is making two connection requests and the other browsers make just one. The first comes with an HTTP header but the second does not. It appears this is a bug in Chrome. The following post relates: http://stackoverflow.com/questions/4761913/server-socket-receives-2-http-requests-when-i-send-from-chrome-and-receives-one – Jim Carnicelli Aug 17 '16 at 17:22
  • I did notice the two requests from Chrome, one for '/' and another for 'favicon.ico', and both requests appeared to be fine. Maybe you are using an old version of Chrome? – Arunmu Aug 17 '16 at 17:32
  • Yes, Chrome will separately try to fetch /favicon.ico, but that's unrelated to this. – Jim Carnicelli Aug 17 '16 at 20:12

1 Answer

It seems this results from a design quirk in Chrome. See this post:

server socket receives 2 http requests when I send from chrome and receives one when I send from firefox

I see what's happening now. Chrome makes 2 connection requests. The first is for the desired page and contains a proper HTTP request header. The second connection, once accepted, does not contain even a single byte of input data. So my attempt to read that first byte goes unrewarded. Fortunately, the read attempt times out. That's easy enough to recover from with a try/catch.
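
One way to avoid blocking indefinitely on a connection that never sends anything is to wait for readability with a deadline before reading. The sketch below deliberately uses plain POSIX `poll()` rather than Boost.Asio, to keep the example self-contained; it is a substitute technique, not the code from the question. `wait_readable` and `demo_idle_then_active` are hypothetical names, a socket pair stands in for the browser connection, and a POSIX system such as macOS or Linux is assumed.

```cpp
#include <cassert>
#include <poll.h>
#include <sys/socket.h>
#include <unistd.h>

// Hypothetical helper: returns true if `fd` has data (or a hang-up) to read
// within `timeout_ms` milliseconds, false on timeout or error.
bool wait_readable(int fd, int timeout_ms) {
    // A negative timeout would make poll() wait forever, which is exactly
    // the behavior this helper is meant to avoid.
    assert(timeout_ms >= 0);
    pollfd pfd{};
    pfd.fd = fd;
    pfd.events = POLLIN;
    const int ready = poll(&pfd, 1, timeout_ms);
    return ready > 0 && (pfd.revents & (POLLIN | POLLHUP)) != 0;
}

// Demo with a connected socket pair in place of a browser connection.
// Returns true if the idle and active cases behave as described.
bool demo_idle_then_active() {
    int fds[2];
    if (socketpair(AF_UNIX, SOCK_STREAM, 0, fds) != 0) {
        return false;
    }

    // Nothing has been sent yet: like the empty second connection.
    const bool idle_detected = !wait_readable(fds[0], 50);

    // The peer sends one byte: now a read would not block.
    const char g = 'G';
    const bool sent = write(fds[1], &g, 1) == 1;
    const bool active_detected = wait_readable(fds[0], 50);

    close(fds[0]);
    close(fds[1]);
    return idle_detected && sent && active_detected;
}
```

With a check like this, the server could close or set aside an idle connection after a short wait instead of stalling every other client behind it.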

This appears to be a greedy optimization to improve Chrome's performance. That is, it holds the next connection open until the browser needs something from the site, at which point it sends the HTTP request on that open socket. It then immediately opens a new connection, again in anticipation of a future request. Although I get how this speeds Chrome's experience up, this seems a dubious design because of the added burden it places on the server.

This is a good argument for opening a separate thread to handle each accepted socket. A thread could patiently hang out waiting for the never-forthcoming request while other threads handle other requests. To that end, I wrapped up everything after tcpAcceptor.accept(*socket); in a new thread so the loop can continue waiting for the next request.
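
The effect of one thread per accepted connection can be sketched with the standard library alone. In the illustration below, `FakeConnection` and `serve_thread_per_connection` are hypothetical names, and a sleep stands in for a client that takes a while to send its request; no real sockets or Boost.Asio calls are involved.

```cpp
#include <cassert>
#include <chrono>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical stand-in for an accepted connection. `idle_ms` models how
// long the client stays silent before its request arrives.
struct FakeConnection {
    int id;
    int idle_ms;
};

// Starts one worker thread per connection, in acceptance order, and returns
// the ids in the order the workers finished.
std::vector<int> serve_thread_per_connection(
        const std::vector<FakeConnection>& accepted) {
    std::vector<int> completion_order;
    std::mutex order_mutex;
    std::vector<std::thread> workers;

    for (const FakeConnection& conn : accepted) {
        assert(conn.idle_ms >= 0);
        workers.emplace_back([conn, &completion_order, &order_mutex] {
            // Stand-in for the blocking read on a quiet connection.
            std::this_thread::sleep_for(
                std::chrono::milliseconds(conn.idle_ms));
            std::lock_guard<std::mutex> lock(order_mutex);
            completion_order.push_back(conn.id);
        });
    }
    // Joined here so the demo can report results; a long-running server
    // would more likely keep accepting while the workers run.
    for (std::thread& worker : workers) {
        worker.join();
    }
    return completion_order;
}
```

If a connection that stays idle for 300 ms is accepted first and a prompt one second, the prompt one should still finish first, because the idle connection only ties up its own thread.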

Jim Carnicelli
  • The whole idea of ASIO is to avoid the necessity of creating new threads. You just should not read unconditionally. You also should not create a socket every time. And not using new anyway, it should just be a local variable. – Ilya Popov Aug 17 '16 at 18:18
  • And static is also not needed here. – Ilya Popov Aug 17 '16 at 18:19
  • The "static" is an oopsie here. Thanks. – Jim Carnicelli Aug 17 '16 at 20:18
  • As far as I can tell, even ASIO's asynchronous methods entail in-thread invocation of your callbacks, which means that one request will be handled at a time if you have only one thread. I haven't tested to confirm this yet, but I will. So if one request takes a long time and another comes in after the first, it will sit and wait until the first is done. Thus, I don't think ASIO obviates the need for multithreading. http://www.boost.org/doc/libs/1_41_0/doc/html/boost_asio/overview/core/threads.html – Jim Carnicelli Aug 17 '16 at 20:22
  • @IlyaPopov "The whole idea of ASIO is to avoid the necessity of creating new threads". Certainly not. It kind of does provide you with an abstraction (bare minimal) on top of threads. But you gotta use threads when you have to (as per design), and that will certainly be the case when you have to handle different types of asynchronous tasks like timer events, I/O events, blocking events etc. – Arunmu Aug 17 '16 at 20:39
  • @Arunmu With ASIO, you don't *necessarily* need to create a thread per connection. Of course, there are cases when you do want to create threads, but this will be motivated by other things. – Ilya Popov Aug 17 '16 at 20:45
  • @IlyaPopov Yeah, the requirement of thread per connection is a different story. My comment was strictly based on your first sentence of your first comment, the context of which was not clear until I reread the answer :) – Arunmu Aug 17 '16 at 20:49