I am using HttpURLConnection along the lines of the following:

String strURL = "https://example.herokuapp.com";
Bitmap bmImage = null;
HttpURLConnection connection = null;
InputStream in = null;
showMessage(context.getString(R.string.message_preparing));
try {
    int timeoutMS = 15000;
    URL url = new URL(strURL);
    connection = (HttpURLConnection) url.openConnection();
    connection.setDoInput(true);
    connection.setConnectTimeout(timeoutMS);
    connection.setReadTimeout(timeoutMS);
    connection.connect();
    in = connection.getInputStream();
    BitmapFactory.Options options = new BitmapFactory.Options();
    bmImage = BitmapFactory.decodeStream(in, null, options);
} catch (Exception e) {
    e.printStackTrace();
} finally {
    // Close the stream first, then release the connection.
    if (in != null) {
        try {
            in.close();
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
    if (connection != null)
        connection.disconnect();
}

return bmImage;

This works just fine, with the url defined by strURL returning a bmp image, and this being decoded ready for use by the above code.

But for one user in particular, although the code works fine to fetch the bmp image, at the server (a node.js server at heroku) it is apparent that a CONNECT request is also being sent by their device. That request is rejected with a 503 response automatically, so it's not a problem as such, and the bmp is still sent to their device, but I'd like to know why those CONNECT requests are being sent at all, and how to stop them. Surely there should be nothing but GET requests?

I've tried this solution to what appears to be a similar problem, but it makes no difference for me.

Note that strURL is to an https server, and I'm using HttpURLConnection (not Https) -- not sure if there is any significance in that.

I'm also not 100% sure the CONNECT requests derive from the above calls, but they certainly happen around the same time as a GET request that delivers the bmp. Maybe it could be generated by the OS somehow, outside of my code? Not sure.

In case it helps, an example log message from heroku, in response to one of the CONNECT requests, is as follows:

Oct 27 14:14:25 example heroku/router: at=error code=H13 desc="Connection closed without response" method=CONNECT path="example.herokuapp.com:443" host=example.herokuapp.com request_id=353e623x-dec4-42x5-bcfb-452add02ecef fwd="111.22.333.4" dyno=web.1 connect=0ms service=1ms status=503 bytes=0

EDIT: it may also be of relevance that the device concerned actually makes two independent GET requests within a short time of each other (completely separate and legitimate requests), but there is only ever a single CONNECT request apparent (around the same time as the pair of GET requests). So it's not as if there is a CONNECT for each GET.

drmrbrewer
  • This is what an intentional use of a proxy looks like: http://stackoverflow.com/questions/15927079/how-to-use-httpsurlconnection-through-proxy-by-setproperty – Alex Nauda Oct 31 '15 at 06:51
  • Is there any way in which a use of a proxy can be attempted when my code doesn't have any proxy stuff in it? – drmrbrewer Oct 31 '15 at 14:26
  • Not that I've ever heard of. I agree, it doesn't make sense. And since it's limited to a small subset of clients, and only observed in the wild, it's pretty darn mysterious. I'd suspect a version-specific behavior or bug. Do the nearby-in-time requests tell you anything about the user agent? Or could you add code to gather information about the OS and hardware, and pass that in a custom header on the GET request? – Alex Nauda Oct 31 '15 at 14:41
  • Yep, already have that info: samsung, m0xx, GT-I9300, Android 4.3, launcher com.hola.launcher. Leaving aside the launcher, there are loads of other users with exactly the same combination, without any strange CONNECTs, and also loads of users with that launcher, again without any CONNECTs. – drmrbrewer Oct 31 '15 at 20:19
  • IMO, you can remove the line "connection.connect();" since it will be called inside "getInputStream". – BNK Nov 01 '15 at 04:49
  • Long shot, but an additional variable to consider is whether the connection is being made from a restricted network that may force outbound connections through a proxy. – bimsapi Nov 02 '15 at 18:08
  • Interesting thought. My knowledge of mobile networks is somewhat limited. My understanding is that use of a proxy is just communicating with your intended node via a designated intermediary (proxy). So if the restricted network is forcing connections through a proxy, wouldn't that just result in a GET request to my node app from the proxy, rather than direct from the device? How would the use of a proxy result in the sending of a CONNECT request to my server? – drmrbrewer Nov 03 '15 at 08:02
  • This is the primary check for proxy servers (SSL tunnelling). There is a handshake between the client and the proxy to establish the connection between the client and the remote server through the proxy. In order to make this extension backward compatible, the handshake must be in the same format as HTTP/1.x requests (CONNECT), so that proxies without support for this feature can still determine the request as impossible for them to service, and give proper error responses (rather than get hung on the connection). See section [3.1](http://curl.haxx.se/rfc/draft-luotonen-web-proxy-tunneling-01.txt) – JavaGhost Nov 05 '15 at 18:29

1 Answer

The CONNECT method can preface a request to an HTTP server (either a proxy server or an origin server), and it basically means:

"By the way, old chap, you wouldn't mind relaying this stuff I say 'verbatim' to the host/port I happen to mention, would you? No need to actually pay attention to what I'm saying, really."

Usually this would be an instruction to a proxy to 'get out of the way' and let the requestor (which could be the user-agent OR another proxy) talk directly to the upstream server.

It's a nice facility to have if there is an otherwise-uncooperative (perhaps outdated) proxy between you and an origin server. It's also handy if you're a hacker and would like a mis-configured origin server to blithely facilitate your entry into the internal network.
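To make the tunnelling request concrete, here is a minimal Java sketch of what a single CONNECT exchange looks like on the wire. It only builds the request text and interprets a status line; it opens no sockets. The class and method names (`ConnectSketch`, `buildConnectRequest`, `tunnelEstablished`) are made up for illustration and are not part of any library.

```java
final class ConnectSketch {

    // Builds the text a client would send to ask for a tunnel to host:port.
    // The request target of CONNECT is the "host:port" authority, not a path.
    static String buildConnectRequest(String host, int port) {
        String authority = host + ":" + port;
        return "CONNECT " + authority + " HTTP/1.1\r\n"
             + "Host: " + authority + "\r\n"
             + "\r\n";
    }

    // Returns true if the status line carries a 2xx code, which means the
    // tunnel is open; anything else (such as 503) means it was refused.
    static boolean tunnelEstablished(String statusLine) {
        String[] parts = statusLine.trim().split("\\s+");
        if (parts.length < 2) {
            return false;
        }
        try {
            int code = Integer.parseInt(parts[1]);
            return code >= 200 && code < 300;
        } catch (NumberFormatException e) {
            return false;
        }
    }
}
```

For the server in the question, `buildConnectRequest("example.herokuapp.com", 443)` produces a request whose target matches the `path="example.herokuapp.com:443"` field in the heroku log line, and a `503` reply would make `tunnelEstablished` return false.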

However, unless you have perfect knowledge of the network and 'know' that there is only ONE proxy in your path, you'll need to 'stack' the CONNECT requests until you get a refusal.

For example:

CONNECT site.example.com:80 HTTP/1.1
CONNECT site.example.com:80 HTTP/1.1
GET /foo HTTP/1.1
Host: site.example.com

.... will either get you through 2 interfering, good-for-nothing, upstream proxies; OR get you through just the 1 that's actually there, and earn you a 503 from the origin-server ... whereupon you'll have to repeat your request with ONE FEWER CONNECT preface-methods.

So that would account for the behaviour seen so far.

However, what isn't clear is WHO is ADDING THE CONNECT PREFACE?! And why don't they like proxies?

It could be:

  1. code on the User-Agent (your Android app on your client's smartphone), using HttpURLConnection or HttpsURLConnection (the latter is used automatically by openConnection() if the URL has an https:// scheme);
  2. any Proxy between the User-Agent and the origin-server, which for some reason is distrustful of its upstream proxies or needs to tunnel HTTPS through a proxy which otherwise only supports HTTP (which is what CONNECT is for)
  3. a Proxy that's been hacked, and is looking for dumb origin-servers to exploit ... but why wait until someone actually needs stuff, to hassle the origin server?
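On the first possibility, the cast in the question does not change which class handles the request. A small sketch, assuming a standard `java.net` implementation, shows that the object returned by `openConnection()` for an `https://` URL is already an `HttpsURLConnection`. The class name `SchemeCheckSketch` and its method are hypothetical, and nothing here touches the network because the connection is never opened.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URL;
import javax.net.ssl.HttpsURLConnection;

final class SchemeCheckSketch {

    // Creates a connection object (without connecting) and reports whether
    // it is the HTTPS subclass, regardless of the HttpURLConnection cast.
    static boolean isHttpsConnection(String address) {
        try {
            URL url = new URL(address);
            HttpURLConnection connection = (HttpURLConnection) url.openConnection();
            return connection instanceof HttpsURLConnection;
        } catch (IOException e) {
            // Covers a malformed URL as well as failure to create the object.
            throw new IllegalArgumentException(e);
        }
    }
}
```

So using the `HttpURLConnection` type for an https URL is not, by itself, a likely source of the CONNECT requests.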

The full content of the CONNECT method, and the source IP for the packet would be interesting. I'm betting on #2 though, and predict that you won't see the CONNECT if you accessed the site via a http:// URL.
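One way to gather evidence for or against a proxy configured on the device itself is to ask the platform which route it would choose for the URL. The sketch below uses `java.net.ProxySelector`; the class name `ProxyCheckSketch` and its methods are hypothetical, and it can only see proxies the device knows about, not a transparent proxy inserted by the network.

```java
import java.net.Proxy;
import java.net.ProxySelector;
import java.net.URI;
import java.util.Collections;
import java.util.List;

final class ProxyCheckSketch {

    // True if any candidate route goes through a proxy instead of directly.
    static boolean usesProxy(List<Proxy> candidates) {
        for (Proxy proxy : candidates) {
            if (proxy.type() != Proxy.Type.DIRECT) {
                return true;
            }
        }
        return false;
    }

    // Asks the default selector which proxies it would try for the address.
    // Falls back to a direct route if no selector is installed.
    static List<Proxy> proxiesFor(String address) {
        ProxySelector selector = ProxySelector.getDefault();
        if (selector == null) {
            return Collections.singletonList(Proxy.NO_PROXY);
        }
        return selector.select(URI.create(address));
    }
}
```

The result of `usesProxy(proxiesFor(strURL))` could then be reported alongside the device details already being collected, for instance in a custom request header as suggested in the comments on the question.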

There is nothing which you can do about it.

David Bullock
  • Amazing answer, and all very very plausible. Until now I'd just left it to the node app to reject `CONNECT` requests, which it seems to do automatically with a `503` response. Prompted by your answer, and so that I can examine the headers of the `CONNECT` request, I've now added a method to the node.js server (requests are handled by the express module) to handle `CONNECT`s, though only to dump the headers to the console and send a `503` response. And I'm still waiting for a `CONNECT`... I wonder whether it's a case of a watched pot never boils... or Schroedinger's cat... – drmrbrewer Nov 04 '15 at 23:00
  • Looks like I can't actually examine these CONNECT requests because they are caught outside of my app by the heroku servers. So I probably won't be able to learn much more about them, but it's really really useful to understand what the likely cause is. So thank you. – drmrbrewer Nov 06 '15 at 20:53