
I have an issue with an HTTP client I am writing. I have implemented a cache in the form of a HashMap to avoid re-downloading a file that has already been downloaded. I also want to be able to update the cached copy if the file has changed on the server since it was downloaded. Here is the code:

    try
    {
        long lastMod = getLastModified(url);
        Date d = new Date(lastMod);

        outputStream.print("HEAD "+ "/" + pathName + " HTTP/1.1\r\n");
        outputStream.print("If-Modified-Since: " + ft.format(d)+ "\r\n");
        outputStream.print("Host: " + hostString+"\r\n");
        outputStream.print("\r\n");
        outputStream.flush();

        String t;
        while ((t = inputStream.readLine()) != null)
            dataIn.add(t);
    }
    catch(NullPointerException e)
    {
        dataIn.add("Handle Exception: 200");
    }
    catch(RuntimeException e2)
    {
        dataIn.add("Handle Exception: 200");
    }

    if(dataIn.get(0).contains("304"))
    {
        //Not Modified
        System.out.println(dataIn.get(0));
        System.out.println(hostString + "/" + pathName + " is already in local directory\nand is up to date");
    }
    else if(dataIn.get(0).contains("200"))
    {
        for(String x : dataIn)
            System.out.println(x);
        dataIn.clear();

        outputStream.print("GET "+ "/" + pathName + " HTTP/1.1\r\n");
        outputStream.print("Host: " + hostString+"\r\n");
        outputStream.print("\r\n");
        outputStream.flush();

        boolean blankDetected = false;
        int blankIndex = 8;

        boolean lastModDetected = false;
        int lastModIndex = 0;

        String t;
        int count = 0;
        /*
         * The issue is here where the dataIn ArrayList is empty after the loop.
         */
        while ((t = inputStream.readLine()) != null)
        {
            dataIn.add(t);
            if(t.equals("\r\n") && !blankDetected)
            {
                blankDetected = true;
                blankIndex = count;
            }
            if(t.contains("Last-Modified:") && !lastModDetected)
            {
                lastModDetected = true;
                lastModIndex = count;
            }
            count++;
        }

I am using a socket connection for the output stream. Using the print loop at the very beginning of the `200` branch, I have verified that the first `HEAD` request works perfectly fine.

After the second loop, however, `dataIn` still has 0 elements. Can anyone please help?

  • You didn't say to keep the connection alive, so the server closed the connection after the `HEAD` response. I would strongly suggest that you use an [HTTP client library](https://stackoverflow.com/q/1322335/5221149). And/or **learn** a lot more about how HTTP works, e.g. what if the server uses ETags instead of last-modified dates? – Andreas Oct 04 '17 at 19:15
  • Why send a `HEAD` request? Just send the `If-Modified-Since` header on the `GET` request, and the server will send `304` without a payload if the file is unchanged. That is what web browsers do. – Andreas Oct 04 '17 at 19:18
  • @Andreas: the code is sending HTTP 1.1 requests, so `Connection: keep-alive` is the default if no explicit `Connection` header is present. However, you do need to check the `Connection` header of the response to know whether a keep-alive is actually in effect or not. But I agree that an actual HTTP client library should be used instead of implementing HTTP manually. For instance, using `while ((t = inputStream.readLine()) != null)` is not even close to being the correct way to read an HTTP response. – Remy Lebeau Oct 04 '17 at 19:28
  • @RemyLebeau Dang, I forgot keep-alive is the default in 1.1. Been that long since I looked at low-level HTTP without a library. You're right to point out the `readLine() != null` loop issue: that loop only ends when the server closes the connection (e.g. on timeout), which you'd notice because of the long delay, or, more likely, when the code throws an exception. And how would anyone know an exception was thrown, given that **exceptions are ignored**? – Andreas Oct 04 '17 at 19:46
  • @Andreas: well, that, and the fact that HTTP message bodies are not line-based data, only HTTP headers are. Once you get past the headers and reach the body, there are many different ways the data could be encoded that require different ways of reading from the socket. `readLine()` is almost NEVER the right method to use. You have to analyze the headers in order to know how to read the rest of the message correctly, if at all. – Remy Lebeau Oct 04 '17 at 21:07
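
To illustrate the point in the comments about reading until `null`: a `readLine() != null` loop only returns once the server closes the socket, so on a keep-alive connection there is nothing left for a second request on that socket to read. Below is a minimal sketch, not the asker's code, of reading exactly one response: headers line by line up to the blank line, then the body by byte count. The class `SimpleHttpResponse` and its `read` method are hypothetical names. It assumes the body is framed by `Content-Length` only; chunked transfer encoding is rejected, and a response with no `Content-Length` is treated as having an empty body, which a real client could not assume.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical helper (not from the question): reads exactly one HTTP/1.1
// response from a stream so that the same connection can be reused afterwards.
final class SimpleHttpResponse {
    final String statusLine;
    final int statusCode;
    final Map<String, String> headers; // header names are lower-cased
    final byte[] body;

    private SimpleHttpResponse(String statusLine, int statusCode,
                               Map<String, String> headers, byte[] body) {
        this.statusLine = statusLine;
        this.statusCode = statusCode;
        this.headers = headers;
        this.body = body;
    }

    static SimpleHttpResponse read(InputStream in, boolean wasHeadRequest) throws IOException {
        String statusLine = readLine(in);
        if (statusLine == null) {
            throw new IOException("Connection closed before a status line arrived");
        }
        String[] parts = statusLine.split(" ");
        if (parts.length < 2) {
            throw new IOException("Malformed status line: " + statusLine);
        }
        int code = Integer.parseInt(parts[1]);

        // Headers are line-based and end at the first empty line.
        Map<String, String> headers = new LinkedHashMap<>();
        String line;
        while ((line = readLine(in)) != null && !line.isEmpty()) {
            int colon = line.indexOf(':');
            if (colon > 0) {
                headers.put(line.substring(0, colon).trim().toLowerCase(Locale.ROOT),
                            line.substring(colon + 1).trim());
            }
        }
        if (headers.containsKey("transfer-encoding")) {
            throw new IOException("Transfer-Encoding is not handled in this sketch");
        }

        // Responses to HEAD, and 1xx/204/304 responses, never carry a body.
        boolean noBody = wasHeadRequest || code == 204 || code == 304 || (code >= 100 && code < 200);
        // Assumption of this sketch: a missing Content-Length means an empty body.
        int length = noBody ? 0 : Integer.parseInt(headers.getOrDefault("content-length", "0"));

        // The body is raw bytes, so read by count instead of by line.
        byte[] body = new byte[length];
        int off = 0;
        while (off < length) {
            int n = in.read(body, off, length - off);
            if (n == -1) {
                throw new IOException("Connection closed in the middle of the body");
            }
            off += n;
        }
        return new SimpleHttpResponse(statusLine, code, headers, body);
    }

    // Reads bytes up to LF, drops CR characters, and returns null at end of stream.
    private static String readLine(InputStream in) throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        int b;
        while ((b = in.read()) != -1 && b != '\n') {
            if (b != '\r') {
                buf.write(b);
            }
        }
        if (b == -1 && buf.size() == 0) {
            return null;
        }
        return new String(buf.toByteArray(), StandardCharsets.ISO_8859_1);
    }
}
```

With something like this, the `HEAD` response and the following `GET` response could each be read with one call on the socket's raw `InputStream` (`read(in, true)` and then `read(in, false)`), without waiting for the server to close the connection. Mixing it with a buffered reader on the same socket would not be safe, because the reader may already have consumed bytes that belong to the next response.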

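The suggestion in the comments to drop the `HEAD` request and put `If-Modified-Since` directly on the `GET` could look roughly like the sketch below. `ConditionalGet`, `build` and `isNotModified` are hypothetical names, and the path is prefixed with `/` the same way the question's code does it. The date is formatted as a fixed-length GMT timestamp, which is the form HTTP expects for this header. The status check compares the status-code field itself, whereas `contains("304")` would match those digits anywhere in the line.

```java
import java.time.Instant;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;
import java.util.Locale;

// Hypothetical helper (not from the question): builds a conditional GET so a
// single request either returns 304 with no body or 200 with the new file.
final class ConditionalGet {
    // HTTP dates are always expressed in GMT with English day and month names.
    private static final DateTimeFormatter HTTP_DATE =
            DateTimeFormatter.ofPattern("EEE, dd MMM yyyy HH:mm:ss 'GMT'", Locale.US)
                             .withZone(ZoneOffset.UTC);

    static String build(String host, String path, long cachedLastModifiedMillis) {
        return "GET /" + path + " HTTP/1.1\r\n"
             + "Host: " + host + "\r\n"
             + "If-Modified-Since: "
             + HTTP_DATE.format(Instant.ofEpochMilli(cachedLastModifiedMillis)) + "\r\n"
             + "\r\n";
    }

    // True when the status line's code field is exactly 304 (Not Modified).
    static boolean isNotModified(String statusLine) {
        String[] parts = statusLine.split(" ");
        return parts.length >= 2 && parts[1].equals("304");
    }
}
```

The resulting string would be written to the socket in place of the separate `HEAD` and `GET` blocks. On `304` the cached copy is still current; on `200` the response carries the new content, so the connection never has to serve a second request.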