I have written a servlet that should act as a web proxy. However, when I load a page through it, e.g. localhost:8080/Proxy?requestURL=example.com, some of the JavaScript GET requests return only part of the original content.

When I print the content of the JavaScript files to the console, they are complete, but the response that reaches the browser is truncated.

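I am writing the response like this:

ServletOutputStream sos = resp.getOutputStream();
// No charset is passed here, so the platform default encoding is used
OutputStreamWriter writer = new OutputStreamWriter(sos);
..
String str = content_of_get_request; // placeholder for the fetched content
..
writer.write(str);
writer.flush();
writer.close();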
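For resources that do not need any rewriting, an alternative to building a String is to copy the upstream bytes straight to the output stream, which sidesteps both charset and line-ending handling. The following is only a sketch of that different technique, not the code used in this servlet; the class and method names (StreamCopyDemo, copy, copyToBytes) are made up for illustration, and in-memory streams stand in for the upstream connection and the ServletOutputStream.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

final class StreamCopyDemo {

    // Copies every byte from in to out unchanged and returns the byte count.
    static long copy(InputStream in, OutputStream out) throws IOException {
        byte[] buffer = new byte[8192];
        long total = 0;
        int n;
        while ((n = in.read(buffer)) != -1) {
            out.write(buffer, 0, n);
            total += n;
        }
        out.flush();
        return total;
    }

    // Hypothetical helper: runs copy() on in-memory streams so the sketch
    // can be tried without a servlet container.
    static byte[] copyToBytes(byte[] source) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try {
            copy(new ByteArrayInputStream(source), out);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        byte[] js = "var a = 1; // set a\nvar b = 2;\n".getBytes(StandardCharsets.UTF_8);
        byte[] copied = copyToBytes(js);
        // Line terminators survive because the bytes are never split into lines.
        System.out.print(new String(copied, StandardCharsets.UTF_8));
    }
}
```

In a servlet, the second argument to copy would be resp.getOutputStream(); content that has to be modified (such as HTML with rewritten links) still needs to be decoded into text first.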

The strange thing is that when I directly request the JavaScript file that was loaded during the page request, like this:

localhost:8080/Proxy?requestURL=anotherexaple.com/needed.js

The whole content is returned to the browser.

It would be great if someone had an idea. Regards

UPDATE:

The problem was the way I built the response String:

while ((line = rd.readLine()) != null)
{
    response.append(line);
}

I read the stream line by line and appended each line to a StringBuffer. Since readLine() strips the line terminator, the whole script ended up on a single line, and it appears that Firefox and Chrome had a problem with that. It seems that some browsers implement a maximum line length for JavaScript, although no maximum line length is mentioned in the HTTP/1.1 RFC.

Fix:

Appending a "\n" to each line fixes the issue:

response.append(line+"\n");
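The effect can be reproduced outside the servlet. The sketch below runs both loop variants on a small script held in a StringReader; the class and method names (LineJoinDemo, joinWithoutNewlines, joinWithNewlines) are made up for this illustration. It also hints at another possible cause besides a line-length limit: once the lines are glued together, a // comment swallows the code that used to be on the following line. That explanation is an assumption and was not verified in this thread.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;

final class LineJoinDemo {

    // Mirrors the original loop: readLine() returns each line WITHOUT its
    // line terminator, so the lines are concatenated into one long line.
    static String joinWithoutNewlines(String source) {
        StringBuilder response = new StringBuilder();
        try (BufferedReader rd = new BufferedReader(new StringReader(source))) {
            String line;
            while ((line = rd.readLine()) != null) {
                response.append(line);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return response.toString();
    }

    // Mirrors the fix: a "\n" is appended after every line that was read.
    static String joinWithNewlines(String source) {
        StringBuilder response = new StringBuilder();
        try (BufferedReader rd = new BufferedReader(new StringReader(source))) {
            String line;
            while ((line = rd.readLine()) != null) {
                response.append(line).append('\n');
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return response.toString();
    }

    public static void main(String[] args) {
        String js = "var a = 1; // set a\nvar b = 2;\n";
        // Prints one line: "var b = 2;" is now part of the // comment.
        System.out.println(joinWithoutNewlines(js));
        // Prints the script with its two original lines intact.
        System.out.print(joinWithNewlines(js));
    }
}
```

Note that the fixed loop normalizes every line ending to "\n" and always ends the result with a newline, which is harmless for JavaScript but does change the bytes compared with the original file.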
Quick n Dirty

1 Answer

This happens because you are only reading the HTML response; you are not actually fetching the other resources referenced in the HTML, such as images and JS files.

You can observe this by monitoring how the browser renders the HTML with Firebug for Firefox.

1) The browser receives the HTML response.

2) It then parses the response for referenced resources and makes a separate GET call for each of them.

So for the proxy to work, you need to mimic this browser behavior.

My advice is to use an already available open-source library such as HtmlUnit.
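Step 2 can be pictured with a small sketch that lists the src attributes found in an HTML string, i.e. the URLs a browser would request next. It uses a regular expression only to keep the illustration short; real HTML needs a proper parser, and the class and method names (ResourceLinkDemo, findSrcUrls) are made up for this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class ResourceLinkDemo {

    // Matches src="..." or src='...' (case-insensitive). This is a
    // simplification and does not handle unquoted attribute values.
    private static final Pattern SRC_ATTR = Pattern.compile(
            "src\\s*=\\s*[\"']([^\"']+)[\"']", Pattern.CASE_INSENSITIVE);

    // Returns the values of all src attributes in document order.
    static List<String> findSrcUrls(String html) {
        List<String> urls = new ArrayList<>();
        Matcher m = SRC_ATTR.matcher(html);
        while (m.find()) {
            urls.add(m.group(1));
        }
        return urls;
    }

    public static void main(String[] args) {
        String html = "<script src=\"needed.js\"></script><img src='logo.png'>";
        // Each URL found here corresponds to one separate GET request.
        for (String url : findSrcUrls(html)) {
            System.out.println(url);
        }
    }
}
```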

Sudhakar
  • Why should my proxy act like a browser? I only forward a GET request, replace all links and src attributes inside the content, and return it to the browser. The browser performs all further requests by itself, and for most web pages that works fine. – Quick n Dirty Mar 08 '13 at 09:32
  • Well, that depends on how you implement the proxy. Now that you have clarified: have you checked that the URLs are indeed requested from the browser end? You can check that with a plugin like Firebug if you use Firefox. – Sudhakar Mar 08 '13 at 09:40
  • Status 200 for the JavaScript GET calls. – Quick n Dirty Mar 08 '13 at 09:48
  • Could it be because of browser caching? What happens when you save the page as an HTML file from the browser? Does it download the JS files locally? – Sudhakar Mar 08 '13 at 09:55
  • Also consider turning off browser caching; please check this SO question: http://stackoverflow.com/questions/1341089/using-meta-tags-to-turn-off-caching-in-all-browsers – Sudhakar Mar 08 '13 at 10:05