I am using Java on AWS Lambda to fetch the HTML source of a web page from its URL. I have the following code:
URL yahoo = new URL(url);
URLConnection yc = yahoo.openConnection();
yc.addRequestProperty("User-Agent", "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.0)");
BufferedReader in = new BufferedReader(new InputStreamReader(yc.getInputStream(), "UTF-8"));
String inputLine;
StringBuilder a = new StringBuilder();
while ((inputLine = in.readLine()) != null) {
    a.append(inputLine);
}
in.close();
System.out.println(a.toString());
For some sites the code runs fine, and on my local machine it works every time. On AWS Lambda, however, it hangs for certain sites on the following line:
BufferedReader in = new BufferedReader(new InputStreamReader(yc.getInputStream(), "UTF-8"));
Then I get: Task timed out after 20.00 seconds.
In the Lambda log, I get the following error:
Payload: java.nio.HeapByteBuffer[pos=0 lim=115 cap=115]
My guess is that it has something to do with encoding, but I am not sure. Why are some sites processed without any problem, while for others the code gets stuck on that line?
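To narrow this down, one thing I could try is setting explicit connect and read timeouts, so the call fails fast with a SocketTimeoutException instead of using up the whole Lambda timeout. The following is only a minimal sketch and not a confirmed fix: the class name PageFetcher, the helper readAll, and the timeout values (5 and 10 seconds) are my own assumptions for illustration.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.net.URL;
import java.net.URLConnection;
import java.nio.charset.StandardCharsets;

public class PageFetcher {

    // Reads every line from the reader and concatenates them without
    // line breaks, the same way the loop in my original code does.
    static String readAll(Reader reader) throws IOException {
        StringBuilder sb = new StringBuilder();
        try (BufferedReader in = new BufferedReader(reader)) {
            String line;
            while ((line = in.readLine()) != null) {
                sb.append(line);
            }
        }
        return sb.toString();
    }

    // connectMs and readMs are hypothetical parameters, not part of the original code.
    static String fetch(String url, int connectMs, int readMs) throws IOException {
        URLConnection yc = new URL(url).openConnection();
        // A java.net.SocketTimeoutException is thrown if connecting takes longer than connectMs...
        yc.setConnectTimeout(connectMs);
        // ...or if a read from the response stream blocks longer than readMs.
        yc.setReadTimeout(readMs);
        yc.addRequestProperty("User-Agent", "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.0)");
        return readAll(new InputStreamReader(yc.getInputStream(), StandardCharsets.UTF_8));
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: PageFetcher <url>");
            return;
        }
        // Assumed values, chosen to stay well below the 20 second Lambda timeout.
        System.out.println(fetch(args[0], 5000, 10000));
    }
}
```

If the exception is raised while connecting, the problem would seem to be on the network side rather than in the encoding. If it is raised while reading, the connection was established but the response stalled.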
Thanks a lot for any answers.