I am trying to get the content of some URLs with my Java code. The code returns the content for some URLs, for example this one: "http://www.nytimes.com/video/world/europe/100000004503705/memorials-for-victims-of-istanbul-attack.html", but it returns nothing for some others, for example this one: "http://www.nytimes.com/2016/07/24/travel/mozart-vienna.html?_r=0". When I open the second URL manually, I see the content, and even when I view the page source I don't notice any meaningful difference between the structure of the two pages. But I still get nothing for this URL.
Is this related to a permission problem, to the structure of the webpage, or to my Java code?
Here is my code:
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.URL;

public class TestJsoup {

    public static void main(String[] args) {
        System.out.println(getUrlParagraphs("http://www.nytimes.com/2016/07/24/travel/mozart-vienna.html?_r=0"));
    }

    public static String getUrlParagraphs(String url) {
        try {
            URL urlContent = new URL(url);
            BufferedReader in = new BufferedReader(new InputStreamReader(urlContent.openStream()));
            String line;
            StringBuffer html = new StringBuffer();
            while ((line = in.readLine()) != null) {
                html.append(line);
                System.out.println("Test");
            }
            in.close();
            System.out.println(html.toString());
            return html.toString();
        } catch (IOException e) {
            e.printStackTrace();
        }
        return null;
    }
}
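In case it helps narrow this down, here is a minimal diagnostic sketch I put together (the class name UrlDiagnostics and the User-Agent string are my own choices, not part of the code above). It makes the same request but with redirect-following disabled, so it prints the raw HTTP status code and any Location header; if the failing URL answers with a 3xx redirect or a 4xx error instead of a 200, that would explain why openStream() yields no usable content:

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URL;

public class UrlDiagnostics {

    // A response code in the 3xx range means the server redirected us elsewhere.
    static boolean isRedirect(int code) {
        return code >= 300 && code < 400;
    }

    // Prints the raw status code and, for redirects, the Location header,
    // so we can see what the server actually answers before any body is read.
    static void diagnose(String url) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) new URL(url).openConnection();
        conn.setInstanceFollowRedirects(false);               // show the raw response, don't follow
        conn.setRequestProperty("User-Agent", "Mozilla/5.0"); // some sites reject the default Java agent
        int code = conn.getResponseCode();
        System.out.println(url + " -> HTTP " + code);
        if (isRedirect(code)) {
            System.out.println("Location: " + conn.getHeaderField("Location"));
        }
        conn.disconnect();
    }

    public static void main(String[] args) throws IOException {
        diagnose("http://www.nytimes.com/2016/07/24/travel/mozart-vienna.html?_r=0");
    }
}
```

Running this against both the working and the failing URL and comparing the status codes should show whether the difference is on the server side rather than in the reading loop.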