I am trying to read an HTML page from a URL. The page contains something like this:
<html>
<head>
<title>
Title
</title>
</head>
<body>
Name1 Age1 Hometown1<br>
Name2 Age2 Hometown2<br>
Name3 Age3 Hometown3<br>
</body>
</html>
I read it with a method readData(String[] urls), where urls is an array of one or more URL strings. I'm only interested in what's in the HTML body of each URL, so I loop with while (readLine() != null) and keep the lines for which contains("<br>") is true. However, my code appears to read only the first line of the body block (the line right after <body>, as I want) and does not go on to the following lines up to </body>. How would I make my code read past the first line?
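import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.URL;
import java.net.URLConnection;

public void readData(String[] urls) {
    for (int i = 0; i < urls.length; i++) {
        String str = ""; // accumulates the matching lines for urls[i]
        try {
            URL url = new URL(urls[i]);
            URLConnection conn = url.openConnection();
            BufferedReader in = new BufferedReader(new InputStreamReader(conn.getInputStream()));
            String s;
            // Keep only the lines that contain a <br> tag
            while ((s = in.readLine()) != null) {
                if (s.contains("<br>")) {
                    str += s;
                }
            }
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}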
EDIT1: The issue appears to be that the entire input arrives as one line rather than as multiple lines, as I expected. How would I split that one line into multiple lines so that I can read each?
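A minimal sketch of one way to split such a single-line page, assuming the body is delimited by a plain lowercase <body> ... </body> pair with no attributes, as in the sample above. The class name BodyLines and the method name bodyRecords are hypothetical, chosen only for illustration.

```java
import java.util.ArrayList;
import java.util.List;

class BodyLines {
    // Returns the text between <body> and </body>, split into records on <br>.
    // Assumes a plain lowercase <body> tag without attributes (hypothetical simplification).
    static List<String> bodyRecords(String html) {
        List<String> records = new ArrayList<>();
        int start = html.indexOf("<body>");
        int end = html.indexOf("</body>");
        if (start < 0 || end < start) {
            return records; // no recognisable body block
        }
        String body = html.substring(start + "<body>".length(), end);
        // (?i) ignores case; \s*/? also accepts <br/> and <br />.
        for (String part : body.split("(?i)<br\\s*/?>")) {
            String trimmed = part.trim();
            if (!trimmed.isEmpty()) {
                records.add(trimmed);
            }
        }
        return records;
    }
}
```

For the sample page collapsed onto one line, bodyRecords returns three entries, starting with "Name1 Age1 Hometown1". For anything beyond simple pages like this one, a real HTML parser would be more robust than string splitting.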
EDIT2:
Thanks, everyone. I've figured that out. I still read the input as a single long String, but I now split it into a String array using .split() and process each element of that array. However, there is a new problem: of the elements in String[] urls, only the first one is read. I cannot read anything beyond urls[0], when I actually want to read every element of urls. Any ideas?
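A hedged sketch of what reading every element of urls could look like. The snippet above does not show why only the first element is processed, so this is not a diagnosis. It only illustrates a loop in which the try/catch sits inside the iteration and each reader is closed, so that a failure on one URL cannot stop the others. The names AllUrlsReader and readAll are hypothetical, and UTF-8 is an assumed encoding.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

class AllUrlsReader {
    // Reads every URL in the array and returns one page text per URL, in order.
    // A URL that cannot be read contributes an empty string instead of aborting the loop.
    static List<String> readAll(String[] urls) {
        List<String> pages = new ArrayList<>();
        for (String address : urls) {
            StringBuilder page = new StringBuilder();
            // try-with-resources closes the reader even if reading fails part-way.
            try (BufferedReader in = new BufferedReader(new InputStreamReader(
                    new URL(address).openStream(), StandardCharsets.UTF_8))) {
                String line;
                while ((line = in.readLine()) != null) {
                    page.append(line);
                }
            } catch (IOException e) {
                // Report the problem and carry on with the next URL.
                System.err.println("Could not read " + address + ": " + e.getMessage());
            }
            pages.add(page.toString());
        }
        return pages;
    }
}
```

The returned list always has one entry per element of urls, so comparing its size against urls.length is a quick check that no element was skipped. Each entry can then be split with .split() as described above.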