I am reading text from a URL using Jsoup. The following question has some tips for preserving new lines when converting the body to text: "How do I preserve line breaks when using jsoup to convert html to plain text?"
I use the following lines to convert the tags:
    String prettyPrintedBodyFragment = Jsoup.clean(body, "",
            Whitelist.none().addTags("br", "p", "h1"),
            new OutputSettings().prettyPrint(true));
    System.out.println(prettyPrintedBodyFragment);
I still get the body content on a single line. Any clues, please?
EDIT: Here is the complete source code. The output still appears on only one line.
    public static void main(String[] args) throws Exception {
        Connection conn = Jsoup.connect("http://finance.yahoo.com/");
        Document doc = conn.get();
        String body = doc.body().text();
        String prettyPrintedBodyFragment = Jsoup.clean(body, "",
                Whitelist.none().addTags("br", "p", "h1"),
                new OutputSettings().prettyPrint(true));
        System.out.println(prettyPrintedBodyFragment);
    }
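One thing that may be worth checking in the code above: `text()` returns the body with all markup already removed, so by the time `Jsoup.clean` runs there may be no `br`, `p`, or `h1` tags left for the whitelist to keep. The following is a minimal sketch comparing `text()` with `html()` as the input to `Jsoup.clean`. It uses a hard-coded HTML string instead of the live page, and the class name, method names, and sample markup are hypothetical. It also assumes a jsoup version that still ships `Whitelist` (newer releases replace it with `Safelist`).

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.safety.Whitelist;

public class LineBreakCheck {

    // Same clean step as in the question: keep only br, p and h1, pretty-printed.
    static String cleanKeepingTags(String bodyFragment) {
        return Jsoup.clean(bodyFragment, "",
                Whitelist.none().addTags("br", "p", "h1"),
                new Document.OutputSettings().prettyPrint(true));
    }

    // Mirrors the question: text() drops every tag before clean() sees the input.
    static String cleanedFromText(String html) {
        Document doc = Jsoup.parse(html);
        return cleanKeepingTags(doc.body().text());
    }

    // Alternative: html() keeps the markup, so clean() can retain the listed tags.
    static String cleanedFromHtml(String html) {
        Document doc = Jsoup.parse(html);
        return cleanKeepingTags(doc.body().html());
    }

    public static void main(String[] args) {
        // Hypothetical sample markup standing in for the downloaded page.
        String html = "<h1>Headline</h1><p>First line<br>Second line</p>";
        System.out.println("--- from text() ---");
        System.out.println(cleanedFromText(html));
        System.out.println("--- from html() ---");
        System.out.println(cleanedFromHtml(html));
    }
}
```

If the second variant keeps the tags, the remaining step would be converting those tags to newlines, as the linked question describes.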