I'm stripping the HTML tags from a Java string with the Jsoup parse method below, and that part works fine. The only problem is that calling .text() also removes the line breaks ("\n"). I want to keep those and still have the method return a String. Any ideas?

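 import org.jsoup.Jsoup;

 // Parses the HTML and returns its text; text() normalizes whitespace, so "\n" becomes a space.
 private static String stripHTML(String html) {
     return Jsoup.parse(html).text();
 }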
c12

1 Answer

In HTML, newlines are no different from spaces (or runs of consecutive spaces or tabs): they are all collapsible whitespace, so the ones you pull out won't have any semantic meaning. <p> and <br />, on the other hand, do mark real breaks.
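
To make that distinction concrete, here is a rough, dependency-free sketch that treats source newlines as ordinary whitespace and emits "\n" only where a <br> or closing </p> appears. The class name HtmlLineBreaks and the method stripHtmlKeepBreaks are hypothetical names chosen for illustration, and the regex approach is an assumption that only holds for simple, well-formed fragments. It is not Jsoup API and does not decode entities such as &amp;.

```java
import java.util.regex.Pattern;

// Illustrative sketch only: hypothetical helper, not part of Jsoup.
class HtmlLineBreaks {
    private static final Pattern WHITESPACE = Pattern.compile("\\s+");
    private static final Pattern BR = Pattern.compile("(?i)<br\\s*/?>");
    private static final Pattern P_CLOSE = Pattern.compile("(?i)</p\\s*>");
    private static final Pattern ANY_TAG = Pattern.compile("<[^>]+>");
    private static final Pattern SPACES_AROUND_NEWLINE = Pattern.compile(" *\\n *");

    public static String stripHtmlKeepBreaks(String html) {
        // 1. Collapse every run of source whitespace (including "\n") to one space,
        //    since HTML gives such whitespace no meaning.
        String s = WHITESPACE.matcher(html).replaceAll(" ");
        // 2. Turn the tags that do imply a break into real newlines.
        s = BR.matcher(s).replaceAll("\n");
        s = P_CLOSE.matcher(s).replaceAll("\n\n");
        // 3. Drop all remaining tags.
        s = ANY_TAG.matcher(s).replaceAll("");
        // 4. Remove stray spaces next to newlines, then trim both ends.
        s = SPACES_AROUND_NEWLINE.matcher(s).replaceAll("\n");
        return s.trim();
    }
}
```

With this sketch, the "\n" between "Hello" and "world" in "<p>Hello\nworld</p><p>Second<br/>line</p>" becomes a space, while the paragraph boundary and the <br/> become line breaks. For real-world HTML a parser-based approach is likely more robust than regular expressions.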

David Ehrmann

While this is true, see http://stackoverflow.com/a/12580364/14731 or http://stackoverflow.com/q/5640334/14731 if you want to preserve newlines. – Gili Aug 28 '15 at 15:20