
Hey guys, I have an app where I'm parsing an HTML file using jsoup to display selected text in a ListView on Android. However, I can't seem to find a way to keep the line breaks.
Here is what I have attempted:

Elements br = doc.select("br");

for (Element src : br) {
    src.append("\n");
}

To give you an example, this string

<div>
This is a string <br> another string
</div>

is parsed and displays:

This is a string another string

I have also tried using

src.append("\\n");

which displays:

This is a string \n another string

I am using an ArrayList of strings to store these values.

I've been trying to find a solution to this problem with no luck. I have attempted the solutions from the following threads:

How do I preserve line breaks when using jsoup to convert html to plain text?

Removing HTML entities while preserving line breaks with JSoup

Jonh Smith

1 Answer


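When you tried src.append("\\n"); it looks like you had just one backslash too many: in a Java string literal, "\\n" is an escaped backslash followed by the letter n, whereas "\n" is a single newline character.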
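To make the difference concrete, here is a small sketch using only plain Java strings (no jsoup). The class and method names are made up for illustration; the sample text is the one from the question.

```java
final class NewlineEscapeDemo {
    // "\n" is the escape sequence for a single line-feed character.
    static String realNewline() {
        return "This is a string" + "\n" + "another string";
    }

    // "\\n" is an escaped backslash followed by the letter n: two visible
    // characters, which is why the ListView showed a literal \n.
    static String literalBackslashN() {
        return "This is a string" + "\\n" + "another string";
    }

    public static void main(String[] args) {
        System.out.println(realNewline());       // spans two lines
        System.out.println(literalBackslashN()); // one line containing \n
    }
}
```

So the escaping in the first attempt was already correct; the newline was most likely lost later, when the text was extracted from the parsed document.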

Regardless, the first link you posted should point you in the right direction. Something like this should work:

import org.jsoup.Jsoup;
import org.jsoup.safety.Whitelist; // newer jsoup releases rename this class to Safelist
import org.jsoup.select.Elements;

// parse the doc and select the element containing the text
Elements es = Jsoup
    .parse("<html><body><div>a \ntext<br/>is <b>a</b> text</div></body></html>")
    .select("div");

// find <br> tags and mark them (using an arbitrary placeholder '~n~')
es.select("br").append("~n~");
// strip all tags, keeping only the text (and the placeholder)
String clean = Jsoup.clean(es.html(), Whitelist.none());
// replace the placeholder with a real newline
String disp = clean.replaceAll("~n~", "\n");

Now disp will print:

a text 
is a text
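The same placeholder idea can be sketched without jsoup, which may help to see the three steps in isolation. This is only a rough illustration under stated assumptions: it uses regular expressions instead of a real HTML parser, so it is suitable only for small, simple fragments, and it does not decode entities such as &amp;. The class name, method name, and placeholder are hypothetical.

```java
import java.util.regex.Pattern;

final class BrToNewlineSketch {
    // Arbitrary placeholder; pick any token that cannot occur in the real text.
    private static final String PLACEHOLDER = "~n~";
    private static final Pattern BR = Pattern.compile("(?i)<br\\s*/?>");
    private static final Pattern TAG = Pattern.compile("<[^>]+>");

    static String toPlainText(String html) {
        // 1. mark every <br> with the placeholder
        String marked = BR.matcher(html).replaceAll(PLACEHOLDER);
        // 2. strip the remaining tags
        String stripped = TAG.matcher(marked).replaceAll("");
        // 3. collapse whitespace runs (including source newlines) to one space
        String collapsed = stripped.replaceAll("\\s+", " ").trim();
        // 4. turn each placeholder, plus any space around it, into a real newline
        return collapsed.replaceAll(" ?" + Pattern.quote(PLACEHOLDER) + " ?", "\n");
    }

    public static void main(String[] args) {
        System.out.println(toPlainText("<div>This is a string <br> another string</div>"));
    }
}
```

Each resulting string can then be added to the ArrayList as before; the embedded "\n" characters survive because no HTML processing happens after step 4.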
nyname00