I need help keeping line breaks when parsing HTML with Jsoup.

I have already researched this and tried several suggestions from this website, but none of them worked for me.

I am very new to coding, so simple explanations are especially welcome.

Thanks in advance!

public class MainActivity extends AppCompatActivity {
    TextView content;

    @Override
    protected void onCreate(Bundle savedInstanceState) {
        super.onCreate(savedInstanceState);
        setContentView(R.layout.activity_main);

        content = (TextView) findViewById(R.id.content0);

        Button but = (Button) findViewById(R.id.but1);
        but.setOnClickListener(new View.OnClickListener() {
            @Override
            public void onClick(View v) {
                System.out.println("parse button pressed");
                new doit().execute();
            }
        });
    }

    public class doit extends AsyncTask<Void, Void, Void> {
        String words;

        @Override
        protected Void doInBackground(Void... params) {
            System.out.println("parsing");
            try {
                Document doc = Jsoup.connect("http://daltonschool.kr/homeeng/04schoollife/040203schoollife.html").get();
                words = doc.select("table.cafeteria tbody tr td").eq(3).text();
            } catch (Exception e) {
                e.printStackTrace();
            }
            return null;
        }

        @Override
        protected void onPostExecute(Void aVoid) {
            super.onPostExecute(aVoid);
            content.setText(words);
        }
    }
}

thok0831
  • @ashatte I tried this, but it gives me an error on return Jsoup.clean(s, "", Whitelist.none(), new Document.OutputSettings().prettyPrint(false)); saying that java.lang.Void is required but java.lang.String was found. I have no idea how to apply this. – thok0831 Feb 13 '17 at 02:46

1 Answer

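I managed to preserve the <br> line breaks this way. I don't know whether it is the best approach; it is a bit of a hack. The idea is to read the cell with html() instead of text(), swap each <br> for a placeholder, let Jsoup strip the remaining markup, and then turn the placeholders into newlines.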

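import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;

public class Test {
    public static void main(String[] args) {
        try {
            Document doc = Jsoup.connect("http://daltonschool.kr/homeeng/04schoollife/040203schoollife.html").get();
            // html() keeps the markup, so the <br> tags are still present
            String words = doc.select("table.cafeteria tbody tr td").eq(3).html();
            // Swap each <br> for a placeholder that survives text()
            String temp = words.replace("<br>", "$$$");
            Document doc1 = Jsoup.parse(temp);
            // Strip the remaining markup, then turn the placeholders into newlines
            String text = doc1.body().text().replace("$$$", "\n");
            System.out.println(text);
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}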

Output:

-Korean Food-
 Spicy Stir-fried Pork&Kimchi w/Rice 
 Kelp&Radish Soup 
 Kkakdugi 
 *Salad Bar:Spaghetti S

 -Western Food-
 Hurigake Rice 
 Sweet Chili Chicken
 *Salad Bar:spaghetti S
 (Veg: Pollack Pancake)
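
If it helps to see the placeholder trick on its own, here is a minimal sketch that runs without a network call or Jsoup on the classpath. The class name BrPreserver, the marker string and the naiveText helper are all made up for illustration; naiveText is only a rough stand-in for what Jsoup's text() does and is not how Jsoup works internally. Unlike the code above, it also matches <br/> and <br /> and trims the stray spaces around each break.

```java
import java.util.function.UnaryOperator;
import java.util.regex.Pattern;

class BrPreserver {
    // Hypothetical marker: any string that cannot occur in the page text will do.
    private static final String MARKER = "@@BR@@";
    // Matches <br>, <br/> and <br /> in any letter case.
    private static final Pattern BR = Pattern.compile("(?i)<br\\s*/?>");

    // Replaces every <br> with the marker, flattens the HTML to text with the
    // supplied function, then turns each marker (plus surrounding spaces) into "\n".
    static String textWithLineBreaks(String html, UnaryOperator<String> htmlToText) {
        String marked = BR.matcher(html).replaceAll(MARKER);
        String flat = htmlToText.apply(marked);
        return flat.replaceAll("\\s*" + Pattern.quote(MARKER) + "\\s*", "\n").trim();
    }

    // Assumption: a crude stand-in for Jsoup's text(). It drops tags and collapses
    // whitespace, but does not decode entities such as &amp;.
    static String naiveText(String html) {
        return html.replaceAll("<[^>]+>", " ").replaceAll("\\s+", " ").trim();
    }

    public static void main(String[] args) {
        String cell = "<td>-Korean Food-<br> Kelp Soup <br/>Kkakdugi</td>";
        System.out.println(textWithLineBreaks(cell, BrPreserver::naiveText));
    }
}
```

With Jsoup available, you would presumably pass h -> Jsoup.parse(h).text() as the second argument instead of BrPreserver::naiveText, so that Jsoup does the real tag stripping and entity decoding.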
soorapadman