
I want to parse the contents of this HTML table:


Here is the full website with source code:

http://www.kantschule-falkensee.de/uploads/dmiadgspahw/klassen/A_Klasse_11.htm

I want to parse the data for each cell, for example all 5 cells under "Montag" (Monday). I tried several ways of parsing this website using jsoup, but I haven't had any success. My main goal is to show the contents in a ListView in an Android app. For now I am just trying to print the contents to a Java console. Both languages are accepted :). Any help is appreciated.

Christian Steuer

1 Answer


Here are the steps you would need to follow:

1) You could use any Java library for HTML scraping, for example jsoup (used in the example further down).

2) Use XPath Helper to try out queries against the page:

Eg 1: Enter "//tr[1]//td[1]" as the query and it will return all table elements at position (1,1).

Eg 2: "/html/body[@class='tt']/center/table[1]/tbody/tr[4]/td[3]/table/tbody/tr/td" will give you all 15 values under Montag.

Eg 3: "/html/body[@class='tt']/center/table[1]/tbody/tr/td/table/tbody/tr/td" will give you all 380 entries of the table.
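If you want to run queries like these from Java rather than in the browser, here is a minimal sketch using the JDK's built-in javax.xml.xpath API. It makes several assumptions: the SAMPLE markup is a small, hypothetical, well-formed stand-in and not the real timetable page, and the class name XPathTableDemo and the helper select are made up for illustration. The real page may not be well-formed XML, in which case it would first need an HTML-tolerant parser.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class XPathTableDemo {
    // Hypothetical, well-formed stand-in for the timetable markup (NOT the real page).
    static final String SAMPLE =
        "<html><body><table>"
      + "<tr><td>Zeit</td><td>Montag</td><td>Dienstag</td></tr>"
      + "<tr><td>1</td><td>Mathe</td><td>Deutsch</td></tr>"
      + "<tr><td>2</td><td>Physik</td><td>Englisch</td></tr>"
      + "</table></body></html>";

    // Illustrative helper: evaluates an XPath query and returns the text of each matched node.
    static List<String> select(String xml, String query) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            NodeList nodes = (NodeList) XPathFactory.newInstance()
                    .newXPath()
                    .evaluate(query, doc, XPathConstants.NODESET);
            List<String> texts = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                texts.add(nodes.item(i).getTextContent().trim());
            }
            return texts;
        } catch (Exception e) {
            // Wrap checked parser/XPath exceptions to keep the sketch short.
            throw new IllegalStateException("Could not evaluate: " + query, e);
        }
    }

    public static void main(String[] args) {
        // First cell of the first row.
        System.out.println(select(SAMPLE, "//tr[1]//td[1]"));             // prints [Zeit]
        // Second column ("Montag") of every row after the header row.
        System.out.println(select(SAMPLE, "//tr[position() > 1]/td[2]")); // prints [Mathe, Physik]
    }
}
```

Note that the sample has no tbody element, so the queries here omit it. Paths copied from a browser tool often include tbody because the browser inserts it into the DOM.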

OR

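Example using jsoup:

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;

import java.io.IOException;

public class Main {
    public static void main(String[] args) throws IOException {
        // Fetch and parse the timetable page.
        Document doc = Jsoup.connect("http://www.kantschule-falkensee.de/uploads/dmiadgspahw/klassen/A_Klasse_11.htm").get();

        // Select every table row in the document, including rows of nested tables.
        Elements rows = doc.select("tr");
        for (Element row : rows) {
            // select("td") matches all descendant cells of this row.
            Elements columns = row.select("td");
            for (Element column : columns) {
                // Separate cells with a tab so adjacent values do not run together.
                System.out.print(column.text() + "\t");
            }
            System.out.println();
        }
    }
}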
Ben McCann
  • Perfect answer. I've used jsoup to successfully parse similar tables in badly written HTML. The original poster needs to take more time and patience to study jsoup to get the hang of it. – Basil Bourque Jul 11 '15 at 19:49
  • I am already using Selenium to select the table tag. How can I pass that as a jsoup document? – Murtaza Haji Aug 19 '20 at 14:13