0

I have parsed an HTML document using Jsoup and obtained Elements in items1, items2 and items3. I want to create an ArrayList<HashMap<String, String>> to use later to populate a ListView.

The problem is that I cannot fill the HashMap and/or ArrayList properly.

This is what I tried:

Elements itemLines = document.select("selector1");
Elements itemsTime = document.select("selector2");
...

for (Element itemLine : itemLines) {

         hashMap.put("itemLine", itemLine.text());
}
...

arrayList.add(hashMap); 

Html:

<table cellspacing="1" cellpadding="0" rules="all" border="0" id="dtgHorasPorLinea" style="border-width:0px;width:100%;">
    <tr>
        <td class="Minutes_css" align="center" valign="middle" style="height:45px;width:90px;">26 <span style='font-size:16px'>min.</span>
        </td>
        <td class="Linea_Interurbana_Nocturna_css" valign="middle" style="height:45px;width:45px;">
            <img id="dtgHorasPorLinea_ctl02_imgLineas" src="img/08_mobile_logo_interurb-n.jpg" style="border-width:0px;" />
        </td>
        <td class="Linea_Interurbana_Nocturna_css" valign="middle" style="height:45px;width:95px;">
            <span id="dtgHorasPorLinea_ctl02_lblNumLine">N802</span>
        </td>
        <td class="Descrip_td_sp" align="left" valign="middle" style="height:45px;width:944px;">TORRES QUEVEDO-GUARDIA CIVIL TRÁFICO</td>
    </tr>
    <tr>
        <td class="Minutes_css" align="center" valign="middle" style="height:45px;width:90px;">02:15</td>
        <td class="Linea_Interurbana_Nocturna_css" valign="middle" style="height:45px;width:45px;">
            <img id="dtgHorasPorLinea_ctl03_imgLineas" src="img/08_mobile_logo_interurb-n.jpg" style="border-width:0px;" />
        </td>
        <td class="Linea_Interurbana_Nocturna_css" valign="middle" style="height:45px;width:95px;">
            <span id="dtgHorasPorLinea_ctl03_lblNumLine">N804</span>
        </td>
        <td class="Descrip_td_sp" align="left" valign="middle" style="height:45px;width:944px;">INTERCAMBIADOR DE ALUCHE</td>
    </tr>
    <tr>
        <td class="Minutes_css" align="center" valign="middle" style="height:45px;width:90px;">02:37</td>
        <td class="Linea_Interurbana_Nocturna_css" valign="middle" style="height:45px;width:45px;">
            <img id="dtgHorasPorLinea_ctl04_imgLineas" src="img/08_mobile_logo_interurb-n.jpg" style="border-width:0px;" />
        </td>
        <td class="Linea_Interurbana_Nocturna_css" valign="middle" style="height:45px;width:95px;">
            <span id="dtgHorasPorLinea_ctl04_lblNumLine">N802</span>
        </td>
        <td class="Descrip_td_sp" align="left" valign="middle" style="height:45px;width:944px;">TORRES QUEVEDO-GUARDIA CIVIL TRÁFICO</td>
    </tr>
    <tr>
        <td class="Minutes_css" align="center" valign="middle" style="height:45px;width:90px;">04:15</td>
        <td class="Linea_Interurbana_Nocturna_css" valign="middle" style="height:45px;width:45px;">
            <img id="dtgHorasPorLinea_ctl05_imgLineas" src="img/08_mobile_logo_interurb-n.jpg" style="border-width:0px;" />
        </td>
        <td class="Linea_Interurbana_Nocturna_css" valign="middle" style="height:45px;width:95px;">
            <span id="dtgHorasPorLinea_ctl05_lblNumLine">N804</span>
        </td>
        <td class="Descrip_td_sp" align="left" valign="middle" style="height:45px;width:944px;">INTERCAMBIADOR DE ALUCHE</td>
    </tr>
</table>

These are the selectors for Jsoup:

itemDestination: td.Descrip_td_sp
itemLine: td:eq(2)
itemTime: td.Minutes_css

I have tried to explain it in this image: https://i.stack.imgur.com/Z6sTN.jpg

What can I do? Thanks in advance!

Raf
makgyverzx

2 Answers

1
import org.jsoup.*;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;
import java.util.List;
import java.util.ArrayList;

/*
 * A simple class with three fields. Each table row can be represented as one
 * ItemElements object, so 500 rows would give 500 objects.
 * The getters and setters give access to the fields.
 */
class ItemElements {
  String itemLine;
  String itemTime;
  String itemDescription;

  public String getItemLine() {
    return this.itemLine;
  }
  public String getItemTime() {
    return this.itemTime;
  }
  public String getItemDescription() {
    return this.itemDescription;
  }

  public void setItemLine(String itemLine) {
    this.itemLine = itemLine;
  }

  public void setItemTime (String itemTime) {
    this.itemTime = itemTime;
  }

  public void setItemDescription(String itemDescription) {
    this.itemDescription = itemDescription;
  }
}
/*
 * The class is named TestJSOUP. If you save this code to a file, make sure the
 * file name matches the class name (TestJSOUP.java). Also make sure that you
 * have the jsoup library on your classpath.
 */
public class TestJSOUP {
        public static void main(String[] args) {
            // The HTML is split into one string per table row; a Java string literal cannot span lines.
            String html = " <table cellspacing='1' cellpadding='0' rules='all' border='0' id='dtgHorasPorLinea' style='border-width:0px;width:100%;'>"
                + "<tr><td class='Minutes_css' align='center' valign='middle' style='height:45px;width:90px;'>26 <span style='font-size:16px'>min.</span></td><td class='Linea_Interurbana_Nocturna_css' valign='middle' style='height:45px;width:45px;'> <img id='dtgHorasPorLinea_ctl02_imgLineas' src='img/08_mobile_logo_interurb-n.jpg' style='border-width:0px;' /></td><td class='Linea_Interurbana_Nocturna_css' valign='middle' style='height:45px;width:95px;'><span id='dtgHorasPorLinea_ctl02_lblNumLine'>N802</span></td><td class='Descrip_td_sp' align='left' valign='middle' style='height:45px;width:944px;'>TORRES QUEVEDO-GUARDIA CIVIL TRÁFICO</td></tr>"
                + "<tr><td class='Minutes_css' align='center' valign='middle' style='height:45px;width:90px;'>02:15</td><td class='Linea_Interurbana_Nocturna_css' valign='middle' style='height:45px;width:45px;'><img id='dtgHorasPorLinea_ctl03_imgLineas' src='img/08_mobile_logo_interurb-n.jpg' style='border-width:0px;' /></td><td class='Linea_Interurbana_Nocturna_css' valign='middle' style='height:45px;width:95px;'><span id='dtgHorasPorLinea_ctl03_lblNumLine'>N804</span></td><td class='Descrip_td_sp' align='left' valign='middle' style='height:45px;width:944px;'>INTERCAMBIADOR DE ALUCHE</td></tr>"
                + "<tr><td class='Minutes_css' align='center' valign='middle' style='height:45px;width:90px;'>02:37</td><td class='Linea_Interurbana_Nocturna_css' valign='middle' style='height:45px;width:45px;'><img id='dtgHorasPorLinea_ctl04_imgLineas' src='img/08_mobile_logo_interurb-n.jpg' style='border-width:0px;' /> </td><td class='Linea_Interurbana_Nocturna_css' valign='middle' style='height:45px;width:95px;'><span id='dtgHorasPorLinea_ctl04_lblNumLine'>N802</span></td><td class='Descrip_td_sp' align='left' valign='middle' style='height:45px;width:944px;'>TORRES QUEVEDO-GUARDIA CIVIL TRÁFICO</td></tr>"
                + "<tr> <td class='Minutes_css' align='center' valign='middle' style='height:45px;width:90px;'>04:15</td><td class='Linea_Interurbana_Nocturna_css' valign='middle' style='height:45px;width:45px;'><img id='dtgHorasPorLinea_ctl05_imgLineas' src='img/08_mobile_logo_interurb-n.jpg' style='border-width:0px;' /></td><td class='Linea_Interurbana_Nocturna_css' valign='middle' style='height:45px;width:95px;'><span id='dtgHorasPorLinea_ctl05_lblNumLine'>N804</span></td><td class='Descrip_td_sp' align='left' valign='middle' style='height:45px;width:944px;'>INTERCAMBIADOR DE ALUCHE</td></tr>"
                + "</table>";

            // Parse the HTML into a Document
            Document doc = Jsoup.parse(html);
            // Retrieve all elements having the class "Minutes_css"
            Elements tdMinutes = doc.getElementsByClass("Minutes_css");
            // Retrieve all elements having the class "Descrip_td_sp"
            Elements tdDescription = doc.getElementsByClass("Descrip_td_sp");
            // Retrieve the span inside the td at index 2 of each tr
            // (:eq is zero-based, so this is the third cell)
            Elements tdLines = doc.select("tr td:eq(2) span");

            // This loop is for testing purposes
            for (Element line : tdLines) {
                System.out.println("Element line is: " + line.text());
            }
            // These lines verify that you have got the correct number of elements
            System.out.println("tdMinutes size: " + tdMinutes.size());
            System.out.println("tdDescription size: " + tdDescription.size());
            System.out.println("tdLine size: " + tdLines.size());

            // These for-each loops are also for testing purposes; they show what jsoup captured
            for (Element e : tdMinutes) {
                System.out.println("tdMinute is: " + e.text());
            }

            for (Element e : tdDescription) {
                System.out.println("tdDescription: " + e.text());
            }

            /*
             * This is a list of ItemElements, meaning it stores more than one
             * ItemElements object, where each object stores the three values
             * of one table row.
             */
            List<ItemElements> allItemElements = new ArrayList<ItemElements>();

            /*
             * The following loop iterates through the elements captured by jsoup,
             * uses them to construct an ItemElements instance, and at the end of
             * each iteration adds the constructed object to the list.
             */
            for (int i = 0; i < tdMinutes.size(); i++) {
                ItemElements e = new ItemElements();
                e.setItemLine(tdLines.get(i).text());
                e.setItemTime(tdMinutes.get(i).text());
                e.setItemDescription(tdDescription.get(i).text());

                allItemElements.add(e);
            }


            System.out.println("############## all ItemElements size: \n");

            /*
             * The following loops through the list of ItemElements and prints
             * their values.
             */
            int counter = 0;
            for (ItemElements element : allItemElements) {
                System.out.println(counter + "# \n");
                System.out.println("Item line: " + element.getItemLine());
                System.out.println("Item time: " + element.getItemTime());
                System.out.println("Item Description: " + element.getItemDescription());
                counter++; // was i++, but no variable i exists in this scope
            }
        }
}

Output

Element line is: N802
Element line is: N804
Element line is: N802
Element line is: N804
tdMinutes size: 4
tdDescription size: 4
tdLine size: 4
tdMinute is: 26 min.
tdMinute is: 02:15
tdMinute is: 02:37
tdMinute is: 04:15
tdDescription: TORRES QUEVEDO-GUARDIA CIVIL TRÁFICO
tdDescription: INTERCAMBIADOR DE ALUCHE
tdDescription: TORRES QUEVEDO-GUARDIA CIVIL TRÁFICO
tdDescription: INTERCAMBIADOR DE ALUCHE
############## all ItemElements size: 

0# 

Item line: N802
Item time: 26 min.
Item Description: TORRES QUEVEDO-GUARDIA CIVIL TRÁFICO
1# 

Item line: N804
Item time: 02:15
Item Description: INTERCAMBIADOR DE ALUCHE
2# 

Item line: N802
Item time: 02:37
Item Description: TORRES QUEVEDO-GUARDIA CIVIL TRÁFICO
3# 

Item line: N804
Item time: 04:15
Item Description: INTERCAMBIADOR DE ALUCHE
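
If the ListView still needs the ArrayList<HashMap<String, String>> shape from the question (for example, Android's SimpleAdapter accepts a list of maps), the same row-by-row idea can be sketched with one new HashMap per row. This is only a minimal sketch: plain strings stand in for the text() of each jsoup element, and the names RowMaps and toRows are made up for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;

// Hypothetical helper, not part of jsoup or Android.
class RowMaps {
    /** Builds one map per table row from three parallel lists of cell text. */
    static ArrayList<HashMap<String, String>> toRows(List<String> lines,
                                                     List<String> times,
                                                     List<String> destinations) {
        // Use the shortest list so a missing cell cannot cause an IndexOutOfBoundsException.
        int n = Math.min(lines.size(), Math.min(times.size(), destinations.size()));
        ArrayList<HashMap<String, String>> rows = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            // A new map for every row: the keys repeat, but each row has its own map.
            HashMap<String, String> row = new HashMap<>();
            row.put("itemLine", lines.get(i));
            row.put("itemTime", times.get(i));
            row.put("itemDestination", destinations.get(i));
            rows.add(row);
        }
        return rows;
    }

    public static void main(String[] args) {
        // Sample values taken from the first two rows of the table in the question.
        List<String> lines = Arrays.asList("N802", "N804");
        List<String> times = Arrays.asList("26 min.", "02:15");
        List<String> destinations = Arrays.asList(
                "TORRES QUEVEDO-GUARDIA CIVIL TRÁFICO", "INTERCAMBIADOR DE ALUCHE");

        ArrayList<HashMap<String, String>> rows = toRows(lines, times, destinations);
        System.out.println("rows: " + rows.size());
        System.out.println("first line: " + rows.get(0).get("itemLine"));
    }
}
```

Because every row gets its own map, the keys itemLine, itemTime and itemDestination can be identical across rows without overwriting anything.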

Raf
  • In case it is helpful, refer to this great question, which has proper explanations: http://stackoverflow.com/questions/4956844/hashmap-with-multiple-values-under-the-same-key – Raf Aug 30 '14 at 20:45
  • +1. keep up the good work – Alkis Kalogeris Aug 30 '14 at 21:13
  • Thanks for your answer. Sorry for my bad explanation. I want to populate a ListView using the elements obtained from parsing the HTML. Each entry in the ListView must have 3 elements (itemLine, itemTime and itemDestination). I thought of using an ArrayList with a HashMap, but I now think that might not be a good idea. I've edited my question with a new image. What would you use to populate the ListView? – makgyverzx Aug 30 '14 at 21:55
  • I have updated my answer. I think creating a class to represent the three elements would be the best way to go forward. – Raf Aug 30 '14 at 22:07
  • Thanks for your help. But in the class with getters and setters, where are the jsoup items declared? Sorry for my question. I don't understand getters and setters very well. Thanks! – makgyverzx Aug 30 '14 at 22:17
  • JSOUP Items -> List jsouItems = new ArrayList(); I assume that JSOUP Items is a list composed of JSOUP Item that in turn is composed of three elements (itemLine, itemTime, and itemDescription) ... Use the class to model JSOUP Item and then to store more than one item, you will create an ArrayList of type that class ... why don't you copy a sample of the html as well to the question. – Raf Aug 30 '14 at 22:48
  • I've updated my question with HTML and Jsoup selectors. Thanks – makgyverzx Aug 31 '14 at 15:11
  • I have updated my answer with complete Java Class to do exactly what you want. See the output. Don't forget to mark the question as Resolved. – Raf Aug 31 '14 at 20:50
  • Wow!! I have no words! Thank you so much! Now I understand getters, setters and Jsoup a bit more! PS: Is the question understandable now? – makgyverzx Sep 01 '14 at 01:39
  • You are welcome. I have requested an update to your question to make it more understandable. It is subject to approval. Good luck, there is plenty of documentation about JSOUP and Java classes. – Raf Sep 01 '14 at 11:26
0
for (Element item1 : items1) {

         hashMap.put("item1", item1.text());
}

You are using the same key for all the items. Try something like this:

for (int i = 0; i < items1.size(); i++) {

         hashMap.put("item" + i, items1.get(i).text());
}
Alkis Kalogeris
  • Yes. The items in items1 need to have the same key in order to populate the ListView later. And the items in items2 and items3 need to use item2 and item3 respectively. – makgyverzx Aug 30 '14 at 20:19
  • When you call `put` with the same key, you overwrite the previous value. In the end, your hashMap will have only the last item1.text() and not all of them. – Alkis Kalogeris Aug 30 '14 at 20:32
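
The overwriting behaviour described in that last comment can be checked with a small stand-alone sketch. It uses plain strings in place of item1.text(), and the class and method names (PutOverwriteDemo, sameKey, indexedKeys) are invented for this illustration.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical demo class; the values are the line numbers from the question's table.
class PutOverwriteDemo {
    /** Puts every value under the same key, as in the question's loop. */
    static Map<String, String> sameKey(String[] values) {
        Map<String, String> map = new HashMap<>();
        for (String value : values) {
            map.put("item1", value); // each call replaces the previous value
        }
        return map;
    }

    /** Puts every value under its own key, as suggested in this answer. */
    static Map<String, String> indexedKeys(String[] values) {
        Map<String, String> map = new HashMap<>();
        for (int i = 0; i < values.length; i++) {
            map.put("item" + i, values[i]); // distinct key per value
        }
        return map;
    }

    public static void main(String[] args) {
        String[] values = {"N802", "N804", "N802", "N804"};

        Map<String, String> overwritten = sameKey(values);
        System.out.println(overwritten.size());       // prints 1
        System.out.println(overwritten.get("item1")); // prints N804

        Map<String, String> kept = indexedKeys(values);
        System.out.println(kept.size());              // prints 4
    }
}
```

A single shared map can therefore never hold more than one value per key, whichever key naming is chosen.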