I have the following code to scrape the "href" attribute of every link in the game grid on the PlayStation Store page:
https://store.playstation.com/#!/es-...s-store%3Ahome
// Imports this snippet needs (assuming the jsoup library)
import java.util.ArrayList;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;

String url = "https://store.playstation.com/#!/es-es/ps4/cid=STORE-MSF75508-PS4CAT%7Cplatform~ps4%7Cname~asc/";
String url2 = "?smcid=nav%3Aps-store%3Ahome";
int juegos_totales = 0;   // total number of links (games) found
ArrayList<String> all_links = new ArrayList<>();
int z = 0;                // number of grid entries visited
for (int i = 1; i < 50; i++) {
    // Build the URL for page i
    String urlPage = url + i + url2;
    System.out.println("Checking entry: " + urlPage);
    if (getStatusConnectionCode(urlPage) == 200) {
        Document document = getHtmlDocument(urlPage);
        Elements entradas = document.select("div.gridViewportPaneWrapper li.cellGridGameStandard");
        // Iterate over each of the entries
        for (Element elem : entradas) {
            Elements links = elem.getElementsByTag("a");
            for (Element link : links) {
                all_links.add(link.attr("href"));
                juegos_totales++;
            }
            z++;
        }
        System.out.println("There are a total of " + juegos_totales + " games");
    }
}
It scrapes nothing, and I don't know why. If I try to scrape the "PS4" title instead, it works. This code should collect all the links on the page.
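
One thing that may be worth checking: everything after the "#" in a URL is a fragment, and an HTTP client does not send the fragment to the server. The sketch below uses only java.net.URI to show which part of urlPage would actually be requested. The class name FragmentCheck and its two methods are hypothetical, and since getStatusConnectionCode and getHtmlDocument are not shown, this is only an assumption about what they end up requesting.

```java
import java.net.URI;

public class FragmentCheck {
    // Hypothetical helper: rebuilds the URL without its "#..." fragment,
    // i.e. the scheme, host, path and query that a request would carry.
    static String requestedPart(String url) {
        URI uri = URI.create(url);
        String path = uri.getRawPath() == null ? "" : uri.getRawPath();
        String query = uri.getRawQuery() == null ? "" : "?" + uri.getRawQuery();
        return uri.getScheme() + "://" + uri.getRawAuthority() + path + query;
    }

    // Hypothetical helper: returns the fragment (the text after "#"), or null if there is none.
    static String fragmentPart(String url) {
        return URI.create(url).getRawFragment();
    }

    public static void main(String[] args) {
        String url = "https://store.playstation.com/#!/es-es/ps4/cid=STORE-MSF75508-PS4CAT%7Cplatform~ps4%7Cname~asc/";
        String url2 = "?smcid=nav%3Aps-store%3Ahome";
        for (int i = 1; i <= 2; i++) {
            String urlPage = url + i + url2;
            System.out.println("Requested: " + requestedPart(urlPage));
            System.out.println("Fragment:  " + fragmentPart(urlPage));
        }
    }
}
```

For these URLs the page number and the "?smcid=..." part both sit inside the fragment, so every iteration reduces to the same request for the site root. If the store fills the game grid in with JavaScript after loading (which the "#!" style of URL suggests, though that is a guess), the li.cellGridGameStandard elements would not be in the HTML that gets downloaded, which would be consistent with the static "PS4" title being found while the links are not.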