I'm trying to use Jsoup to extract some information from a website, but I don't know how to access the date content at the bottom of the code. I used the select command with "div", but it doesn't work. How can I do this? Thanks!
1 Answer
From the image in your question, it looks like you are trying to fetch the date that follows the 'br'. A br is just a line break, so it has no content of its own and a CSS selector cannot target anything inside it. As a workaround, you could take the text under the "small" tag, split it, and take the second part. You will need to inspect your DOM more closely and check for cases where this approach fails. For the limited HTML visible in the image, you can use the following:
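// doc is assumed to be an already parsed org.jsoup.nodes.Document
String[] text = doc.select("div > small").text().split("\"");
// text[1] is the part after the first double quote; check text.length first if the markup may differ
System.out.println(text[1]);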
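
If the text does not actually contain a quote character, splitting on the line break itself may be more reliable. The following is only a sketch under that assumption: it uses plain Java and works on the inner HTML of the "small" element as a string (with Jsoup this could come from `Element.html()`). The class name, the method name `textAfterFirstBr`, and the sample markup are hypothetical and not taken from the real page.

```java
import java.util.Optional;

class BrSplitSketch {
    // Returns the trimmed text that follows the first <br> in innerHtml,
    // or Optional.empty() when there is no <br> or nothing after it.
    static Optional<String> textAfterFirstBr(String innerHtml) {
        // Limit 2 splits only at the first <br>; the pattern matches <br>, <br/> and <BR />.
        String[] parts = innerHtml.split("(?i)<br\\s*/?>", 2);
        if (parts.length < 2) {
            return Optional.empty();
        }
        String tail = parts[1].trim();
        return tail.isEmpty() ? Optional.empty() : Optional.of(tail);
    }

    public static void main(String[] args) {
        // Hypothetical inner HTML of the <small> element; not taken from the real page.
        String innerHtml = "Some label<br>01/01/2018";
        System.out.println(textAfterFirstBr(innerHtml).orElse("(nothing after <br>)"));
    }
}
```

Returning an `Optional` instead of indexing into the array avoids an `ArrayIndexOutOfBoundsException` when the markup does not have the expected shape.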

Krishna
- I tried that but I couldn't get anything. I'm really new to web scraping and don't know much about it, but I think it is related to the website and how it was configured (http://bulletin.evoting.cl/?id=CAbjRuuL). Thank you so much for your help! – Tamara Andrea Vejar Ferrada Jun 30 '18 at 01:34
- You don't see anything fetched because the page content is loaded by JavaScript. If you inspect the page while it is loading, you will notice a few scripts that load the content, such as src="inline.ec865decf2a2cd6e672c.bundle.js". Jsoup on its own will not help in this case: it only parses the HTML it receives and does not execute scripts the way a browser does. You can find more on how to solve this in this SO question: https://stackoverflow.com/questions/7488872/page-content-is-loaded-with-javascript-and-jsoup-doesnt-see-it – Krishna Jun 30 '18 at 03:50
- With Selenium I could do it. Thanks a lot! – Tamara Andrea Vejar Ferrada Jul 02 '18 at 15:52
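
To tell whether a page falls into the JavaScript-rendered case described in the comments, one rough check is to look for the expected text in the raw HTML, before any script has run. The sketch below uses plain Java on an HTML string (with Jsoup the raw response could be obtained with `Jsoup.connect(url).execute().body()`). The class name, the method name `appearsInStaticHtml`, the `app-root` element, and the sample text are assumptions for illustration; only the script file name comes from the comment above. It is a regex-based heuristic, not an HTML parser.

```java
import java.util.regex.Pattern;

class StaticHtmlCheck {
    // Matches whole <script>...</script> and <style>...</style> blocks.
    private static final Pattern SCRIPT_OR_STYLE =
            Pattern.compile("(?is)<(script|style)\\b.*?</\\1\\s*>");
    // Matches any remaining tag.
    private static final Pattern TAG = Pattern.compile("<[^>]*>");

    // Rough check: does expectedText appear as visible text in the raw HTML?
    static boolean appearsInStaticHtml(String rawHtml, String expectedText) {
        String noScripts = SCRIPT_OR_STYLE.matcher(rawHtml).replaceAll(" ");
        String textOnly = TAG.matcher(noScripts).replaceAll(" ");
        String normalized = textOnly.replaceAll("\\s+", " ").trim();
        return normalized.contains(expectedText);
    }

    public static void main(String[] args) {
        // Hypothetical shell page: an empty root element plus a script bundle.
        String shell = "<html><body><app-root></app-root>"
                + "<script src=\"inline.ec865decf2a2cd6e672c.bundle.js\"></script>"
                + "</body></html>";
        // Hypothetical static page where the text is already in the markup.
        String staticPage = "<html><body><div><small>Some label<br>01/01/2018</small></div></body></html>";

        System.out.println(appearsInStaticHtml(shell, "01/01/2018"));      // false
        System.out.println(appearsInStaticHtml(staticPage, "01/01/2018")); // true
    }
}
```

When the check returns false for text that is visible in the browser, the content is most likely injected by scripts, and a tool that drives a real browser, such as Selenium, is needed.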