I am new to scraping. I am trying to scrape data from a site using jsoup. I want to extract data from tags like <div>, <span>, <p>, etc. Can anybody tell me how to do this?
Viewed 7,844 times
-3

Pshemo

Muhammad Waqas
- Please tell us what you have tried so far; SO is not the place for getting code magically. – Zhedar May 10 '15 at 17:05
- http://jsoup.org/cookbook/ – Jeffrey Bosboom May 10 '15 at 17:11
- I have just made a new project, added the jsoup jar file, and established a connection. I am actually new to this. I want to scrape the data residing in different tags, as I have shown above. Please help me. – Muhammad Waqas May 10 '15 at 17:26
1 Answer
2
Check this. A basic example:
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;

public class Test {
    public static void main(String[] args) throws Exception {
        String url = "https://stackoverflow.com/questions/2835505";
        // Fetch the page and parse it into a document
        Document document = Jsoup.connect(url).get();

        // Text of the first <div>; first() returns null if nothing matches
        Element firstDiv = document.select("div").first();
        if (firstDiv != null) {
            System.out.println(firstDiv.text());
        }

        // The href attribute of every <a> element
        Elements links = document.select("a");
        for (Element link : links) {
            System.out.println(link.attr("href"));
        }
    }
}
This first prints the text of the first div on the page, and then prints the URLs of all links (a elements) on the page.
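Note that attr("href") returns the attribute exactly as written in the page, so relative links stay relative. If absolute URLs are needed, something like the following may help. It is only a sketch: the class name, the sample HTML, and the https://example.com/docs/ base URI are made up for illustration, and it parses an inline string rather than fetching a real page.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

import java.util.ArrayList;
import java.util.List;

public class AbsoluteLinksSketch {
    // Hypothetical sample markup: one relative link, one absolute link, one anchor without href.
    static final String HTML =
        "<a href='/about'>About</a>"
      + "<a href='https://jsoup.org/'>jsoup</a>"
      + "<a name='top'>no href here</a>";

    // Collects the absolute URL of every <a> element that has an href attribute.
    static List<String> absoluteLinks(String html, String baseUri) {
        // The base URI is what relative links are resolved against.
        Document document = Jsoup.parse(html, baseUri);
        List<String> urls = new ArrayList<>();
        for (Element link : document.select("a[href]")) {
            urls.add(link.absUrl("href"));
        }
        return urls;
    }

    public static void main(String[] args) {
        for (String url : absoluteLinks(HTML, "https://example.com/docs/")) {
            System.out.println(url);
        }
    }
}
```

When the document comes from Jsoup.connect(url).get(), the base URI is taken from the fetched page, so absUrl("href") can be used there without passing one explicitly.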
To get divs with a specific class, use Elements elements = document.select("div.someclass")
To get the div with a specific id, use Elements elements = document.select("div#someid")
If you want to go through all the selected elements, do this:
for (Element e : elements) {
    System.out.println(e.text());
    // you can also do other things here
}
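To see the class, id, and plain tag selectors side by side, here is a small sketch that runs them against an inline HTML string instead of a live site. The class name SelectorSketch, the texts helper, and the sample markup (ids and class names included) are invented for this example; only Jsoup.parse, select, and text are jsoup calls.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

import java.util.ArrayList;
import java.util.List;

public class SelectorSketch {
    // Hypothetical sample markup, used so the results do not depend on a real page.
    static final String HTML =
        "<div id='header'>Site title</div>"
      + "<div class='post'><p>First paragraph</p><span>Note A</span></div>"
      + "<div class='post'><p>Second paragraph</p><span>Note B</span></div>";

    // Returns the text of every element matching the given CSS query.
    static List<String> texts(String cssQuery) {
        Document document = Jsoup.parse(HTML);
        List<String> result = new ArrayList<>();
        for (Element e : document.select(cssQuery)) {
            result.add(e.text());
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(texts("div#header"));   // the div with id "header"
        System.out.println(texts("div.post p"));   // <p> elements inside divs with class "post"
        System.out.println(texts("span"));         // every <span> on the page
        System.out.println(texts("div.post").size() + " divs with class post");
    }
}
```

The same queries work on a document fetched with Jsoup.connect(url).get(); only the class names and ids need to match what the target page actually uses.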


Jonas Czech
- Thanks, JonasCz. This covered the first div; what about the other divs, and divs with particular class names and ids? – Muhammad Waqas May 10 '15 at 19:38
- @MuhammadWaqas, if my answer helped you, it would be nice to _accept_ it by clicking the checkmark next to it :-) – Jonas Czech May 11 '15 at 11:26