I am trying to get all of the links to the songs shown in the panel on the side of Google's results page after you search for a band and album, e.g. https://www.google.com/search?q=disturbed+asylum&ie=utf-8&oe=utf-8

So I have tried quite a bit on my own. I've used:

File input = new File("/tmp/input.html");
Document doc = Jsoup.parse(input, "UTF-8", "http://example.com/");
Elements links = doc.select("a[href]");

and then printed out all of the links, but this didn't return the links in the side panel. I then tried fetching the HTML and searching it for the table, but for some reason the table wasn't in what I pulled. Does anyone have any suggestions?

Kasarrah

1 Answer

I think your question has already been asked here: java crawlers

Is that what you were looking for?

Community
almeynman
  • I already have a way to get the HTML. The problem is that, for some reason, it may be treating the right half of the Google page as a separate section, because the table doesn't show up there and the links aren't grabbed – Kasarrah Apr 09 '15 at 19:20
  • So you cannot extract links? – almeynman Apr 09 '15 at 19:52
  • I can get a list of links, but it's not giving me the links in the table I mentioned – Kasarrah Apr 09 '15 at 21:16
  • Try using XPath. You can access your element by its class name, so in your case it would be `xpath("//div[@class=\"someclass\"]")` – almeynman Apr 09 '15 at 21:56
  • [related article](http://xmlquerying.blogspot.com/2012/12/declarative-crawling-with-java.html) – almeynman Apr 09 '15 at 21:59
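
To make the XPath suggestion in the comments concrete, here is a minimal sketch using only the JDK's `javax.xml.xpath` API. The class name `SidePanelLinks`, the helper `extractLinks`, and the `SAMPLE` markup are all hypothetical, and `someclass` is just the placeholder from the comment above, not Google's real class name. It also assumes the input is well-formed XML; a real results page would first have to be parsed by an HTML parser and converted to a W3C DOM, and it would only work if the side panel is actually present in the HTML that was fetched.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class SidePanelLinks {

    // Hypothetical, well-formed markup standing in for the saved results page.
    // "someclass" is a placeholder class name, not the one Google uses.
    static final String SAMPLE =
        "<html><body>"
        + "<div class=\"main\"><a href=\"/result/1\">Result</a></div>"
        + "<div class=\"someclass\">"
        + "<a href=\"/song/one\">Song one</a>"
        + "<a href=\"/song/two\">Song two</a>"
        + "</div>"
        + "</body></html>";

    // Returns the href of every <a> nested inside a <div> whose class
    // attribute is exactly cssClass (no partial or multi-class matching).
    static List<String> extractLinks(String xml, String cssClass) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document doc = builder.parse(new InputSource(new StringReader(xml)));

        XPath xpath = XPathFactory.newInstance().newXPath();
        String expr = "//div[@class='" + cssClass + "']//a/@href";
        NodeList nodes = (NodeList) xpath.evaluate(expr, doc, XPathConstants.NODESET);

        List<String> hrefs = new ArrayList<>();
        for (int i = 0; i < nodes.getLength(); i++) {
            hrefs.add(nodes.item(i).getNodeValue());
        }
        return hrefs;
    }

    public static void main(String[] args) throws Exception {
        // Print only the links found inside the targeted container.
        for (String href : extractLinks(SAMPLE, "someclass")) {
            System.out.println(href);
        }
    }
}
```

Scoping the expression to one container, rather than selecting every `a[href]` on the page, is what separates the panel's links from the ordinary results. If the expression returns nothing for a real page, it is worth checking whether the container appears in the downloaded HTML at all.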