
I'm trying to analyze different websites to find all of the images they contain.

For this I'm using Jsoup with the following code:

      Elements imagePath = doc.select("[src]");
      for (Element e : imagePath) {
          System.out.println(e.attr("abs:src")); // absolute URL of each src attribute
      }

When I run this on a domain name I get a lot of images, but if I run the same thing on a subdomain I get the same images.

For instance, the site http://www.example.com would produce the same output as http://www.example.com/page1.

My question is: does Jsoup find all images for all sub-sites of a domain, or is it just random luck that it produces the same output?

Marc Rasmussen

1 Answer


Are you updating your Document object? My guess (since no relevant code is provided) is that you parsed your domain into doc and did not do the same for the subdomain. Jsoup applies your select only to the current document node and has nothing to do with subdomains, pages, etc. (the document doesn't even have to be a website).
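To illustrate, here is a minimal sketch of parsing each page into its own Document before selecting. The class name PerPageImages, the helper imageUrls, and the inline HTML snippets are made up for the example, and it narrows the selector to img[src] on the assumption that only images are wanted. It is not your actual code, just one way the per-document behaviour could be checked:

```java
import java.util.ArrayList;
import java.util.List;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

public class PerPageImages {

    // Collects the absolute URLs of the <img> elements in this one document only.
    // (Hypothetical helper; the selector is narrowed to images as an assumption.)
    static List<String> imageUrls(Document doc) {
        List<String> urls = new ArrayList<>();
        for (Element img : doc.select("img[src]")) {
            urls.add(img.attr("abs:src"));
        }
        return urls;
    }

    public static void main(String[] args) {
        // Inline HTML keeps the sketch offline; for a live site you would call
        // Jsoup.connect(url).get() once per URL instead.
        Document home = Jsoup.parse("<img src=\"/logo.png\">", "http://www.example.com/");
        Document page1 = Jsoup.parse("<img src=\"/photo.png\">", "http://www.example.com/page1");

        // Each call only sees the document it is given.
        System.out.println(imageUrls(home));
        System.out.println(imageUrls(page1));
    }
}
```

If both calls print the same list on your real pages, the likely cause is that the same Document is being reused, or that the two pages really do share the same images (a common header, for example).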

Antoniossss
  • I just figured out the problem. You are right, it's only for one page at a time. Thank you! :) – Marc Rasmussen Aug 06 '13 at 08:14
  • @Antoniossss, can you look at my other Jsoup question? Maybe you have an answer for that as well: http://stackoverflow.com/questions/18075085/jsoup-getting-background-image-path-from-css – Marc Rasmussen Aug 06 '13 at 08:42