I want to create a small program in which the user provides a URL and gets back the images present on that webpage. Below is the code I have started with:

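    // Needs: java.io.BufferedReader, java.io.InputStreamReader,
    //        java.net.URL, java.nio.charset.StandardCharsets
    URL postURL = new URL(url);
    StringBuilder sb = new StringBuilder();
    // try-with-resources closes the reader and the underlying stream.
    // UTF-8 is an assumption; the page may declare a different charset.
    try (BufferedReader br = new BufferedReader(new InputStreamReader(
            postURL.openStream(), StandardCharsets.UTF_8))) {
        String line;
        while ((line = br.readLine()) != null) {
            // readLine() strips the line terminator, so add it back.
            sb.append(line).append('\n');
        }
    }
    loggerService.log(sb.toString());
    return null;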

This gives me the HTML of the webpage, which I can then search for <img> tags. But what if the URL I am given is a direct image link, which returns the image itself rather than HTML? How should I handle that? Also, I'd like to know which APIs I could use for this.

Thanks in advance.

Mr Lister
Pallav Jha
  • Use a headless browser to retrieve the proper DOM, then get the images from there. – Luiggi Mendoza Nov 13 '15 at 15:48
  • One hack you could try: look for META tags; if there are none, you could assume the content is not HTML. – kosa Nov 13 '15 at 15:49
  • Here is an answer with what I am suggesting (see the answer by kit): http://stackoverflow.com/questions/4602060/how-can-i-get-mime-type-of-an-inputstream-of-a-file-that-is-being-uploaded – kosa Nov 13 '15 at 15:57
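
One way to act on the MIME-type idea from the comments is to ask the server what it is about to send before reading the body. The following is only a minimal sketch, assuming the URL is http or https and that the server reports an accurate Content-Type. The class name `ImageLinkCheck` and both method names are hypothetical, chosen for illustration.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URL;
import java.util.Locale;

// Hypothetical helper, not part of the original question's code.
final class ImageLinkCheck {

    /** Returns true when a Content-Type header value names an image type, e.g. "image/png". */
    static boolean isImageContentType(String contentType) {
        if (contentType == null) {
            return false;
        }
        // Drop parameters such as "; charset=UTF-8" before comparing.
        String mediaType = contentType.split(";", 2)[0].trim().toLowerCase(Locale.ROOT);
        return mediaType.startsWith("image/");
    }

    /** Sends a HEAD request (headers only) and inspects the reported Content-Type. */
    static boolean isDirectImageLink(String url) throws IOException {
        // Assumption: the URL is http/https, otherwise this cast fails.
        HttpURLConnection connection = (HttpURLConnection) new URL(url).openConnection();
        try {
            connection.setRequestMethod("HEAD");
            return isImageContentType(connection.getContentType());
        } finally {
            connection.disconnect();
        }
    }
}
```

If `isDirectImageLink(url)` returns true, the URL itself can be treated as the only image; otherwise the response can be parsed as HTML. Some servers reject HEAD requests or report a wrong type, so a fallback to a normal GET may be needed.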

1 Answer

I think jsoup is what you're looking for, at least in terms of an API. It makes HTML parsing very convenient.

You can read more about Jsoup: here
You can find some tutorials and detailed explanation: here
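
As a rough illustration of what that could look like for this question, here is a minimal sketch that collects the absolute URL of every <img> on a page. It assumes the jsoup library is on the classpath; the class name `ImageExtractor` and the method name `imageUrls` are hypothetical.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

// Hypothetical helper for illustration; requires the jsoup dependency.
final class ImageExtractor {

    /** Collects the absolute URL of every <img> element that has a src attribute. */
    static List<String> imageUrls(Document document) {
        List<String> urls = new ArrayList<>();
        for (Element img : document.select("img[src]")) {
            // absUrl resolves a relative src against the document's base URI.
            String absolute = img.absUrl("src");
            if (!absolute.isEmpty()) {
                urls.add(absolute);
            }
        }
        return urls;
    }

    /** Fetches the page at pageUrl and returns the image URLs found in it. */
    static List<String> imageUrls(String pageUrl) throws IOException {
        return imageUrls(Jsoup.connect(pageUrl).get());
    }
}
```

Calling `ImageExtractor.imageUrls(url)` with the user's URL would then return the list of image addresses. This only sees images present in the static HTML; images added by JavaScript would need something like the headless-browser approach mentioned in the comments.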

JGCW