2

I have a method to download an image from a URL, as shown below.

public static byte[] downloadImageFromURL(final String strUrl) {
    InputStream in;
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    try {
        URL url = new URL(strUrl);
        in = new BufferedInputStream(url.openStream());
        byte[] buf = new byte[2048];
        int n = 0;
        while (-1 != (n = in.read(buf))) {
            out.write(buf, 0, n);
        }
        out.close();
        in.close();
    }
    catch (IOException e) {
        return null;
    }
    return out.toByteArray();
}

I have an image URL that is valid, for example:

https://encrypted-tbn1.gstatic.com/images?q=tbn:ANd9GcTxfYM-hnD-Z80tgWdIgQKchKe-MXVUfTpCw1R5KkfJlbRbgr3Zcg

My problem is that I don't want to download anything if the image does not really exist, as with this URL:

https://encrypted-tbn1.gstatic.com/images?q=tbn:ANd9GcTxfYM-hnD-Z80tgWdIgQKchKe-MXVUfTpCw1R5KkfJlbRbgr3Zcgaaaaabbbbdddddddddddddddddddddddddddd

This image shouldn't be downloaded by my method. So how can I know that the image at a given URL does not really exist? I don't want to validate the URL itself (I don't think that is my solution).

So I searched for that and found these questions: How to check if a URL exists or returns 404 with Java? and Check if file exists on remote server using its URL

But con.getResponseCode() always returns status code "200". This means my method will also download from invalid image URLs. So I printed the result of reading from my buffered stream, like this:

System.out.println(in.read(buf));

An invalid image URL produces "43". So I added these lines of code to my method:

    if (in.read(buf) == 43) {
       return null;
    }

That works, but I don't think it will always hold. Is there another way to do this? Am I right? I would really appreciate any suggestions. This problem has me stuck. Thanks for reading my question.

*UPDATE

I call this download method and save the downloaded image in a directory like this:

            // call method to save image
            FileSupport.saveFile(filePath+".JPG", data);

After that I tried to print the file length:

            File file = new File(filePath+".JPG");
            System.err.println(file.length());

That also produces "43" for invalid image URLs. I want to know why it returns "43" for all of the invalid URLs. What is "43"?

Cataclysm
  • You cannot detect an invalid image without downloading it and looking at it. – akonsu Oct 26 '13 at 03:05
  • @Matt Ball, are you sure? Is there any image? Did you see one? – Cataclysm Oct 26 '13 at 03:05
  • @akonsu, do you mean I can detect it after downloading? If so, how? – Cataclysm Oct 26 '13 at 03:07
  • You can download it and check its format. I am sure there are libraries for Java that can read image files. If reading fails, then it is a bad image. – akonsu Oct 26 '13 at 03:08
  • @akonsu, thanks for your suggestion. Please guide me on how to check its format, or how to check whether it is a bad image. – Cataclysm Oct 26 '13 at 03:12
  • Looks like a duplicate of http://stackoverflow.com/questions/1378199/how-to-check-if-a-url-exists-or-returns-404-with-java – Aurand Oct 26 '13 at 03:15
  • @Aurand, I already described something similar to your link; it is not my solution. I tested it and it always returns status code "200". Have you tested it? – Cataclysm Oct 26 '13 at 03:41
  • Please don't describe my question as a duplicate if you don't understand what I asked and what the main point of my question is. – Cataclysm Oct 26 '13 at 03:43
  • @Cataclysm I got a 404 from the link you posted. – Aurand Oct 28 '13 at 06:36
  • @Aurand, you should know that I am really struggling with this problem. You can see how much effort I put into it. I would not have asked if I had gotten a result from the link you described. I swear I don't get status code 404. I want to get the 404 status code too, but I only get 200. – Cataclysm Oct 31 '13 at 03:08

4 Answers

3

Try this,

Open an image in Notepad or a similar editor and check the first 3-4 characters; they will tell you the format of the image.

When downloading, check the first 3 or 4 characters; that should tell you whether the image is valid or not.

Note: Here, I'm assuming that your requirement is specific to certain types of images and not all possible images.

Some samples:

‰PNG for PNG images, and JFIF (preceded by a few non-printable bytes) for JPG images.

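byte[] tenBytes = new byte[10];
// fill this array with the first 10 bytes of the download.
// ISO-8859-1 maps every byte to exactly one char, so binary data is not mangled.
String str = new String(tenBytes, java.nio.charset.StandardCharsets.ISO_8859_1);
if (str.contains("JFIF")) {
    // JPG
}
if (str.contains("PNG")) {
    // PNG
} ...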

If nothing matches, it's either an invalid image or an image you don't want.

Note that this is untested code; you might have to make adjustments for it to work properly. You should treat it as pseudocode to build your implementation on.

Update: Instead of checking for file size 43, you should be looking for the content (as described above).
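
A minimal sketch of the same idea working on raw bytes instead of a String, assuming only PNG, JPEG and GIF matter. The names ImageSignature and detectFormat are hypothetical and exist only for this illustration. Keep in mind that a tiny placeholder image (such as the 1x1 image mentioned in the comments below) still carries a valid signature, so this check alone would not reject it.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class ImageSignature {

    // Leading bytes ("magic numbers") of three common image formats.
    private static final byte[] PNG = {(byte) 0x89, 'P', 'N', 'G', 0x0D, 0x0A, 0x1A, 0x0A};
    private static final byte[] JPEG = {(byte) 0xFF, (byte) 0xD8, (byte) 0xFF};
    private static final byte[] GIF = "GIF8".getBytes(StandardCharsets.US_ASCII);

    /** Returns "png", "jpeg" or "gif", or null when no known signature matches. */
    public static String detectFormat(byte[] data) {
        if (startsWith(data, PNG)) return "png";
        if (startsWith(data, JPEG)) return "jpeg";
        if (startsWith(data, GIF)) return "gif";
        return null;
    }

    // True when data begins with exactly the bytes in prefix.
    private static boolean startsWith(byte[] data, byte[] prefix) {
        if (data == null || data.length < prefix.length) return false;
        return Arrays.equals(Arrays.copyOf(data, prefix.length), prefix);
    }
}
```

Calling ImageSignature.detectFormat(out.toByteArray()) after the download loop and returning null when it yields null would be one way to wire this in.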

Anantha Sharma
  • As you said, "When downloading check the first 3 or 4 characters, that should tell you if this image is valid or not." How do I figure that out? – Cataclysm Oct 26 '13 at 05:11
  • I updated the answer with some pseudocode; this should help. – Anantha Sharma Oct 26 '13 at 05:15
  • I appreciate your suggestions. I would like to know whether that will be really safe and reliable for both valid and invalid URLs? – Cataclysm Oct 26 '13 at 08:03
  • Based on your original question, you wanted to download images which exist, meaning from URLs which contain an image; the approach I suggested does exactly that. The validity of the URL is a different matter altogether. htt1p://www.some-site-that-doesnt-exist.com is an invalid URL because of the wrong protocol. The sample image links you shared do contain images (albeit a 1x1 image). Are you concerned about the dimensions of an image when you say valid/invalid? – Anantha Sharma Oct 26 '13 at 08:35
  • Yeah, by "invalid" I mean the image really does not exist on the internet: a wrong file name, or deleted by the site or the uploader. I don't mean a URL error; I can use a URLValidator or some supporting Java method for that. – Cataclysm Oct 26 '13 at 09:05
  • In that case you will receive a 404, FileNotFoundException. – Anantha Sharma Oct 26 '13 at 09:48
  • Let us [continue this discussion in chat](http://chat.stackoverflow.com/rooms/40035/discussion-between-cataclysm-and-anantha-sharma) – Cataclysm Oct 26 '13 at 09:54
1

If

con.setRequestMethod("HEAD");

does not help you, you should do something like this (reading from the connection's input stream will fail if the image does not exist):

  HttpURLConnection con = (HttpURLConnection) url.openConnection();
  con.setRequestMethod("GET");
  con.addRequestProperty("User-Agent", "Mozilla/4.0");

  int responseCode = con.getResponseCode(); // if you do not get 200 here, you can stop

  if (responseCode != HttpURLConnection.HTTP_OK) {
    return;
  }
  // Now, read image buffer
  byte[] image = null;

  try{

       InputStream in = new BufferedInputStream(con.getInputStream());
       ByteArrayOutputStream out = new ByteArrayOutputStream();
       byte[] buf = new byte[1024];

       int n = 0;

       while (-1!=(n=in.read(buf)))
       {
          out.write(buf, 0, n);
       }
       out.close();
       in.close();
       image = out.toByteArray();

    } catch (IOException ioe){
       // do whatever you need
    } finally {
       con.disconnect();
    }

Also, this code

 if (in.read(buf) == 43) {
       return null;
 }

does not look good. It is a magic number, and it is not clear what it means.
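
The status-code check above could also look at the Content-Type header before any body is read. This is a rough sketch, assuming the server answers HEAD requests; ImageHeadCheck, looksLikeImage and headSaysImage are hypothetical names, not part of any library. A server that answers 200 with a placeholder image would still pass this check, so it only filters out error pages and non-image responses.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URL;
import java.util.Locale;

public class ImageHeadCheck {

    /** Pure decision helper: status must be 200 and the type must start with "image/". */
    public static boolean looksLikeImage(int status, String contentType) {
        return status == HttpURLConnection.HTTP_OK
                && contentType != null
                && contentType.toLowerCase(Locale.ROOT).startsWith("image/");
    }

    /** Sends a HEAD request (no body is transferred) and applies looksLikeImage. */
    public static boolean headSaysImage(String strUrl) throws IOException {
        HttpURLConnection con = (HttpURLConnection) new URL(strUrl).openConnection();
        try {
            con.setRequestMethod("HEAD");
            con.addRequestProperty("User-Agent", "Mozilla/4.0");
            return looksLikeImage(con.getResponseCode(), con.getContentType());
        } finally {
            con.disconnect();
        }
    }
}
```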

  • @Nish Chandradas Your code may not work in some cases (if the server requires a user agent). You should always set a user agent if you know nothing about the server; if you know for sure that a user agent is not required, you are OK. In this case you have to use HttpURLConnection: `HttpURLConnection con = (HttpURLConnection) url.openConnection(); con.addRequestProperty("User-Agent", "Mozilla/4.0");` – Dmitry Genis Oct 26 '13 at 03:52
  • I tested your code, but no IOException is thrown for the invalid URL. Are you sure? I did not get any error for the invalid image. – Cataclysm Oct 26 '13 at 04:19
  • When I add if (in.read(buf) == 43) { return null; } to my code, it may solve my problem, but as you said it does not look good, and I am not sure it will always work. – Cataclysm Oct 26 '13 at 04:32
  • @Cataclysm Your URL is valid; the image behind the URL is invalid. – Dmitry Genis Oct 26 '13 at 15:35
0

This is how I would do it:

//By Nishanth Chandradas

import java.awt.Image;
import java.io.BufferedInputStream;
import java.io.ByteArrayOutputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.net.URL;
import javax.swing.ImageIcon;

public class downloadimagefromurl {

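    /**
     * Downloads the content at strUrl, writes it to a fixed file, and returns the bytes.
     *
     * @param strUrl the URL to download
     * @return the downloaded bytes, or null if the download failed
     * @throws IOException if writing the file fails
     */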

    public static byte[] downloadImageFromURL(final String strUrl) throws IOException {
        InputStream in;
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try {
            URL url = new URL(strUrl);
            in = new BufferedInputStream(url.openStream());
            byte[] buf = new byte[2048];
            int n = 0;
            while (-1 != (n = in.read(buf))) {
                out.write(buf, 0, n);
            }
            out.close();
            in.close();
        }
        catch (IOException e) {
            return null;
        }
        byte[] response = out.toByteArray();
        FileOutputStream fos = new FileOutputStream("/Users/Nish/Desktop/image.jpg");
        fos.write(response);
        fos.close();
        return response;
    }

    static boolean isImage(String image_path){
          Image image = new ImageIcon(image_path).getImage();
          if(image.getWidth(null) == -1){
                return false;
          }
          else{
                return true;
          }
        }

    public static void main(String[] args) throws IOException {
        downloadImageFromURL("https://encrypted-tbn1.gstatic.com/images?q=tbn:ANd9GcTxfYM-hnD-Z80tgWdIgQKchKe-MXVUfTpCw1R5KkfJlbRbgr3Zcg");
        System.out.println(isImage("/Users/Nish/Desktop/image.jpg"));
    }
}

The output will be true or false depending on whether the download was an image or not.
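
If saving to disk first is not wanted, the downloaded bytes can probably be checked in memory with ImageIO instead. The following is only a sketch: ImageDecodeCheck, isUsableImage and the minSide threshold are assumptions made for illustration. The size threshold is there because a 1x1 placeholder image decodes without any error.

```java
import java.awt.image.BufferedImage;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import javax.imageio.ImageIO;

public class ImageDecodeCheck {

    /**
     * Returns true when the bytes decode as an image that is at least
     * minSide pixels wide and minSide pixels tall.
     */
    public static boolean isUsableImage(byte[] data, int minSide) {
        if (data == null || data.length == 0) return false;
        try {
            BufferedImage img = ImageIO.read(new ByteArrayInputStream(data));
            // ImageIO.read returns null when no registered reader recognizes the data.
            return img != null && img.getWidth() >= minSide && img.getHeight() >= minSide;
        } catch (IOException e) {
            // The format was recognized but the data could not be decoded.
            return false;
        }
    }
}
```

With this, the caller could run isUsableImage(data, 2) on the returned byte array and skip saving when it is false; the value 2 is just an example threshold.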

  • Your code first downloads and saves the image and then checks whether it is an image. I can do that. Do you mean the download is required whether the image behind the URL is valid or invalid? – Cataclysm Oct 26 '13 at 04:18
  • image.getWidth(null) never returns "-1" for the invalid image URL. – Cataclysm Oct 26 '13 at 04:54
-1

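You can add a second catch clause for java.io.FileNotFoundException. It has to come before the catch (IOException e) clause, because FileNotFoundException is a subclass of IOException; otherwise the code will not compile.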

catch (FileNotFoundException e) {
    // Failed
}
Code Monkey