I have a text file containing many image URLs, one per line. I need Java code that downloads each of those images automatically and saves them all into a single folder, keeping each image's original file name. I know how to save the image from a single URL, but how can I modify the code to use multithreading? I tried several code samples I found online, but none of them worked. Any help would be highly appreciated.

The code I used to save a single image is:

import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.URL;

public class SaveImageFromUrl {
    public static void main(String[] args) throws Exception {
        String imageUrl = "http://img.emol.com/2015/04/25/nepalterremoto02ok_2260.jpg";
        String destinationFile = "/home/abc/image.jpg";
        saveImage(imageUrl, destinationFile);
    }

    public static void saveImage(String imageUrl, String destinationFile) throws IOException {
        URL url = new URL(imageUrl);

        // try-with-resources closes both streams even if the copy fails
        try (InputStream is = url.openStream();
             OutputStream os = new FileOutputStream(destinationFile)) {
            byte[] b = new byte[2048];
            int length;

            while ((length = is.read(b)) != -1) {
                os.write(b, 0, length);
            }
        }
    }
}
Tom
    `ImageIO.read` and `ImageIO.write` ([Reading/Loading an Image](http://docs.oracle.com/javase/tutorial/2d/images/loadimage.html) and [Writing/Saving an Image](http://docs.oracle.com/javase/tutorial/2d/images/saveimage.html)). Use some kind of `ExecutorService`, probably a fixed pool service, load all the entries from the text file, add a "task" to the `ExecutorService` for each entry. Run until it's all done (maybe using something like `invokeAll`) – MadProgrammer Jun 21 '15 at 05:45

3 Answers

You can take advantage of pre-existing APIs...

  • Use Files.readAllLines to read the file
  • ImageIO.read and ImageIO.write to download the file
  • The Executor API to run concurrent tasks to help make it quicker

So, basically, downloading the image from each URL is the same process, which can be encapsulated into a simple task.

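import java.awt.image.BufferedImage;
import java.io.File;
import java.io.IOException;
import java.net.URL;
import java.util.concurrent.Callable;
import javax.imageio.ImageIO;

public class DownloadImageFromURLTask implements Callable<File> {

    private final URL url;
    private final String path;

    public DownloadImageFromURLTask(URL url, String path) {
        this.url = url;
        this.path = path;
    }

    @Override
    public File call() throws Exception {
        // ImageIO.read returns null when no registered reader can decode the data
        BufferedImage img = ImageIO.read(url);
        if (img == null) {
            throw new IOException("Not a readable image: " + url);
        }
        // Use the last segment of the URL path as the file name
        String name = url.getPath();
        name = name.substring(name.lastIndexOf("/") + 1);
        File output = new File(path, name);
        // Note: this re-encodes the image as JPEG regardless of its original format
        ImageIO.write(img, "jpg", output);

        return output;
    }

}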

I've used Callable here because it plugs into the Executor API and lets me get the return result, which is the File the image was saved to.
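The task above decodes the image and writes it back out as JPEG. If the goal is to keep each file exactly as it is served, a variant can copy the raw bytes instead. The following is only a sketch: the class name `CopyImageFromURLTask` and the helper `fileNameOf` are hypothetical, and it assumes the last segment of the URL path is a usable file name.

```java
import java.io.File;
import java.io.InputStream;
import java.net.URL;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.concurrent.Callable;

// Hypothetical variant of the task: copies the bytes as-is instead of decoding the image.
class CopyImageFromURLTask implements Callable<File> {

    private final URL url;
    private final Path directory;

    CopyImageFromURLTask(URL url, Path directory) {
        this.url = url;
        this.directory = directory;
    }

    // Assumption: the text after the last '/' in the URL path is the file name.
    static String fileNameOf(URL url) {
        String path = url.getPath();
        return path.substring(path.lastIndexOf('/') + 1);
    }

    @Override
    public File call() throws Exception {
        Path target = directory.resolve(fileNameOf(url));
        // try-with-resources closes the network stream even if the copy fails
        try (InputStream in = url.openStream()) {
            Files.copy(in, target, StandardCopyOption.REPLACE_EXISTING);
        }
        return target.toFile();
    }
}
```

Because it also implements Callable<File>, it can be submitted to an executor in the same way as the task above. URLs that end in a slash would need extra handling, since they yield an empty file name.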

So next, we need to read the URLs from the text file and build a list of tasks to be executed...

        List<String> listOfURLs = Files.readAllLines(new File("ListOfURLs.txt").toPath());
        List<DownloadImageFromURLTask> listOfTasks = new ArrayList<>(listOfURLs.size());
        String path = "/home/abc";
        for (String url : listOfURLs) {
            listOfTasks.add(new DownloadImageFromURLTask(new URL(url), path));
        }

For simplicity, I've just used Files.readAllLines, which loads the whole file into memory as a list of lines.
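A real list of URLs may contain blank lines or entries that are not valid URLs, and a single bad line would otherwise abort the loop with a MalformedURLException. As a rough sketch (the class `UrlListReader` and method `readUrls` are made-up names for illustration), the reading step could skip such lines:

```java
import java.io.IOException;
import java.net.MalformedURLException;
import java.net.URL;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: reads one URL per line, skipping blank and malformed lines.
class UrlListReader {

    static List<URL> readUrls(Path file) throws IOException {
        List<URL> urls = new ArrayList<>();
        for (String line : Files.readAllLines(file)) {
            String trimmed = line.trim();
            if (trimmed.isEmpty()) {
                continue; // ignore empty lines
            }
            try {
                urls.add(new URL(trimmed));
            } catch (MalformedURLException ex) {
                // Report and carry on rather than failing the whole run
                System.err.println("Skipping malformed URL: " + trimmed);
            }
        }
        return urls;
    }
}
```

Whether to skip or fail on a bad line is a design choice; skipping suits a bulk download where a few broken entries should not stop the rest.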

Next, we need to execute all the tasks...

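        ExecutorService executor = Executors.newFixedThreadPool(4);
        List<Future<File>> listOfFutures = executor.invokeAll(listOfTasks);
        // All tasks have finished once invokeAll returns; release the pool so the JVM can exit
        executor.shutdown();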

This uses a fixed-size thread pool, which gives us some room to tune performance: each task is queued until a thread becomes available to run it.

invokeAll is a blocking call here: it won't return until every task has either completed or failed. Convenient.
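To see that behaviour in isolation, here is a small sketch that has nothing to do with images: the class `InvokeAllDemo` and its `squareAll` method are invented for illustration. Each task sleeps briefly and returns a number, and the futures come back in the same order as the submitted tasks.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical demo of invokeAll with a fixed-size pool.
class InvokeAllDemo {

    static List<Integer> squareAll(List<Integer> inputs) throws Exception {
        List<Callable<Integer>> tasks = new ArrayList<>();
        for (Integer n : inputs) {
            // Simulate some slow work, then return a result
            tasks.add(() -> {
                Thread.sleep(50);
                return n * n;
            });
        }

        ExecutorService executor = Executors.newFixedThreadPool(4);
        try {
            // Blocks until every task has completed or failed
            List<Future<Integer>> futures = executor.invokeAll(tasks);
            List<Integer> results = new ArrayList<>();
            for (Future<Integer> future : futures) {
                // Futures are returned in task order, and each one is already done
                results.add(future.get());
            }
            return results;
        } finally {
            // Without this, the pool's threads keep the JVM alive
            executor.shutdown();
        }
    }
}
```

Calling `InvokeAllDemo.squareAll(Arrays.asList(1, 2, 3))` returns the squares in input order, regardless of which worker thread finished first.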

Optionally, you can process the resulting List of Futures, which carry the return results of the Callables.

        for (int index = 0; index < listOfFutures.size(); index++) {
            Future<File> future = listOfFutures.get(index);
            try {
                File file = future.get();
            } catch (ExecutionException ex) {
                String url = listOfURLs.get(index);
                System.out.println("Failed to download image from " + url);
                ex.printStackTrace();
            }
        }

In this example, it's processing the list looking for failed tasks.
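The exception thrown inside a task is wrapped in the ExecutionException and is available through getCause(). A sketch of collecting the failures into a list rather than printing them follows; `FailureReport` and `collectFailures` are hypothetical names, and the labels are assumed to be in the same order as the futures, as they are when the futures come from invokeAll.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Future;

// Hypothetical helper: pairs each failed future with its label (for example, the URL).
class FailureReport {

    static List<String> collectFailures(List<String> labels, List<Future<File>> futures)
            throws InterruptedException {
        List<String> failures = new ArrayList<>();
        for (int index = 0; index < futures.size(); index++) {
            try {
                futures.get(index).get();
            } catch (ExecutionException ex) {
                // getCause() is the exception the task itself threw
                failures.add(labels.get(index) + ": " + ex.getCause());
            }
        }
        return failures;
    }
}
```

The returned list could then be logged or written to a file so that the failed URLs can be retried later.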

Have a look at Reading/Loading an Image, Writing/Saving an Image, Executors, and Reading, Writing, and Creating Files for more details.

MadProgrammer

You can use curl on the command line to fetch multiple files. However, I assume you are trying to learn Java concurrency, so I redid your program using concurrency:

Update: This now reads the file of URLs, downloads each one, and keeps the original file name.

/**
 * Created by justin on 6/20/15.
 */

import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.lang.System;
import java.net.MalformedURLException;
import java.net.URL;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.io.File;
import java.nio.file.Files;

class ImageSaver implements Runnable {

    private final String imageUrl;
    private final String destinationFile;
    private Exception exception;

    public ImageSaver(String imageUrl, String destinationFile) {
        this.imageUrl = imageUrl;
        this.destinationFile = destinationFile;
        this.exception = null;
    }

    public boolean isFail() {
        return exception != null;
    }

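    @Override
    public String toString() {
        // exception is null when the download succeeded, so guard against dereferencing it
        return this.imageUrl + " -> " + this.destinationFile
                + (exception == null ? "" : " Error: " + exception);
    }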

    @Override
    public void run() {
        try {
            URL url = new URL(imageUrl);
            System.out.println("Getting " + imageUrl);

            // try-with-resources closes both streams even if the copy fails
            try (InputStream is = url.openStream();
                 OutputStream os = new FileOutputStream(destinationFile)) {
                System.out.println("Opening " + imageUrl);

                byte[] b = new byte[2048];
                int length;

                while ((length = is.read(b)) != -1) {
                    os.write(b, 0, length);
                }
            }
            System.out.println("Finished getting " + imageUrl);
        } catch (Exception e) {
            exception = e;
        }

    }

}


public class SaveImageFromUrl {
    public static void main(String[] args) throws Exception {
        if (args.length < 2) {
            System.out.println("Usage: <file with urls on each line> <dest path>");
            return;
        }
        String destPath = args[1];
        List<String> listOfURLs = Files.readAllLines(new File(args[0]).toPath());
        ExecutorService executor = Executors.newFixedThreadPool(5);
        List<ImageSaver> save = new ArrayList<ImageSaver>();

        for (String path : listOfURLs) {

            String fn = new File(path).getName();
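            // Join directory and file name with a separator, even if destPath has no trailing slash
            ImageSaver worker = new ImageSaver(path, new File(destPath, fn).getPath());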
            save.add(worker);
            executor.execute(worker);
        }
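        executor.shutdown();
        // Block until every queued download has finished instead of busy-waiting
        executor.awaitTermination(Long.MAX_VALUE, java.util.concurrent.TimeUnit.NANOSECONDS);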
        for (ImageSaver s : save) {
            if (s.isFail()) {
                System.out.println("Failed to download " + s);
            }
        }
        System.out.println("All Done");
    }

}
Justin S.
  • Thanks for your suggestion. I tried Wget -i -o log.txt to fetch the images from the text file of URLs, but it is too slow, and I actually wanted to learn Java multithreading. – irfana karim Jun 21 '15 at 07:09
  • Sorry for not mentioning that the text file is in a local folder and contains 100 image links. I need to get all the images listed in that file. The code above cannot open a text file. – irfana karim Jun 21 '15 at 07:42
  • I updated my solution so it does what you asked for now. The first argument is a file with urls. The second argument is a directory you want to copy things into. – Justin S. Jun 21 '15 at 08:57

To download an image from a URL given as text, you can use the code below:

import java.awt.image.BufferedImage;
import java.io.File;
import java.net.URL;
import javax.imageio.ImageIO;

public class SaveImageFromUrl {

    public static void main(String[] args) {
        String path = "https://upload.wikimedia.org/wikipedia/commons/1/1e/Stonehenge.jpg";
        String destinationFolder = "C:\\Users\\user\\Desktop";

        try {
            // Decode the image from the URL, then re-encode it as JPEG on disk
            BufferedImage tmp = ImageIO.read(new URL(path));
            ImageIO.write(tmp, "jpg", new File(destinationFolder, "image.jpg"));
        } catch (Exception ex) {
            System.out.println("Failed to save image: " + ex);
        }
    }
}
Rafiq