4

I have a null pointer exception because there's some null values at the list adPics. It happens rarely. How is it possible?

(This code downloads images in parallel and saves them locally.)

List<String> downloadAdImages(List<String> imagesUrls, final String itemFolder) {
       final List adPics = new ArrayList<>();
       final ExecutorService executor = newFixedThreadPool(20);
       imagesUrls.forEach(
               picUrl -> executor.submit(() -> {
                   try {
                       String imageNewFileName = imagesUrls.indexOf(picUrl) + "." + getExtension(picUrl);
                       String bigPicUrl = picUrl.replace("b.jpg", "ab.jpg"); // big version
                       copyURLToFile(new URL(bigPicUrl), new File(itemFolder, imageNewFileName), 10, 10);
                       adPics.add(imageNewFileName);
                   } catch (IOException ex) {
                       log.log(Level.WARNING, "Could not download image {0} ({1})", new Object[]{picUrl, ex.getMessage()});
                   }
               }));
       executor.shutdown();
       try {
           executor.awaitTermination(15L, MILLISECONDS);
       } catch (InterruptedException ex) {
           log.log(Level.WARNING, "Could not wait for all images downloads");
       }
       Collections.sort(adPics); // null values at list lead to NPE here. How are there null values?
       return adPics;
   }

Stack trace## Heading ##

Sometimes adPics list has null values. That is the reason for the NPE. But how? Analysing the code executed in the thread, it can't be possible to add a null value. If there was problem downloading the image, it would thrown a IOException. imageNewFileName can't be null.

This code is Java 8 and it uses Apache Commons IO lib.

Luís Soares
  • 5,726
  • 4
  • 39
  • 66
  • 3
    (about the question, not the downvotes) I guess it's because you are adding to a list from multiple threads, but the list is not properly synchronized. Funny things can happen when you do that. – sstan Aug 11 '16 at 18:09
  • 5
    Are you sure the `awaitTermination` call didn't timeout, and you start sorting `adPics` while concurrently adding elements into it? No idea why that would throw a NPE but `ArrayList` isn't thread-safe so I guess anything can happen. – Tunaki Aug 11 '16 at 18:09
  • Hmm.. thanks @sstan I'll try using http://stackoverflow.com/questions/11360401/java-synchronized-list – Luís Soares Aug 11 '16 at 18:13
  • If it times out, I'd see the message. Still, I'll read more about that method. – Luís Soares Aug 11 '16 at 18:14
  • 1
    @LuísSoares You wouldn't. You would see the message only if the current thread is interrupted. See https://docs.oracle.com/javase/8/docs/api/java/util/concurrent/ExecutorService.html#awaitTermination-long-java.util.concurrent.TimeUnit- You don't check the return value of `awaitTermination` so you don't know if it timed out or not. – Tunaki Aug 11 '16 at 18:14
  • 1
    It would be easy to figure out with a conditional breakpoint on `== null`, wouldn't be? –  Aug 11 '16 at 18:15
  • Possible duplicate of [Is ArrayList.add(int index, E element) thread unsafe?](http://stackoverflow.com/questions/13413481/is-arraylist-addint-index-e-element-thread-unsafe) – sstan Aug 11 '16 at 18:18
  • @sstan this is not a duplicate. – Paul Wasilewski Aug 11 '16 at 18:38
  • @Paul: I don't completely disagree with you :). It may only be part of the problem. – sstan Aug 11 '16 at 18:41
  • 1
    Instead of posting images of text, please copy and paste the actual text into your question. Images cannot participate in search results, they are useless to sight impaired users, and they are considerably less legible than native browser text. – VGR Aug 11 '16 at 21:55

2 Answers2

2

The method awaitTermination doesn't stop your running threads. It only waits till all threads are completed or the timeout is reached. Therefore your threads still adding items to your list.

Also you should consider that the download and copying to file system is running even when the timeout reached.

A simple but not perfect solution would be to set a flag when the timeout is reached and check the flag before adding more items.

A better approach would be interrupting the threads after the timeout is reached. This should also include interrupting the download and file copying.

Paul Wasilewski
  • 9,762
  • 5
  • 45
  • 49
  • very interesting. what would be the cleanest way to do that? – Luís Soares Aug 11 '16 at 18:38
  • and what about people commenting about concurrent adding being the problem? Is it a mix of both problems? are they independent? – Luís Soares Aug 11 '16 at 18:39
  • 1
    If you are using a synchronized list you will somehow masquerade your problem. It should avoid the NPE. But in the background the downloading, copying and adding is still active. At the end your list would not be sorted for sure anymore, because new items could be still added after the sorting. – Paul Wasilewski Aug 11 '16 at 18:48
  • So.. what's the best way to interrupt the downloading/copying? Does that even make sense? In alternative... do you mean I should check the time passed before adding to the list? – Luís Soares Aug 11 '16 at 18:50
  • here: https://commons.apache.org/proper/commons-io/javadocs/api-2.5/org/apache/commons/io/FileUtils.html#copyURLToFile(java.net.URL,%20java.io.File,%20int,%20int) – Luís Soares Aug 11 '16 at 18:51
  • You are extending your class with FileUtils? – Paul Wasilewski Aug 11 '16 at 18:53
  • Let us [continue this discussion in chat](http://chat.stackoverflow.com/rooms/120732/discussion-between-luis-soares-and-paul-wasilewski). – Luís Soares Aug 11 '16 at 18:54
  • *If you are using a synchronized list you will somehow masquerade your problem*. I hope you're not implying that OP should not synchronize the list. OP should absolutely synchronize the list, but also fix the awaitTermination problem. – sstan Aug 11 '16 at 21:13
  • 1
    @sstan sure accessing a list from multiple threads should be synchronized :) – Paul Wasilewski Aug 11 '16 at 21:16
  • thank you! could not be better explained! Learned a lot about multi-threading now. – Luís Soares Aug 12 '16 at 00:29
0

There are several problems with your code. Paul already pointed out one problem. Other problem is that you are accessing List adPics concurrently.

To solve your problem without using synchronized lists you could make a List[CompletableFuture] and at the end call CompletableFuture.allOf. Each CompletableFuture should return image filename but before trying to download an image it should check the time to see if you want to start that operation. If not you could complete that CompletableFuture with null value. In CompletableFuture.allOf you can make a new List without null values and sort that list.

CodesInTheDark
  • 151
  • 1
  • 9
  • Using a synchronized list will not solve the problem it will only masquerade the underlying issue and leads to other issues. For example the final list is not sorted for sure. – Paul Wasilewski Aug 11 '16 at 20:04
  • I agree, that's the reason I gave an idea how to solve it without using synchronized list. – CodesInTheDark Aug 11 '16 at 21:09