I have to fill a BufferedImage with the content of a custom Image, because I want to display my custom image in a JFrame.

I used a simple for loop, until I checked my code with a profiler:

for(int x = 0; x < width; x++)
    for(int y = 0; y < height; y++)
        bufferedImage.setRGB(x, height-y-1, toIntColor( customImage.get(x,y) ));

That worked, but I decided to try doing it concurrently. This code divides the image into columns and should copy the columns in parallel (code snippet simplified):

final ExecutorService pool = Executors.newCachedThreadPool();
final int columns = Runtime.getRuntime().availableProcessors() +1;
final int columnWidth = getWidth() / columns;

for(int column = 0; column < columns; column++){
        final Rectangle tile = Rectangle.bottomLeftRightTop(
                0,
                columnWidth*column,
                columnWidth*(column+1),
                height
        );

        pool.execute(new ImageConverter(tile));
}


pool.shutdown();
pool.awaitTermination( timeoutSeconds, TimeUnit.SECONDS);

ImageConverterRunnable:

private final class ImageConverter implements Runnable {
    private final Rectangle tile;
    protected ImageConverter(Rectangle tile){ this.tile = tile;  }

    @Override public void run(){
        for(int x = tile.left; x < tile.right; x++)
            for(int y = tile.bottom; y < tile.top; y++)
                bufferedImage.setRGB(x, height-y-1, toIntColor( customImage.get(x,y) ));
    }
}

I noticed that the concurrent solution took about two to three times longer than the simple for loop. I already looked for questions like this and googled, but I didn't find anything.

Why does it take so long? Is it because of the awaitTermination() line? Is there maybe a better solution for converting images?

Thanks in advance, Johannes :)

EDIT:

I have performed some testing. All measured conversions were preceded by a warm-up of 3000 image conversions.

The simple for loop takes 7 to 8 milliseconds to copy the bitmap.

The parallel image copying took 20 to 24 milliseconds per image. Without warm-up it took 60 milliseconds.

Johannes
  • Iterating across your pixels the other way around (filling in rows rather than columns) may provide better locality and therefore could be faster. There could be block copying operations too that are a lot faster than any pixel-by-pixel solution. – biziclop Aug 13 '15 at 13:34
  • Also, have one Executor available all the time and use a CountDownLatch to wait for completion of each image transfer – forty-two Aug 13 '15 at 13:42
  • See this related [example](http://stackoverflow.com/a/25043676/230513) for correct synchronization. – trashgod Aug 13 '15 at 15:52
  • Thank you, I already tried keeping an Executor, but I didn't know about CountDownLatch. I will change that. @biziclop I will switch the nested for loops. BTW, do you know why rows may be faster than columns? – Johannes Aug 13 '15 at 16:21
  • @Johannes Because normally images (and `BufferedImage` in particular) are stored in that order, row by row. So if you access the data in the same order, it's more likely to already be in a cache than if you hop across them in a different order. – biziclop Aug 13 '15 at 16:46
  • After testing, I feel like the things you said are only small optimizations. I'm certain there's a bug in my code which slows the conversion down by something like 300% – Johannes Aug 13 '15 at 16:50
  • @biziclop My custom image is stored the other way round, so I assume the speed gained when writing the BufferedImage will be lost when reading my custom image. I'll try that out anyway – Johannes Aug 13 '15 at 17:00
  • Because I draw the BufferedImage with a canvas inside my JFrame, I considered drawing directly to the canvas from my custom image. I'll have to measure that soon – Johannes Aug 13 '15 at 17:02
  • @Johannes Yeah, that could be a problem. Even without that it isn't guaranteed to work, I could get about 10% of difference on my machine (even when filling the image with a constant colour). – biziclop Aug 13 '15 at 17:05
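
The row-order and block-copy suggestions from the comments above can be sketched roughly as follows. This is only a sketch under assumptions, not the asker's code: `PixelSource` and `rgbAt` are hypothetical stand-ins for the custom image and `toIntColor(customImage.get(x, y))`, and the target is assumed to be a `TYPE_INT_RGB` image created with the plain `BufferedImage` constructor, so that its raster is backed by one `int[]` laid out row by row.

```java
import java.awt.image.BufferedImage;
import java.awt.image.DataBufferInt;

public class RowMajorCopy {

    /** Hypothetical stand-in for the custom image: origin at bottom-left, packed RGB per pixel. */
    interface PixelSource {
        int rgbAt(int x, int y);
    }

    static BufferedImage convert(PixelSource source, int width, int height) {
        BufferedImage target = new BufferedImage(width, height, BufferedImage.TYPE_INT_RGB);
        // Assumption: TYPE_INT_RGB images built this way expose a DataBufferInt
        // holding one int per pixel, stored row by row.
        int[] pixels = ((DataBufferInt) target.getRaster().getDataBuffer()).getData();

        // Outer loop over rows, so writes walk through the array sequentially.
        for (int y = 0; y < height; y++) {
            int rowStart = (height - y - 1) * width; // same vertical flip as in the question
            for (int x = 0; x < width; x++) {
                pixels[rowStart + x] = source.rgbAt(x, y);
            }
        }
        return target;
    }

    public static void main(String[] args) {
        // Synthetic source: x in the green channel, y in the blue channel.
        BufferedImage image = convert((x, y) -> (x << 8) | y, 640, 480);
        System.out.println(Integer.toHexString(image.getRGB(10, 0)));
    }
}
```

Writing into the backing array skips the per-pixel work that `setRGB` does. One caveat: taking the array out of the `DataBuffer` may prevent Java2D from hardware-accelerating that image, so it seems worth measuring both variants.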

1 Answer

Using threads does not speed up execution (a common misconception). My assumption is that the overhead of using threads is causing your program to run slower. Context switches are very expensive.

In general, threads are useful when something blocks. I do not see any blocking code in what you provide.

user489041
  • Thank you :) So you suggest just sticking to the for loop instead of trying to optimize the concurrent design? The conversion between the images is the only bottleneck in my code, so I really want to make it better. – Johannes Aug 13 '15 at 13:33
  • Of course threads can speed up execution, if the problem can be parallelized, is big enough, and you have cores available – forty-two Aug 13 '15 at 13:39
  • I use threading in that very same project, and it is always a performance gain, but apparently not in the image conversion method. I think image copying can be parallelized, but there's a bug hiding in my code – Johannes Aug 13 '15 at 13:47
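
For the parallel variant discussed in these comments, one possible shape is a long-lived pool plus a `CountDownLatch` per image, as suggested under the question. The sketch below rests on assumptions: `BandedCopy`, `PixelSource` and `rgbAt` are hypothetical names, the target must be an int-backed image such as `TYPE_INT_RGB`, and the image is split into horizontal bands of rows rather than columns. The band bounds are computed so that no rows are left over; the `getWidth() / columns` arithmetic in the question's simplified snippet would appear to skip the rightmost pixels whenever the width is not a multiple of `columns`.

```java
import java.awt.image.BufferedImage;
import java.awt.image.DataBufferInt;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class BandedCopy {

    /** Hypothetical stand-in for the custom image: origin at bottom-left, packed RGB per pixel. */
    interface PixelSource {
        int rgbAt(int x, int y);
    }

    private final int bands;
    private final ExecutorService pool; // created once, reused for every image

    BandedCopy(int bands) {
        this.bands = bands;
        this.pool = Executors.newFixedThreadPool(bands);
    }

    /** Copies source into target; target must be int-backed (e.g. TYPE_INT_RGB). */
    void copy(PixelSource source, BufferedImage target) {
        int width = target.getWidth();
        int height = target.getHeight();
        int[] pixels = ((DataBufferInt) target.getRaster().getDataBuffer()).getData();
        CountDownLatch done = new CountDownLatch(bands);

        for (int band = 0; band < bands; band++) {
            // Integer arithmetic spreads leftover rows across bands, so every row is covered.
            int from = band * height / bands;
            int to = (band + 1) * height / bands;
            pool.execute(() -> {
                try {
                    for (int row = from; row < to; row++) {
                        int y = height - row - 1; // same vertical flip as in the question
                        int rowStart = row * width;
                        for (int x = 0; x < width; x++) {
                            pixels[rowStart + x] = source.rgbAt(x, y);
                        }
                    }
                } finally {
                    done.countDown(); // always signal, even if rgbAt throws
                }
            });
        }

        try {
            done.await(); // countDown() happens-before await() returns, so the writes are visible
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while copying image", e);
        }
    }

    void shutdown() {
        pool.shutdown();
    }

    public static void main(String[] args) {
        BandedCopy copier = new BandedCopy(Runtime.getRuntime().availableProcessors());
        BufferedImage target = new BufferedImage(640, 480, BufferedImage.TYPE_INT_RGB);
        copier.copy((x, y) -> (x << 8) | y, target);
        copier.shutdown();
        System.out.println(Integer.toHexString(target.getRGB(10, 0)));
    }
}
```

Whether this beats the single-threaded loop depends on image size and on how expensive each pixel conversion is; for a copy that already finishes in a few milliseconds, the hand-off to the pool may still dominate, so it should be measured rather than assumed.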