Unacceptable Performance Reading Transparent .png Pixel by Pixel

Question

I'm creating a tool that detects sprites in a sprite sheet and converts each found sprite into a new BufferedImage. This process works, but is prohibitively slow with certain image formats- mostly transparent images- such as this one:

(Kenney's Game Assets - Animal Pack)

I have profiled my code and determined that the vast majority, by over 99% of my app time is spent in this method alone because of the many getRGB() calls.

private fun findContiguousSprite(image: BufferedImage, startingPoint: Point, backgroundColor: Color): List<Point> {
    val unvisited = LinkedList<Point>()
    val visited   = arrayListOf(startingPoint)

    unvisited.addAll(neighbors(startingPoint, image).filter {
        Color(image.getRGB(it.x, it.y)) != backgroundColor
    })

    while (unvisited.isNotEmpty()) {
        val currentPoint = unvisited.pop()
        val currentColor = Color(image.getRGB(currentPoint.x, currentPoint.y))

        if (currentColor != backgroundColor) {
            unvisited.addAll(neighbors(currentPoint, image).filter {
                !visited.contains(it) &&
                !unvisited.contains(it) &&
                (Color(image.getRGB(it.x, it.y)) != backgroundColor)
            })

            visited.add(currentPoint)
        }
    }

    return visited.distinct()
}

I have attempted to optimize the process of extracting rgb colors as seen in the question Java - get pixel array from image by accessing the image's raster data buffer, but this fails in the newest versions of Java with a java.lang.ClassCastException: java.awt.image.DataBufferInt cannot be cast to java.awt.image.DataBufferByte.

Other stumbling blocks include the deceitfully needless boxing of colors such as in the line Color(image.getRGB(it.x, it.y)) != backgroundColor. However, while image.getRGB() returns by default in the RGBA color space, background.rgb only returns the sRGB color space.

Question: How can I improve the performance of reading a BufferedImage especially in the case of transparent images? Why is it so fast with almost any other .png image I throw at it except for these?

Note: While the code is in Kotlin, I accept Java or any other JVM language as an answer.

Code dump: In case you want the full section of code:

private fun findSpriteDimensions(image: BufferedImage,
                                 backgroundColor: Color): List<Rectangle> {
    val workingImage = image.copy()

    val spriteDimensions = ArrayList<Rectangle>()
    for (pixel in workingImage) {
        val (point, color) = pixel

        if (color != backgroundColor) {
            logger.debug("Found a sprite starting at (${point.x}, ${point.y})")
            val spritePlot = findContiguousSprite(workingImage, point, backgroundColor)
            val spriteRectangle = spanRectangleFrom(spritePlot)

            logger.debug("The identified sprite has an area of ${spriteRectangle.width}x${spriteRectangle.height}")

            spriteDimensions.add(spriteRectangle)
            workingImage.erasePoints(spritePlot, backgroundColor)
        }
    }

    return spriteDimensions
}

private fun findContiguousSprite(image: BufferedImage, startingPoint: Point, backgroundColor: Color): List<Point> {
    val unvisited = LinkedList<Point>()
    val visited   = arrayListOf(startingPoint)

    unvisited.addAll(neighbors(startingPoint, image).filter {
        Color(image.getRGB(it.x, it.y)) != backgroundColor
    })

    while (unvisited.isNotEmpty()) {
        val currentPoint = unvisited.pop()
        val currentColor = Color(image.getRGB(currentPoint.x, currentPoint.y))

        if (currentColor != backgroundColor) {
            unvisited.addAll(neighbors(currentPoint, image).filter {
                !visited.contains(it) &&
                !unvisited.contains(it) &&
                (Color(image.getRGB(it.x, it.y)) != backgroundColor)
            })

            visited.add(currentPoint)
        }
    }

    return visited.distinct()
}

private fun neighbors(point: Point, image: Image): List<Point> {
    val points = ArrayList<Point>()
    val imageWidth = image.getWidth(null) - 1
    val imageHeight = image.getHeight(null) - 1

    // Left neighbor
    if (point.x > 0)
        points.add(Point(point.x - 1, point.y))

    // Right neighbor
    if (point.x < imageWidth)
        points.add(Point(point.x + 1, point.y))

    // Top neighbor
    if (point.y > 0)
        points.add(Point(point.x, point.y - 1))

    // Bottom neighbor
    if (point.y < imageHeight)
        points.add(Point(point.x, point.y + 1))

    // Top-left neighbor
    if (point.x > 0 && point.y > 0)
        points.add(Point(point.x - 1, point.y - 1))

    // Top-right neighbor
    if (point.x < imageWidth && point.y > 0)
        points.add(Point(point.x + 1, point.y - 1))

    // Bottom-left neighbor
    if (point.x > 0 && point.y < imageHeight - 1)
        points.add(Point(point.x - 1, point.y + 1))

    // Bottom-right neighbor
    if (point.x < imageWidth && point.y < imageHeight)
        points.add(Point(point.x + 1, point.y + 1))

    return points
}

First Optimization

Driven by @Durandal's comment, I decided to change my ArrayList to a HashSet. I also found a way to preserve alpha values using an alternative constructor for Color, Color(rgb, preserveAlpha). Now I no longer need to box getRGB() before comparing the two values.

private fun findContiguousSprite(image: BufferedImage, point: Point, backgroundColor: Color): List<Point> {
    val unvisited = LinkedList<Point>()
    val visited   = hashSetOf(point)

    unvisited.addAll(neighbors(point, image).filter { image.getRGB(it.x, it.y) != backgroundColor.rgb })

    while (unvisited.isNotEmpty()) {
        val currentPoint = unvisited.pop()
        val currentColor = image.getRGB(currentPoint.x, currentPoint.y)

        if (currentColor != backgroundColor.rgb) {
            unvisited.addAll(neighbors(currentPoint, image).filter {
                !visited.contains(it) &&
                !unvisited.contains(it) &&
                image.getRGB(it.x, it.y) != backgroundColor.rgb
            })

            visited.add(currentPoint)
        }
    }

    return visited.toList()
}

This processed the above image in 319ms. Awesome!

Without having tested anything my guess is that the *major* performance bottleneck is the use of inappropiate collection types, LinkedList and ArrayList in combination with the use of contains() is always a warning flag. HashSet is probably a more appropiate type for this type of work. — Durandal, May 25 '16 at 16:02
Hey Durandal, thanks for the suggestion! I implemented it in code and it have a tremendous speedup- see the Optimization edit. You should get credit for your comment, why don't you post it as answer and gain some upvotes? — IAE, May 25 '16 at 17:37
I'm not really worried about reputation but some explanatory reasoning might be useful to anyone stumbling here, so I'll write it as an answer. — Durandal, May 26 '16 at 17:28
You wrote: I have attempted to optimize the process of extracting rgb colors as seen in the question Java - get pixel array from image by accessing the image's raster data buffer, but this fails in the newest versions of Java with a java.lang.ClassCastException: java.awt.image.DataBufferInt cannot be cast to java.awt.image.DataBufferByte. -- Just create a new BufferedImage with the correct format, - and draw the previous one into the new one. Then you can use your preferred DataBuffer format. Even if this takes some ms, I'm sure total time is faster. — Terje, May 27 '16 at 15:03
Hey Terje, thanks for the input! Unfortunately, I'm not sure what you mean with the "right" format. Upon reading I automatically convert every BufferedImage to type `ARGB`. I need this because the extracted sprites should have a transparent background. — IAE, May 27 '16 at 17:49

score 1 · Answer 1 · answered May 26 '16 at 17:25

Using contains on an ArrayList or LinkedList has complexity O(n). This can quickly turn into a large overhead, considering you are doing many contains() calls in your filter. The complexity of the lookup visited.contains() grows with the size of the image processed (more pixels to visit and more pixels visited, turning complexity into O(n^2)).

The simplest way to reduce this cost is to use a collection type that has a fast contains; like O(1) in the case of HashSet. The Set semantics also fit your requirements a little better than lists, as I understand it the visited/unvisted collections should not allow duplicates. Since Sets do not allow duplicates by contract, some of the explicit contains checks may be eliminated; in places you want to respond to the event of the point being first added you can also make use of the boolean result the add()-method gives (true when element was not present and added).

While boxing/unboxing does cost some time, its a linear cost. Reducing boxing overhead will require quite a few changes to the code and "only" get you a constant factor speedup. Standard collections do not work with primitives (without the autoboxing you want to avoid in the first place). While there are special purpose 3rd party collections (Trove comes to mind) that handle primitives without boxing. I would only go this way if absolutely necessary, changing to primitive types where possible would most likely make your code longer and somewhat more obfuscated.

Unacceptable Performance Reading Transparent .png Pixel by Pixel

First Optimization

1 Answers1