Writing a functional and yet functional image processing library in Scala

Question

We are developing a small image processing library for Scala (student project). The library is completely functional (i.e. no mutability). The raster of image is stored as Stream[Stream[Int]] to exploit the benefits of lazy evaluation with least efforts. However upon performing a few operations on an image the heap gets full and an OutOfMemoryError is thrown. (for example, up to 4 operations can be performed on a jpeg image sized 500 x 400, 35 kb before JVM heap runs out of space.)

The approaches we have thought of are:

Twiddling with JVM options and increase the heap size. (We don't know how to do this under IDEA - the IDE we are working with.)
Choosing a different data structure than Stream[Stream[Int]], the one which is more suited to the task of image processing. (Again we do not have much idea about the functional data structures beyond the simple List and Stream.)

The last option we have is giving up on immutability and making it a mutable library (like the popular image processing libraries), which we don't really want to do. Please suggest us some way to keep this library functional and still functional, if you know what I mean.

Thank you,
Siddharth Raina.

ADDENDUM:
For an image sized 1024 x 768, the JVM runs out of heap space even for a single mapping operation. Some example code from our test:

val image = Image from "E:/metallica.jpg"
val redded = image.map(_ & 0xff0000)
redded.display(title = "Redded")

And the output:

"C:\Program Files (x86)\Java\jdk1.6.0_02\bin\java" -Didea.launcher.port=7533 "-Didea.launcher.bin.path=C:\Program Files (x86)\JetBrains\IntelliJ IDEA Community Edition 10.0.2\bin" -Dfile.encoding=windows-1252 -classpath "C:\Program Files (x86)\Java\jdk1.6.0_02\jre\lib\charsets.jar;C:\Program Files (x86)\Java\jdk1.6.0_02\jre\lib\deploy.jar;C:\Program Files (x86)\Java\jdk1.6.0_02\jre\lib\javaws.jar;C:\Program Files (x86)\Java\jdk1.6.0_02\jre\lib\jce.jar;C:\Program Files (x86)\Java\jdk1.6.0_02\jre\lib\jsse.jar;C:\Program Files (x86)\Java\jdk1.6.0_02\jre\lib\management-agent.jar;C:\Program Files (x86)\Java\jdk1.6.0_02\jre\lib\plugin.jar;C:\Program Files (x86)\Java\jdk1.6.0_02\jre\lib\resources.jar;C:\Program Files (x86)\Java\jdk1.6.0_02\jre\lib\rt.jar;C:\Program Files (x86)\Java\jdk1.6.0_02\jre\lib\ext\dnsns.jar;C:\Program Files (x86)\Java\jdk1.6.0_02\jre\lib\ext\localedata.jar;C:\Program Files (x86)\Java\jdk1.6.0_02\jre\lib\ext\sunjce_provider.jar;C:\Program Files (x86)\Java\jdk1.6.0_02\jre\lib\ext\sunmscapi.jar;C:\Program Files (x86)\Java\jdk1.6.0_02\jre\lib\ext\sunpkcs11.jar;C:\new Ph\Phoebe\out\production\Phoebe;E:\Inventory\Marvin.jar;C:\scala-2.8.1.final\lib\scala-library.jar;C:\scala-2.8.1.final\lib\scala-swing.jar;C:\scala-2.8.1.final\lib\scala-dbc.jar;C:\new Ph;C:\scala-2.8.1.final\lib\scala-compiler.jar;E:\Inventory\commons-math-2.2.jar;E:\Inventory\commons-math-2.2-sources.jar;E:\Inventory\commons-math-2.2-javadoc.jar;E:\Inventory\jmathplot.jar;E:\Inventory\jmathio.jar;E:\Inventory\jmatharray.jar;E:\Inventory\Javax Media.zip;E:\Inventory\jai-core-1.1.3-alpha.jar;C:\Program Files (x86)\JetBrains\IntelliJ IDEA Community Edition 10.0.2\lib\idea_rt.jar" com.intellij.rt.execution.application.AppMain phoebe.test.ImageTest
Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
    at scala.collection.Iterator$class.toStream(Iterator.scala:1011)
    at scala.collection.IndexedSeqLike$Elements.toStream(IndexedSeqLike.scala:52)
    at scala.collection.Iterator$$anonfun$toStream$1.apply(Iterator.scala:1011)
    at scala.collection.Iterator$$anonfun$toStream$1.apply(Iterator.scala:1011)
    at scala.collection.immutable.Stream$Cons.tail(Stream.scala:565)
    at scala.collection.immutable.Stream$Cons.tail(Stream.scala:557)
    at scala.collection.immutable.Stream$$anonfun$map$1.apply(Stream.scala:168)
    at scala.collection.immutable.Stream$$anonfun$map$1.apply(Stream.scala:168)
    at scala.collection.immutable.Stream$Cons.tail(Stream.scala:565)
    at scala.collection.immutable.Stream$Cons.tail(Stream.scala:557)
    at scala.collection.immutable.Stream$$anonfun$flatten1$1$1.apply(Stream.scala:453)
    at scala.collection.immutable.Stream$$anonfun$flatten1$1$1.apply(Stream.scala:453)
    at scala.collection.immutable.Stream$Cons.tail(Stream.scala:565)
    at scala.collection.immutable.Stream$Cons.tail(Stream.scala:557)
    at scala.collection.immutable.Stream.length(Stream.scala:113)
    at scala.collection.SeqLike$class.size(SeqLike.scala:221)
    at scala.collection.immutable.Stream.size(Stream.scala:48)
    at scala.collection.TraversableOnce$class.toArray(TraversableOnce.scala:388)
    at scala.collection.immutable.Stream.toArray(Stream.scala:48)
    at phoebe.picasso.Image.force(Image.scala:85)
    at phoebe.picasso.SimpleImageViewer.<init>(SimpleImageViewer.scala:10)
    at phoebe.picasso.Image.display(Image.scala:91)
    at phoebe.test.ImageTest$.main(ImageTest.scala:14)
    at phoebe.test.ImageTest.main(ImageTest.scala)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at com.intellij.rt.execution.application.AppMain.main(AppMain.java:115)

Process finished with exit code 1

@FUZxxl: Probably on the assumption that Haskell programmers would be familiar with the performance issues of immutable lazy lists. I suspect this is a case where pervasive purity and laziness just let Haskell compilers work lots of optimization magic that others can't, though. — C. A. McCann, Mar 16 '11 at 12:42
It's a pity you are not doing this in Haskell, as Data Parallel Haskell was just made for this. — Paul Johnson, Mar 16 '11 at 19:33
I've recently joined a fairly serious image processing project, and found out that the primitive operations (i.e. basic filtering, morphology, edges...) must be done at the absolute machine level: inline assembly is your best friend. Though functional abstractions are insanely good, they surely must be built on top of ultrafast low-level image arithmetic. — ansgri, Mar 19 '11 at 22:26
Image processing using Haskell parallel arrays (Repa) is becoming more common (at least two projects). They take advantage of fusible array operations made possible by GHC's guarantee of purity. — Don Stewart, Mar 19 '11 at 22:54

score 19 · Answer 1 · edited May 23 '17 at 12:18

If I understood correctly, you store each individual pixel in one Stream element, and this can be inefficient. What you can do is create your custom LazyRaster class which contains lazy references to blocks of the image of some size (for instance, 20x20). The first time some block is written, its corresponding array is initialized, and from there on changing a pixel means writing to that array.

This is more work, but may result in better performance. Furthermore, if you wish to support stacking of image operations (e.g. do a map - take - map), and then evaluating the image in "one-go", the implementation could get tricky - stream implementation is the best evidence for this.

Another thing one can do is ensure that the old Streams are being properly garbage collected. I suspect image object in your example is a wrapper for your streams. If you wish to stack multiple image operations (like mapping) together and be able to gc the references you no longer need, you have to make sure that you don't hold any references to a stream - note that this is not ensured if:

you have a reference to your image on the stack (image in the example)
your Image wrapper contains such a reference.

Without knowing more about the exact use cases, its hard to say more.

Personally, I would avoid Streams altogether, and simply use some immutable array-based data structure which is both space-efficient and avoids boxing. The only place where I potentially see Streams being used is in iterative image transformations, like convolution or applying a stack of filters. You wouldn't have a Stream of pixels, but a Stream of images, instead. This could be a nice way to express a sequence of transformations - in this case, the comments about gc in the link given above apply.

Much of that block-handling logic is already available via `Vector`, which is internally implemented as a Trie. Despite being mutable, this gives it a nice performance profile for using immutably. Despite that, the cost of boxing is still going to be a significant pain point if you're after speed. — Kevin Wright, Mar 16 '11 at 14:22
True, one might use `Vector`s here. Boxing is an issue - this is why a custom class should be either hardcoded for `Int`s or specialized to avoid boxing. Since they only need `Int`s, I would go for the former. — axel22, Mar 16 '11 at 14:28

score 12 · Answer 2 · edited May 23 '17 at 09:58

12

If you process large streams, you need to avoid holding onto a reference to the head of the stream. This will prevent garbage collection.

It's possible that calling certain methods on Stream will internally hold onto the head. See the discussion here: Functional processing of Scala streams without OutOfMemory errors

edited May 23 '17 at 09:58

Community

1
1

answered Mar 16 '11 at 10:05

retronym

54,768
12
155
168

score 7 · Answer 3 · answered Mar 16 '11 at 10:55

7

Stream is very unlikely to be the optimum structure here. Given the nature of a JPEG it makes little sense to "stream" it into memory line-by-line.

Stream also has linear access time for reading elements. Again, probably not what you want unless you're streaming data.

I'd recommend using an IndexedSeq[IndexedSeq[Int]] in this scenario. Or (if performance is important) an Array[Array[Int]], which will allow you to avoid some boxing/unboxing costs.

Martin has written a good overview of the 2.8 collections API which should help you understand the inherent trade-offs in the various collection types available.

Even if using Arrays, there's still every reason to use them as immutable structures and maintain a functional programming style. Just because a structure is mutable doesn't mean you have to mutate it!

answered Mar 16 '11 at 10:55

Kevin Wright

49,540
9
105
155

1

I think it makes sense to talk about specific data structures here. I pondered about Vector, but it doesn't have the benefits of laziness. I think hand-rolling their own lazy data structure will be the way to go. – Daniel C. Sobral Mar 16 '11 at 13:14
1

Depends on the goal. If performance is key, then some form of immutable wrapper over the Java2D API is the obvious solution, in no small part thanks to hardware acceleration. As an academic exercise, then I reckon you're right about a hand-rolled structure. – Kevin Wright Mar 16 '11 at 14:19
I mean performance-wise. Depending on what will be done with it, laziness may save you from computing stuff that will never be used. – Daniel C. Sobral Mar 16 '11 at 18:44
@Daniel if building a set of transformations on an image, sure! Though hardware can often do this kind of thing with layers and alpha channels. I was more bothered about the idea of lazily decoding a JPEG file, the format doesn't seem amenable to such tricks. – Kevin Wright Mar 17 '11 at 11:08
Mmmm, true enough. I hadn't consider their specific use case. – Daniel C. Sobral Mar 17 '11 at 14:15

score 6 · Answer 4 · answered Mar 16 '11 at 19:22

6

I recommend also looking at continuous rather than just discrete models for imagery. Continuous is generally more modular/composable than discrete--whether time or space.

answered Mar 16 '11 at 19:22

Conal

18,517
2
37
40

7

For continuous images, check out the paper *[Functional Images](http://conal.net/papers/functional-images/)*, in which images are continuous and infinite. Both aspects make the semantics very simple and hence composable. Pleasant properties abound, such as scaling down & up or rotating left & right being exact inverses. Discrete and/or finite makes the semantics complex and these laws fail. Compilation generates very fast code. See *[Compiling Embedded Languages](http://conal.net/papers/jfp-saig/)*. Similarly for time as in functional reactive animation/programming. – Conal Mar 17 '11 at 02:10

score 5 · Answer 5 · answered Mar 16 '11 at 11:03

As a first step you should take a memory dump and analyze it. It is very possible that you will see the problem immediately.

There is special command line option to force JVM to make dump on OOME: -XX:+HeapDumpOnOutOfMemoryError. And good tools, like jhat and VisualVM, which can help you in analysis.

Randall Schulz · Answer 6 · 2011-03-16T14:13:59.153

Stream is more about lazy evaluation than immutability. And you're forcing an insane amount of space and time overhead for each pixel by doing so. Furthermore, Streams only make sense when you can defer the determination (calculation or retrieval) of individual pixel values. And, of course, random access is impossible. I'd have to deem the Stream an entirely inappropriate data structure for image processing.

I'd strongly recommend that you manage your own raster memory (bonus points for not fixing a single raster image organization into your code) and allocate storage for whole channels or planes or bands thereof (depending on the raster organization in play).

UPDATE: By the foregoing, I mean don't use nested Array or IndexedSeq, but allocate a block and compute which element using the row and column values.

Then take an "immutable after initialization" approach. Once a given pixel or sample has been established in the raster, you never allow it to be changed. This might require a one-bit raster plane to track the established pixels. Alternatively, if you know how you'll be filling the raster (the sequence in which pixels will be assigned) you can get away with a much simpler and cheaper representation of how much of the raster is established and how much remains to be filled.

Then as you perform processing on the raster images, do so in a pipeline where no image is altered in place, but rather a new image is always generated as various transforms are applied.

You might consider that for some image transformations (convolution, e.g.) you must take this approach or you will not get the correct results.

score 4 · Answer 7 · answered Mar 16 '11 at 13:16

4

I strongly recommend Okasaki's Purely Functional Data Structures if you don't have any experience with functional data structures (as you seem to indicate).

answered Mar 16 '11 at 13:16

Daniel C. Sobral

295,120
86
501
681

score 2 · Answer 8 · answered Mar 16 '11 at 10:49

To increase your heap size using intellij, you need to add the following to the VM Parameters section of the Run/Debug Configuration:

-Xms256m -Xmx256m

This will increase the maximum heap size to 256MB and also ensure this amount is requested by the VM at startup, which generally represents a performance increase.

Additionally, you're using a relatively old JDK. If possible, I recommend you update to the latest available version, as newer builds enable escape analysis, which can in some cases have a dramatic effect on performance.

Now, in terms of algorithms, I would suggest that you follow the advice above and divide the image into blocks of say, 9x9 (any size will do though). I'd then go and have a look at Huet's Zipper and think about how that might be applied to an image represented as a tree structure, and how that might enable you to model the image as a persistent data structure.

score 1 · Answer 9 · edited May 23 '17 at 12:34

1

Increasing the heap size in idea can be done in the vmoptions file, which can be found in the bin directory in your idea installation directory (add -Xmx512m to set the heap size to 512 megabyte, for example). Apart from that, it is hard to say what causes the out of memory without knowing what operations you exactly perform, but perhaps this question provides some useful tips.

edited May 23 '17 at 12:34

Community

1
1

answered Mar 16 '11 at 09:54

Arjan Blokzijl

6,878
2
31
27

1

This changes the VMOptions for intellij itself, not the vm his program is running on. – AlecZorab Mar 16 '11 at 10:39

score 1 · Answer 10 · answered Mar 16 '11 at 13:45

One solution would be to put the image in an array, and make filters like "map" return a wrapper for that array. Basically, you have a trait named Image. That trait requires abstract pixel retrieving operations. When, for example, the "map" function is called, you return an implementation, which delegates the calls to the old Image, and executes the function on it. The only problem with that would be that the transformation could end up being executed multiple times, but since it is a functional library, that is not very important.

Writing a functional and yet functional image processing library in Scala

10 Answers10