I've written an OpenCL kernel (JOCL) which takes a few hundred megabytes of input data and produces a viewport-friendly RGB image as its result. This image shall be presented in an SWT GUI with a refresh rate of at least about 10 fps to allow real-time interaction (panning, zooming, gamma, etc.).
I came up with a solution that works but doesn't feel optimal:
- create a `byte[]`-array in Java as frame buffer (ABGR)
- wrap the `byte[]`-array by a CL-buffer
- create an `ImageData` to wrap the `byte[]`-array
- pass the CL-buffer to the OpenCL-kernel as argument
- for each frame update, repeat:
  - call the OpenCL-kernel to populate the `byte[]`-array
  - create a new `Image` based on the `ImageData`-object
  - draw the image to some `Canvas`
  - `Canvas.redraw()`
What I don't like about this solution is:
- I assume the `byte[]`-array will be copied and converted to the target's layout in order to draw the image within the canvas. Both could be omitted if I knew the memory location and layout of the frame buffer.
- I believe the `Canvas`' internal frame buffer resides in the host's memory. So after all it has to be copied back into the graphics card's memory.
- I'd like to use `ByteBuffer.allocateDirect()` rather than allocating the `byte[]`-array within the JVM's memory, but I can't construct an `ImageData`-object based thereon, as a backing array is optional. (Ok, I might use a try-or-fallback-approach.)
- `ImageData` accepts a maximum of 8 bits per channel (I didn't find this explicitly mentioned in the documentation, but in the source code). I'd like to produce 10 bits per channel (10:10:10) as result for machines that are capable of handling them.
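To make the try-or-fallback idea from the third point concrete, here is a minimal sketch using only `java.nio`. The class name `FrameBufferFallback` and the helper `toByteArray` are hypothetical names of mine, not part of SWT or JOCL. It hands out the backing array when the buffer exposes one and otherwise falls back to a copy, which is exactly the extra copy I'd like to avoid.

```java
import java.nio.ByteBuffer;

public class FrameBufferFallback {

    /**
     * Hypothetical helper: returns the buffer's backing array without copying
     * if the buffer exposes one that covers the whole buffer, otherwise
     * returns a fresh copy of the buffer's content.
     */
    static byte[] toByteArray(ByteBuffer buffer) {
        if (buffer.hasArray()
                && buffer.arrayOffset() == 0
                && buffer.array().length == buffer.capacity()) {
            return buffer.array(); // zero-copy path
        }
        // Fallback path: one additional copy per call.
        byte[] copy = new byte[buffer.capacity()];
        ByteBuffer view = buffer.duplicate(); // leaves the caller's position/limit untouched
        view.clear();                         // position = 0, limit = capacity
        view.get(copy);
        return copy;
    }

    public static void main(String[] args) {
        int width = 4, height = 2, bytesPerPixel = 4; // assumed 8:8:8:8 frame buffer

        ByteBuffer direct = ByteBuffer.allocateDirect(width * height * bytesPerPixel);
        ByteBuffer heap = ByteBuffer.allocate(width * height * bytesPerPixel);

        // Whether a direct buffer exposes a backing array is implementation-specific.
        System.out.println("direct.hasArray() = " + direct.hasArray());
        System.out.println("heap.hasArray()   = " + heap.hasArray());

        byte[] fromDirect = toByteArray(direct);
        byte[] fromHeap = toByteArray(heap);
        System.out.println("heap path is zero-copy: " + (fromHeap == heap.array()));
        System.out.println("array length from direct buffer: " + fromDirect.length);
    }
}
```

The resulting `byte[]` could then be handed to `ImageData` as before; on the fallback path the copy would have to be repeated for every frame.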
How can I reduce the number of memory copies and conversions?
I'm not afraid of implementing several platform-/device-specific memory layouts within the kernel. I still have my currently working implementation as fallback for some weird cases and less common platforms.
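To illustrate what I mean by different memory layouts, here is a small sketch of two packings of one pixel into a 32-bit word. The bit order of both layouts (alpha in the most significant bits, red in the least significant bits) is an assumption for illustration only; the actual order depends on the platform and target surface. The class and method names are hypothetical. The kernel would do the same arithmetic in OpenCL C.

```java
public class PixelLayouts {

    /** Packs 8-bit channels as A8 B8 G8 R8 (MSB to LSB); assumed layout. */
    static int packAbgr8888(int r, int g, int b, int a) {
        return ((a & 0xFF) << 24) | ((b & 0xFF) << 16) | ((g & 0xFF) << 8) | (r & 0xFF);
    }

    /** Packs 10-bit colour channels plus 2-bit alpha as A2 B10 G10 R10 (MSB to LSB); assumed layout. */
    static int packA2Bgr10(int r, int g, int b, int a) {
        return ((a & 0x3) << 30) | ((b & 0x3FF) << 20) | ((g & 0x3FF) << 10) | (r & 0x3FF);
    }

    /** Reduces a 10-bit channel value to 8 bits by dropping the two least significant bits. */
    static int to8Bit(int tenBit) {
        return (tenBit & 0x3FF) >>> 2;
    }

    public static void main(String[] args) {
        int r10 = 1023, g10 = 512, b10 = 0; // one pixel with 10 bits per channel

        // Path for machines that can present 10:10:10.
        int deep = packA2Bgr10(r10, g10, b10, 3);

        // Fallback path for 8 bits per channel.
        int shallow = packAbgr8888(to8Bit(r10), to8Bit(g10), to8Bit(b10), 0xFF);

        System.out.println("10:10:10:2 word: 0x" + Integer.toHexString(deep));
        System.out.println("8:8:8:8 word:    0x" + Integer.toHexString(shallow));
    }
}
```

Both variants occupy 4 bytes per pixel, so the frame buffer size would stay the same and only the packing step in the kernel would differ per target.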
I know there's a `GLCanvas` that can be used for a more direct access for implementations using OpenGL (like JOGL), but I don't like the idea of depending on and bundling OpenGL with my project, just to accelerate the framebuffer writes.
Update: I think I have to use OpenGL to glue those parts together: the garbage collector kills my `byte[]`-array every now and then, which results in a crash of the whole JVM. Yet another copy (direct buffer -> `byte[]`-array) would solve this problem, but that is unacceptable.
It seems to be impossible to adapt the existing implementation of `GLCanvas` to communicate with OpenCL, as OpenCL lacks a mechanism for communicating with a window system. OpenGL, in contrast, provides such an integration.