I have written my own ARGB image overlay code and optimized it to be as fast as "possible". However, it still isn't as fast as I would like. The routine is also called many times and uses a lot of CPU.
I want to use the GPU to do the image overlay. I think this could potentially provide a huge boost in speed along with minimal CPU usage.
How can I go about using the GPU to overlay the images? From my research, I can use a framebuffer to draw off screen, but I'm not 100% sure how to set that up. Essentially, I'm passing a Bitmap from Java down to JNI and then getting the Bitmap's pixels, which is the buffer the images need to be overlaid into. The number of images to be overlaid can range from 2 to 12, and these are large images.
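To make the per-pixel operation concrete, this is roughly what each overlay step has to compute. It is a simplified sketch, not my actual optimized code. It assumes premultiplied, tightly packed RGBA bytes (which is how Android stores ARGB_8888 bitmaps in memory), and blend_channel and overlay_layer are illustrative names only.

```c
#include <stddef.h>
#include <stdint.h>

/* Source-over for one 8-bit channel with premultiplied alpha:
   out = src + dst * (255 - src_alpha) / 255, rounded to nearest. */
static uint8_t blend_channel(uint8_t src, uint8_t dst, uint8_t src_alpha)
{
    unsigned v = src + (dst * (255u - src_alpha) + 127u) / 255u;
    return (uint8_t)(v > 255u ? 255u : v); /* clamp in case input is not premultiplied */
}

/* Composite one layer over the destination in place.
   Both buffers hold pixel_count RGBA pixels, 4 bytes each. */
static void overlay_layer(uint8_t *dst, const uint8_t *src, size_t pixel_count)
{
    for (size_t i = 0; i < pixel_count; ++i) {
        const uint8_t *s = src + 4 * i;
        uint8_t *d = dst + 4 * i;
        uint8_t a = s[3];
        for (int c = 0; c < 4; ++c)
            d[c] = blend_channel(s[c], d[c], a); /* same formula for R, G, B and A */
    }
}
```

As far as I understand, the GPU equivalent of this formula is drawing each layer as a textured quad with blending enabled and glBlendFunc(GL_ONE, GL_ONE_MINUS_SRC_ALPHA).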
UPDATE1:
So far this is what I have. I think it is a step in the right direction.
GLuint arraybuffer;
glGenBuffers(1, &arraybuffer); // generate one buffer object name
glBindBuffer(GL_ARRAY_BUFFER, arraybuffer);
// draw the images into the array buffer
...
// read the buffer pixels from OpenGL
glReadPixels(0, 0, outputWidth, outputHeight, GL_RGBA, GL_UNSIGNED_BYTE, pixels);
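One thing I'm aware of with the read-back: glReadPixels treats (0, 0) as the lower-left corner, so depending on how the images are drawn, the rows may come back bottom-up relative to an Android Bitmap, whose first row is the top. A minimal sketch of flipping the rows in place before copying into the Bitmap, assuming tightly packed RGBA at 4 bytes per pixel (flip_rows_rgba is a hypothetical helper name, not an existing API):

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Reverse the row order of a tightly packed RGBA image in place.
   Rows are swapped in small chunks so no heap allocation is needed. */
static void flip_rows_rgba(uint8_t *pixels, size_t width, size_t height)
{
    const size_t stride = width * 4; /* bytes per row */
    uint8_t tmp[256];

    if (height < 2)
        return;

    for (size_t top = 0, bottom = height - 1; top < bottom; ++top, --bottom) {
        uint8_t *a = pixels + top * stride;
        uint8_t *b = pixels + bottom * stride;
        for (size_t off = 0; off < stride; off += sizeof tmp) {
            size_t n = stride - off < sizeof tmp ? stride - off : sizeof tmp;
            memcpy(tmp, a + off, n);
            memcpy(a + off, b + off, n);
            memcpy(b + off, tmp, n);
        }
    }
}
```

It would be called as flip_rows_rgba(pixels, outputWidth, outputHeight) right after the glReadPixels call. Drawing the scene vertically mirrored in the first place would avoid the extra pass.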