I want to improve glReadPixels() performance using a PBO (for GLES 3 devices), and I ran into a problem with this piece of code:

final ByteBuffer pboByteBuffer = ByteBuffer.allocateDirect(4 * mWidth * mHeight);
pboByteBuffer.order(ByteOrder.nativeOrder());

//set framebuffer to read from
GLES30.glReadBuffer(GLES30.GL_BACK);

// bind pbo
GLES30.glBindBuffer(GLES30.GL_PIXEL_PACK_BUFFER, mPboHandleContainer[0]);

// read pixels(should be instant)
GLES30.glReadPixels(0, 0, mWidth, mHeight, GLES30.GL_RGBA, GLES30.GL_UNSIGNED_BYTE, pboByteBuffer);

// map pbo to bb
ByteBuffer byteBuffer =
        ((ByteBuffer) GLES30.glMapBufferRange(GLES30.GL_PIXEL_PACK_BUFFER, 0, 4 * mWidth * mHeight,
                                              GLES30.GL_MAP_READ_BIT)).order(ByteOrder.nativeOrder());

// unmap pbo
GLES30.glUnmapBuffer(GLES30.GL_PIXEL_PACK_BUFFER);

// unbind pbo
GLES30.glBindBuffer(GLES30.GL_PIXEL_PACK_BUFFER, 0);

At the moment the glReadPixels() call fails. I found this & this, but I'm unable to pass a zero offset because the Java method takes a Buffer as its last argument. I would very much appreciate any suggestions about the issue.

UPDATE: It seems to be impossible to do this with the Java API alone, so I've used the NDK to add a function that calls glReadPixels() with the correct last argument (an int offset). Now none of my GL calls produce an error.

That's my JNI C code:

#include <jni.h>
#include <stdint.h>

#include <GLES3/gl3.h>

#ifdef __cplusplus
extern "C" {
#endif

// With a buffer bound to GL_PIXEL_PACK_BUFFER, the last argument of
// glReadPixels() is a byte offset into that buffer, passed as a pointer value.
JNIEXPORT void JNICALL Java_somepackage_GLES3PBOReadPixelsFix_glReadPixelsPBO(JNIEnv * env, jobject obj, jint x, jint y, jint width, jint height, jint format, jint type, jint offsetPBO)
{
    glReadPixels(x, y, width, height, format, type, (void *) (intptr_t) offsetPBO);
}

#ifdef __cplusplus
}
#endif
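
For readers on newer SDKs: as far as I can tell, API level 24 added an overload of GLES30.glReadPixels() whose last parameter is an int offset into the bound PBO, which would make a native shim unnecessary there. The following is a minimal sketch, not a tested Android implementation. The class name ReadPixelsOffsetSupport and its method are made up for illustration, and on a real device a plain Build.VERSION.SDK_INT check is the more usual approach. It looks the overload up reflectively so the same code also loads where the overload is missing:

```java
import java.lang.reflect.Method;

// Hypothetical helper (not part of Android): finds the int-offset overload of
// GLES30.glReadPixels() if the running platform provides it.
final class ReadPixelsOffsetSupport {
    private ReadPixelsOffsetSupport() {}

    /** Returns the 7-int overload, or null when the class or method is absent. */
    static Method findOffsetOverload() {
        try {
            Class<?> gles30 = Class.forName("android.opengl.GLES30");
            // x, y, width, height, format, type, offset into the bound PBO
            return gles30.getMethod("glReadPixels",
                    int.class, int.class, int.class, int.class,
                    int.class, int.class, int.class);
        } catch (ReflectiveOperationException e) {
            return null; // older SDK, or not running on Android at all
        }
    }
}
```

When the lookup succeeds, calling GLES30.glReadPixels(0, 0, mWidth, mHeight, GLES30.GL_RGBA, GLES30.GL_UNSIGNED_BYTE, 0) with the PBO bound should behave like the JNI call.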

Now the problem is that the glReadPixels() call takes even more time than it did without PBOs, so there's no performance gain yet. I'm going to explore why that happens and will update when I find something.

UPDATE 2: I forgot to update this earlier, but the problem turned out to be that I was using a pbuffer surface; that's why I saw no performance gain. I compared that setup with one that doesn't use a pbuffer surface, and the performance gain was huge.

So when rendering offscreen and using glReadPixels, it's worth avoiding a pbuffer surface.

Wai Ha Lee
Sam
  • Weird. Looks like the Java entry point to use PBOs for `glReadPixels()` is missing. If that's indeed the case, it's not the first time this happened in Android. Using native code is always a solution. That said, if you immediately wait for the result, using PBO will probably not help you much. The whole idea is that the `glReadPixels()` call won't block. If you immediately block afterwards, that won't do much good. – Reto Koradi Feb 18 '15 at 06:27
  • @RetoKoradi Thanks for the feedback! Still, native code is not an option in my case. And the only thing that really matters is how long my code block takes compared to a simple glReadPixels() call. – Sam Feb 18 '15 at 13:35
  • I've encountered the same issue on Android of not seeing performance gain by using PBO. Can you kindly explain how not to use pbuffer surface on Android to enable the fast and async glReadPixels to PBO? Thanks. – Ziju Feng Nov 30 '16 at 10:17

1 Answer

Mapping the PBO right after glReadPixels always kills performance. The GPU is still working when you request the mapping, so glMapBufferRange waits for the GPU to finish reading pixels into the PBO. If you continue rendering after glReadPixels and do the mapping a few frames later, you will get a performance gain.

More information here: http://www.songho.ca/opengl/gl_pbo.html (see the "Mapping PBO" section).
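
The deferred mapping described above can be sketched as a small ring of PBOs: each frame starts a read into one PBO and maps the one that was filled earliest. This is only an illustration under assumptions. PackBufferGl and DelayedPboReader are hypothetical names, and the interface stands in for the GLES30 calls (glBindBuffer, glReadPixels with a zero offset, glMapBufferRange, glUnmapBuffer) so the scheduling logic can be read and run on its own. The pixels are copied out before unmapping, since the mapped ByteBuffer should not be touched after glUnmapBuffer.

```java
import java.nio.ByteBuffer;

/** Hypothetical seam over the GL calls; on Android each method would forward to GLES30. */
interface PackBufferGl {
    void bindPackBuffer(int pbo);      // glBindBuffer(GL_PIXEL_PACK_BUFFER, pbo)
    void readPixelsIntoBoundPbo();     // glReadPixels(..., offset 0) into the bound PBO
    ByteBuffer mapBoundPbo();          // glMapBufferRange(..., GL_MAP_READ_BIT)
    void unmapBoundPbo();              // glUnmapBuffer(GL_PIXEL_PACK_BUFFER)
}

/** Starts a read every frame, but maps a PBO only (pbos.length - 1) frames later. */
final class DelayedPboReader {
    private final PackBufferGl gl;
    private final int[] pbos;
    private long frame = 0;

    DelayedPboReader(PackBufferGl gl, int[] pbos) {
        if (pbos.length < 2) {
            throw new IllegalArgumentException("need at least two PBOs");
        }
        this.gl = gl;
        this.pbos = pbos.clone();
    }

    /**
     * Call once per frame after rendering. Returns a copy of the pixels read
     * (pbos.length - 1) frames ago, or null while the ring is still filling up.
     */
    byte[] onFrameRendered() {
        int n = pbos.length;

        // Kick off the read for the current frame; with a PBO bound this
        // should return without waiting for the GPU.
        gl.bindPackBuffer(pbos[(int) (frame % n)]);
        gl.readPixelsIntoBoundPbo();

        byte[] result = null;
        if (frame >= n - 1) {
            // Map the oldest PBO, whose transfer has had time to complete.
            gl.bindPackBuffer(pbos[(int) ((frame + 1) % n)]);
            ByteBuffer mapped = gl.mapBoundPbo();
            if (mapped != null) {
                result = new byte[mapped.remaining()];
                mapped.get(result); // copy before unmapping
            }
            gl.unmapBoundPbo();
        }

        gl.bindPackBuffer(0); // unbind
        frame++;
        return result;
    }
}
```

With two PBOs the result lags rendering by one frame; a longer ring trades more latency for a lower chance that the mapping has to wait.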

ivaigult
  • Updated my answer. And I saw that article and many others too - unfortunately at that time I didn't realize what happened – Sam May 26 '15 at 11:50