I'm writing an effect filter for Android devices whose fragment shader contains a two-dimensional loop. On most devices the shader compiles and runs in reasonable time, but some devices take several minutes to compile it the first time.
My fragment shader performs a heavy two-dimensional kernel convolution:
const lowp int KERNEL_RADIUS = 19;
....
for (int y = -KERNEL_RADIUS; y <= KERNEL_RADIUS; y++)
{
    for (int x = -KERNEL_RADIUS; x <= KERNEL_RADIUS; x++)
    {
        ....
    }
}
In effect this is a 39x39 loop, and because of the kernel design it cannot be split into two one-dimensional passes. The kernel weights are stored in a second input texture that the shader samples. Obviously this shader cannot perform reasonably when applied directly to a normal-sized image (800x600 to 1600x1200), so I resize the image to between 200x200 and 400x400, which gives real-time response on most devices.
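To put numbers on why the downscaling is needed, here is a small sketch of the arithmetic in plain Java. The class and method names are made up for illustration and are not part of my filter; it only counts loop iterations ("taps") and says nothing about how many texture reads each iteration performs.

```java
// Rough iteration count of a (2r+1) x (2r+1) convolution loop.
// KernelCost and its methods are illustrative names only.
final class KernelCost {
    /** Number of loop iterations (taps) evaluated for one output pixel. */
    static long tapsPerPixel(int radius) {
        long side = 2L * radius + 1;
        return side * side;
    }

    /** Total taps for one frame of the given size. */
    static long tapsPerFrame(int radius, int width, int height) {
        return tapsPerPixel(radius) * width * height;
    }

    public static void main(String[] args) {
        int r = 19; // KERNEL_RADIUS from the shader
        System.out.println(tapsPerPixel(r));             // 1521
        System.out.println(tapsPerFrame(r, 400, 400));   // 243360000
        System.out.println(tapsPerFrame(r, 1600, 1200)); // 2920320000
    }
}
```

So even at 400x400 the loop body runs roughly 243 million times per frame, and at 1600x1200 it would be about twelve times that.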
I know that some shader compilers cannot accept such a large loop and fail to compile the program. I have found some devices that behave this way. On those devices the compile time is still reasonable: the compiler simply reports a failure, and I disable the effect filter. On some other devices, however, compilation succeeds and the program works normally, but the first compilation takes about 2 to 3 minutes. After that, the compiler caches the program, and compiling takes 50 to 100 ms when I create the effect filter again.
Currently I cannot change my algorithm to remove or shrink the two-dimensional loop, but it would be ridiculous to make the user wait minutes on first launch. I want to disable the effect filter on those devices. The problem is that I compile the shader with GLES20.glCompileShader():
public static int loadShader(final String strSource, final int iType)
{
    int[] compiled = new int[1];
    int iShader = GLES20.glCreateShader(iType);
    GLES20.glShaderSource(iShader, strSource);
    GLES20.glCompileShader(iShader);
    // Ask the driver whether compilation succeeded.
    GLES20.glGetShaderiv(iShader, GLES20.GL_COMPILE_STATUS, compiled, 0);
    if (compiled[0] == 0) {
        Log.d("Load Shader Failed", "Compilation\n" + GLES20.glGetShaderInfoLog(iShader));
        // Release the failed shader object so it does not leak.
        GLES20.glDeleteShader(iShader);
        return 0;
    }
    return iShader;
}
It is a blocking call, so I have to wait several minutes before I can decide to disable the filter on such devices.
Is there a way to compile the shader code asynchronously, or with a time limit? (For example, report failure after 5 seconds if the compilation has not completed.)
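The kind of time limit I have in mind can be sketched in plain Java as below. This is only an illustration of the waiting side, under assumptions: TimedTask and callWithTimeout are hypothetical names, the sleeping task stands in for loadShader(), and in a real app the executor thread would have to be the one with the OpenGL context current. Note that the timeout only stops the caller from waiting; the task keeps running on its thread, so this alone would not abort a compile that is already blocked in the driver.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical helper: wait a bounded time for work running on another thread.
final class TimedTask {
    /**
     * Submits the task to the executor and waits at most timeoutMs for it.
     * Returns the fallback on timeout, failure, or interruption.
     * The task is NOT cancelled on timeout; it keeps running on its thread.
     */
    static <T> T callWithTimeout(ExecutorService executor, Callable<T> task,
                                 long timeoutMs, T fallback) {
        Future<T> future = executor.submit(task);
        try {
            return future.get(timeoutMs, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            return fallback; // took too long: caller may disable the filter
        } catch (ExecutionException e) {
            return fallback; // the task itself threw
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve interrupt status
            return fallback;
        }
    }

    public static void main(String[] args) {
        // Assumption: in the real app this single thread would own the GL context.
        ExecutorService glThread = Executors.newSingleThreadExecutor();
        // Stand-in for loadShader(...): pretends to compile for 2 seconds.
        Callable<Integer> slowCompile = () -> { Thread.sleep(2000); return 7; };
        int shader = callWithTimeout(glThread, slowCompile, 100, 0);
        System.out.println(shader); // 0, because the 100 ms limit expired first
        glThread.shutdownNow(); // interrupts the stand-in so the JVM can exit
    }
}
```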
If glCompileShader() can only be called synchronously, I would like to forcibly terminate the thread so that it does not block the application. But that causes a serious problem: the thread that compiles the shader is the same thread that created the OpenGL context. If I kill the thread while it is blocked, I cannot destroy the OpenGL context properly from that thread.
Is it possible, and safe, to compile the shader code on a different thread from the one that initializes the OpenGL context? I was told they should be the same thread, and I want to know whether they can be different when I really need this.