Exactly as the title says.
I have a parallelized image-generation/processing algorithm that I'd like to use. It is a kind of Perlin noise implementation.
// Logging is never used here
#pragma version(1)
#pragma rs java_package_name(my.package.name)
#pragma rs_fp_full

float sizeX, sizeY;
float ratio;

static float fbm(float2 coord)
{ ... }

uchar4 RS_KERNEL root(uint32_t x, uint32_t y)
{
    // Normalize the cell coordinates, scaling u by the aspect ratio so the noise is not stretched
    float u = x / sizeX * ratio;
    float v = y / sizeY;
    float2 p = {u, v};
    float res = fbm(p) * 2.0f; // RenderScript: 8245 ms, Filterscript: 8307 ms; Filterscript on a tablet: 9842 ms
    float4 color = {res, res, res, 1.0f};
    //float4 color = {p.x, p.y, 0.0f, 1.0f}; // RenderScript: 96 ms
    return rsPackColorTo8888(color);
}
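The fbm body is omitted above; to give a sense of the per-pixel workload, here is a generic value-noise fBm sketch of the kind I am running (hypothetical stand-in code, not my exact implementation; the hash constants are the usual shader-folklore ones):

// Hypothetical stand-in for my fbm(), just to show the workload shape
static float hash(float2 p) {
    // Cheap sin-based pseudo-random hash, as commonly used in shaders
    float2 k = {12.9898f, 78.233f};
    float h = sin(dot(p, k)) * 43758.5453f;
    return h - floor(h); // positive fractional part
}

static float vnoise(float2 p) {
    float2 i = floor(p);
    float2 f = p - i;
    float2 one_zero = {1.0f, 0.0f};
    float2 zero_one = {0.0f, 1.0f};
    float2 one_one  = {1.0f, 1.0f};
    f = f * f * (3.0f - 2.0f * f); // smoothstep fade curve
    float a = hash(i);
    float b = hash(i + one_zero);
    float c = hash(i + zero_one);
    float d = hash(i + one_one);
    return mix(mix(a, b, f.x), mix(c, d, f.x), f.y); // bilinear blend
}

static float fbm(float2 p) {
    float sum = 0.0f;
    float amp = 0.5f;
    for (int i = 0; i < 5; i++) { // 5 octaves: frequency doubled, amplitude halved each time
        sum += amp * vnoise(p);
        p *= 2.0f;
        amp *= 0.5f;
    }
    return sum;
}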
For comparison: the exact same algorithm runs at 30+ fps when I implement it on the GPU as a fragment shader on a textured quad.
The overhead of running the RenderScript should be at most ~100 ms, which I estimated by generating a trivial bitmap that just returns the normalized x and y coordinates as colors (the commented-out line above, ~96 ms).
That means that if the script actually used the GPU, it certainly would not take 10 seconds.
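For clarity, that baseline measurement is the same kernel with the fbm call removed, i.e.:

// Baseline kernel: coordinate gradient only, no noise; ~96 ms at 1536x2048
uchar4 RS_KERNEL root(uint32_t x, uint32_t y)
{
    float2 p = {x / sizeX * ratio, y / sizeY};
    float4 color = {p.x, p.y, 0.0f, 1.0f};
    return rsPackColorTo8888(color);
}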
The Java code I drive the RenderScript with:
// The non-support (android.renderscript) version gives at least an extra 25% performance boost over the support library
import android.content.Context;
import android.graphics.Bitmap;
import android.renderscript.Allocation;
import android.renderscript.RenderScript;

public class RSNoise {

    private RenderScript renderScript;
    private ScriptC_noise noiseScript;

    private Allocation allOut;
    private Bitmap outBitmap;

    final int sizeX = 1536;
    final int sizeY = 2048;

    public RSNoise(Context context) {
        renderScript = RenderScript.create(context);
        outBitmap = Bitmap.createBitmap(sizeX, sizeY, Bitmap.Config.ARGB_8888);
        allOut = Allocation.createFromBitmap(renderScript, outBitmap, Allocation.MipmapControl.MIPMAP_NONE, Allocation.USAGE_GRAPHICS_TEXTURE);
        noiseScript = new ScriptC_noise(renderScript);
    }

    // Only this render function is benchmarked; the context/allocation setup above is excluded
    public Bitmap render() {
        noiseScript.set_sizeX((float) sizeX);
        noiseScript.set_sizeY((float) sizeY);
        noiseScript.set_ratio((float) sizeX / (float) sizeY);
        noiseScript.forEach_root(allOut);
        allOut.copyTo(outBitmap);
        return outBitmap;
    }
}
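The millisecond figures quoted above come from simply wrapping render() in wall-clock measurements, roughly like this (sketch; the tag and variable names are mine, SystemClock is android.os.SystemClock and Log is android.util.Log):

// Minimal timing harness around render() -- how the ms figures were obtained
long start = SystemClock.elapsedRealtime();
Bitmap bitmap = rsNoise.render();
long elapsedMs = SystemClock.elapsedRealtime() - start;
Log.d("RSNoise", "render() took " + elapsedMs + " ms");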
If I convert it to Filterscript, following this answer (https://stackoverflow.com/a/14942723/4420543), the result gets several hundred milliseconds worse with the support library and about twice as slow with the non-support one. The precision pragma did not influence the results.
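For reference, my Filterscript variant is essentially the same kernel moved into a .fs file; a sketch of what that looks like, as far as I understand the restrictions (no pointers, relaxed precision; this mirrors my actual file but is abbreviated):

// noise.fs -- Filterscript variant (sketch)
#pragma version(1)
#pragma rs java_package_name(my.package.name)
#pragma rs_fp_relaxed // Filterscript requires relaxed precision; rs_fp_full is not permitted

float sizeX, sizeY;
float ratio;

static float fbm(float2 coord) // same fbm as above
{ ... }

uchar4 __attribute__((kernel)) root(uint32_t x, uint32_t y)
{
    float2 p = {x / sizeX * ratio, y / sizeY};
    float res = fbm(p) * 2.0f;
    float4 color = {res, res, res, 1.0f};
    return rsPackColorTo8888(color);
}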
I have also gone through every related question on Stack Overflow, but most of them are outdated. I have tried it on a Nexus 5 (Android 7.1.1), among several other new devices, and the problem still remains.
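For what it's worth, the only processor-related knobs I can find in the public API are the context-creation flags (API 23+), and I don't know whether any driver actually treats them as a GPU hint:

// Context-creation flags are the only public hint I can find;
// whether any driver maps them to a GPU choice is unclear to me.
RenderScript rs = RenderScript.create(
        context,
        RenderScript.ContextType.NORMAL,
        RenderScript.CREATE_FLAG_LOW_POWER); // or CREATE_FLAG_LOW_LATENCY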
So, when does RenderScript run on the GPU? It would be enough if someone could show me an example of a RenderScript that actually runs on the GPU.