Also asked here with no luck (https://groups.google.com/forum/#!topic/android-developers/Rh_L9Jv_S8Q)
I'm trying to figure out how to do half-precision arithmetic using types like half and half4. The only problem seems to be getting the numbers from Java to RenderScript and back.
The Java Code:
private float[] input;
private float[] half_output;
private RenderScript mRS;
private ScriptC_mono mScript;
private final int dimen = 15;
...
//onCreate
input = new float[dimen * dimen * 3]; //later loaded from file 182.24 3.98 105.83 226.08 15.2 80.01...
half_output = new float[dimen * dimen * 3];
...
//function calling renderscript
mRS = RenderScript.create(this);
ScriptC_halfPrecision mScript = new ScriptC_halfPrecision(mRS);
Allocation input2 = Allocation.createSized(mRS, Element.F16(mRS), dimen * dimen * 3);
input2.copyFromUnchecked(input); //copy float values to F16 allocation
Allocation halfIndex = Allocation.createSized(mRS, Element.F16(mRS), dimen * dimen);
Type.Builder half_output_type = new Type.Builder(mRS, Element.F16(mRS)).setX(dimen * dimen * 3);
Allocation output3 = Allocation.createTyped(mRS, half_output_type.create());
mScript.set_half_in(input2);
mScript.set_half_out(output3);
mScript.forEach_half_operation(halfIndex);
output3.copy1DRangeToUnchecked(0, dimen * dimen * 3, half_output); //copy F16 allocation back to float array
The Renderscript:
#pragma version(1)
#pragma rs java_package_name(com.example.android.rs.hellocompute)
rs_allocation half_in;
rs_allocation half_out;
half __attribute__((kernel)) half_operation(uint32_t x) {
    half4 out = rsGetElementAt_half4(half_in, x);
    out.x /= 2.0;
    out.y /= 2.0;
    out.z /= 2.0;
    out.w /= 2.0;
    rsSetElementAt_half4(half_out, out, x);
}
I also tried this instead of the last line shown in the Java code:
float temp_half[] = new float[1];
for (int i = 0; i < dimen * dimen * 3; ++i) { //copy F16 allocation back to float array
    output3.copy1DRangeToUnchecked(i, 1, temp_half);
    half_output[i] = temp_half[0];
}
All the above code works perfectly for float4 variables in the RenderScript and F32 allocations in the Java. This is obviously because there is no issue going from a RenderScript float to a Java float.
But trying to go from a Java float (since there is no Java half) to a RenderScript half and back again is very difficult.
Can anyone tell me how to do it?
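For reference, here is a minimal sketch of what a manual conversion between a Java float and the 16-bit IEEE 754 half bit pattern could look like in plain Java. The class and method names (HalfConversion, floatToHalf, halfToFloat) are hypothetical and not part of any RenderScript API. It assumes that an F16 element stores the standard binary16 layout (1 sign bit, 5 exponent bits, 10 mantissa bits), which I have not verified against the RenderScript runtime.

```java
// Hypothetical helper, not part of RenderScript: converts between Java float
// and the raw 16-bit IEEE 754 binary16 ("half") bit pattern held in a short.
final class HalfConversion {

    // float -> half bits, rounding to nearest (ties to even).
    static short floatToHalf(float f) {
        int bits = Float.floatToRawIntBits(f);
        int sign = (bits >>> 16) & 0x8000;
        int exp = (bits >>> 23) & 0xFF;
        int mant = bits & 0x7FFFFF;

        if (exp == 0xFF) { // infinity or NaN
            return (short) (sign | 0x7C00 | (mant != 0 ? 0x200 : 0));
        }
        int e = exp - 127 + 15; // re-bias the exponent for half
        if (e >= 0x1F) { // too large for half: becomes infinity
            return (short) (sign | 0x7C00);
        }
        if (e <= 0) { // result is a subnormal half or zero
            if (e < -10) {
                return (short) sign; // too small: signed zero
            }
            mant |= 0x800000; // restore the implicit leading 1
            int shift = 14 - e; // between 14 and 24
            int half = mant >>> shift;
            int rem = mant & ((1 << shift) - 1);
            int halfway = 1 << (shift - 1);
            if (rem > halfway || (rem == halfway && (half & 1) != 0)) {
                half++;
            }
            return (short) (sign | half);
        }
        int half = (e << 10) | (mant >>> 13);
        int rem = mant & 0x1FFF;
        if (rem > 0x1000 || (rem == 0x1000 && (half & 1) != 0)) {
            half++; // a carry into the exponent field is the intended result
        }
        return (short) (sign | half);
    }

    // half bits -> float (always exact, since float is wider than half).
    static float halfToFloat(short h) {
        int bits = h & 0xFFFF;
        int sign = (bits & 0x8000) << 16;
        int exp = (bits >>> 10) & 0x1F;
        int mant = bits & 0x3FF;

        if (exp == 0x1F) { // infinity or NaN
            return Float.intBitsToFloat(sign | 0x7F800000 | (mant << 13));
        }
        if (exp == 0) { // zero or subnormal: value is mant * 2^-24
            float v = mant * (1.0f / (1 << 24));
            return sign != 0 ? -v : v;
        }
        return Float.intBitsToFloat(sign | ((exp - 15 + 127) << 23) | (mant << 13));
    }
}
```

With something like this, the float[] input would first be converted element by element into a short[] of half bit patterns, and the results converted back the same way. Whether an F16 allocation accepts a short[] directly is an assumption I have not confirmed. On newer Android versions the platform class android.util.Half provides equivalent conversions, so a hand-written version may only be needed on older API levels.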
Both of the above versions of the Java code result in seemingly random values in the half_output array.
They are obviously not random, because they are the same every time I run it, no matter what operation the half_operation(uint32_t x) function performs.
I've tried changing out.x /= 2.0; (and the corresponding y, z, w code) to out.x /= 2000000.0; or out.x *= 2000000.0; and the values that end up in the half_output array are still the same every time I run it.
Using input of 182.24 3.98 105.83 226.08 15.2 80.01...
Using this Java:
output3.copy1DRangeToUnchecked(0, dimen * dimen * 3, half_output); //copy F16 allocation back to float array
The resulting half_output is 46657.44 27094.48 3891.45 965.1825 36223.44 14959.08...
Using this Java:
float temp_half[] = new float[1];
for (int i = 0; i < dimen * dimen * 3; ++i) { //copy F16 allocation back to float array
    output3.copy1DRangeToUnchecked(i, 1, temp_half);
    half_output[i] = temp_half[0];
}
The resulting half_output is 2.3476E-41 2.5546E-41 6.2047E-41 2.5407E-41 1.9802E-41 2.4914E-41...
Again, these are the results no matter what I change the out.x /= 2.0; algorithm to.
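One possible reading of these numbers, offered as a guess rather than a confirmed diagnosis: the Unchecked copy methods may move raw bytes without converting between 32-bit float and 16-bit half, so the bit patterns simply get reinterpreted. The plain-Java sketch below illustrates that effect outside of RenderScript. The class and method names are hypothetical, and little-endian byte order is an assumption.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical illustration only: shows what happens when float bytes are
// read as 16-bit words, and when a 16-bit word is read as a float.
final class RawReinterpretDemo {

    // View the bytes of a float[] as 16-bit words (little-endian assumed).
    // Each float yields two unrelated 16-bit patterns, not one converted half.
    static short[] floatBytesAsShorts(float[] src) {
        ByteBuffer buf = ByteBuffer.allocate(src.length * 4).order(ByteOrder.LITTLE_ENDIAN);
        buf.asFloatBuffer().put(src);
        short[] out = new short[src.length * 2];
        buf.asShortBuffer().get(out);
        return out;
    }

    // Place a 16-bit pattern in the low bits of a 32-bit float, with the
    // upper 16 bits zero. The float exponent field is then zero, so a
    // non-zero pattern reads back as a tiny subnormal float.
    static float halfBitsInLowWord(short h) {
        return Float.intBitsToFloat(h & 0xFFFF);
    }
}
```

The second method would be consistent with the single-element copy producing values on the order of 1E-41: a 16-bit pattern landing in the low half of an otherwise zeroed float is always a subnormal of about that magnitude. If that is what is happening, the kernel's arithmetic could be running on reinterpreted bits rather than on the intended values.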