Divide by floating-point number using NEON intrinsics

Question

I'm processing an image four pixels at a time on ARMv7, for an Android application.

I want to divide a float32x4_t vector by another vector whose values vary from about 0.7 to 3.85. It seems to me that the only way to divide is with a right shift, but that only works when the divisor is a power of two (2^n).

Also, I'm new to this, so any constructive help or comments are welcome.

Example:

How can I perform these operations with NEON intrinsics?

float32x4_t a = {25.3,34.1,11.0,25.1};
float32x4_t b = {1.2,3.5,2.5,2.0};
//    something like this
float32x4_t resultado = a/b; // {21.08,9.74,4.4,12.55}

Stephen Canon · Accepted Answer · 2011-07-25T21:38:24.993

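The ARMv7 NEON instruction set does not have a floating-point divide. (AArch64 does, exposed as the vdivq_f32 intrinsic, but that is not available when targeting armv7.)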

If you know a priori that your values are not poorly scaled, and you do not require correct rounding (this is almost certainly the case if you're doing image processing), then you can use a reciprocal estimate, refinement step, and multiply instead of a divide:

// get an initial estimate of 1/b.
float32x4_t reciprocal = vrecpeq_f32(b);

// use a couple Newton-Raphson steps to refine the estimate.  Depending on your
// application's accuracy requirements, you may be able to get away with only
// one refinement (instead of the two used here).  Be sure to test!
reciprocal = vmulq_f32(vrecpsq_f32(b, reciprocal), reciprocal);
reciprocal = vmulq_f32(vrecpsq_f32(b, reciprocal), reciprocal);

// and finally, compute a/b = a*(1/b)
float32x4_t result = vmulq_f32(a,reciprocal);
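
For context on why the refinement lines work: vrecpsq_f32(b, x) computes 2 - b*x per lane, so multiplying it by x gives one Newton-Raphson step x*(2 - b*x) toward 1/b. The following is a minimal sketch, not a definitive implementation, that wraps the same sequence in a function working on whole arrays. The name divide_f32 and the requirement that n be a multiple of 4 are assumptions made for illustration, and the plain-division branch exists only so the code still builds on targets without NEON.

```c
#include <assert.h>
#include <stddef.h>
#if defined(__ARM_NEON) || defined(__ARM_NEON__)
#include <arm_neon.h>
#endif

/* Hypothetical helper: out[i] = a[i] / b[i] (approximately, on NEON).
 * Assumes n is a multiple of 4 and the values in b are reasonably scaled. */
static void divide_f32(const float *a, const float *b, float *out, size_t n)
{
    assert(n % 4 == 0);
#if defined(__ARM_NEON) || defined(__ARM_NEON__)
    for (size_t i = 0; i < n; i += 4) {
        float32x4_t va = vld1q_f32(a + i);
        float32x4_t vb = vld1q_f32(b + i);

        /* initial estimate of 1/b, only a few bits accurate */
        float32x4_t r = vrecpeq_f32(vb);

        /* two Newton-Raphson steps: r = r * (2 - b*r) */
        r = vmulq_f32(vrecpsq_f32(vb, r), r);
        r = vmulq_f32(vrecpsq_f32(vb, r), r);

        /* a/b = a * (1/b) */
        vst1q_f32(out + i, vmulq_f32(va, r));
    }
#else
    /* fallback for non-NEON builds: ordinary scalar division */
    for (size_t i = 0; i < n; ++i)
        out[i] = a[i] / b[i];
#endif
}
```

Calling it with the vectors from the question stored in float arrays of length 4 should give values close to a[i]/b[i]. On NEON the results are not correctly rounded, so compare against a tolerance rather than for exact equality.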

edited Jul 25 '11 at 21:38

answered Jul 25 '11 at 21:10

Stephen Canon

I also thought of this kind of solution, but I didn't know about vrecpeq_f32, so thank you very much. I think the NEON intrinsics need better documentation of the functions they provide. – Darkmax Jul 25 '11 at 22:22

@Darkmax: you should download the architecture reference manuals from ARM, rather than relying on the NEON header documentation. – Stephen Canon Jul 25 '11 at 22:26
