I am currently porting some applications to Arm SVE using the intrinsic functions defined in the Arm C Language Extensions (ACLE) for SVE.
While reading the documentation I came across two functions that sum the elements of a floating-point vector by reduction: one uses a left-to-right reduction (svadda) and the other a tree-based reduction (svaddv).
float64_t svadda[_f64](svbool_t pg, float64_t initial, svfloat64_t op);
float64_t svaddv[_f64](svbool_t pg, svfloat64_t op);
Documentation:
"These functions (ADDV) sum all active elements of a floating-point vector. They use a tree-based rather than left-to-right reduction, so the result might not be the same as that produced by ADDA."
Why would a tree-based reduction give a different result from a left-to-right reduction? Is this only because of floating-point rounding errors, or am I missing something?