Rounding numbers below a certain threshold to zero in Eigen

Question

I'm using Eigen extensively in a scientific application I've been developing for some time. Since I'm implementing a numerical method, a number below a certain threshold (e.g. 1e-15) is not a point of interest, and slowing down calculations, and increasing error rate.

Hence, I want to round off numbers below that threshold to 0. I can do it with a for loop, but hammering multiple relatively big matrices (2M cells and up per matrix) with a for-if loop is expensive and slowing me down since I need to do it multiple times.

Is there a more efficient way to do this with Eigen library?

In other words, I'm trying to eliminate numbers below a certain threshold in my calculation pipeline.

score 4 · Accepted Answer · answered Feb 03 '19 at 16:57

The shortest way to write what you want, is

void foo(Eigen::VectorXf& inout, float threshold)
{
    inout = (threshold < inout.array().abs()).select(inout, 0.0f);
}

However, neither comparisons nor the select method get vectorized by Eigen (as of now).

If speed is essential, you need to either write some manual SIMD code, or write a custom functor which supports a packet method (this uses internal functionality of Eigen, so it is not guaranteed to be stable!):

template<typename Scalar> struct threshold_op {
  Scalar threshold;
  threshold_op(const Scalar& value) : threshold(value) {}
  EIGEN_DEVICE_FUNC EIGEN_STRONG_INLINE const Scalar operator() (const Scalar& a) const{
    return threshold < std::abs(a) ? a : Scalar(0); }
  template<typename Packet>
  EIGEN_DEVICE_FUNC EIGEN_STRONG_INLINE const Packet packetOp(const Packet& a) const {
    using namespace Eigen::internal;
    return pand(pcmp_lt(pset1<Packet>(threshold),pabs(a)), a);
  }
};
namespace Eigen { namespace internal {

template<typename Scalar>
struct functor_traits<threshold_op<Scalar> >
{ enum {
    Cost = 3*NumTraits<Scalar>::AddCost,
    PacketAccess = packet_traits<Scalar>::HasAbs };
};

}}

This can then be passed to unaryExpr:

inout = inout.unaryExpr(threshold_op<float>(threshold));

Godbolt-Demo (should work with SSE/AVX/AVX512/NEON/...): https://godbolt.org/z/bslATI

It might actually be that the only reason for your slowdown are subnormal numbers. In that case, a simple

_MM_SET_FLUSH_ZERO_MODE(_MM_FLUSH_ZERO_ON);

should do the trick (cf: Why does changing 0.1f to 0 slow down performance by 10x?)

Thanks for the tips and the information. I'm aware of the slowdown subnormal numbers are creating, that's also partly why I'm explicitly avoiding numbers smaller than 1e-15. Because it's bordering in subnormal territory and they are not useful for my calculations. — bayindirh, Feb 03 '19 at 18:19
What I meant to say is that you can probably save the effort of manually flushing values to zero and just rely on the automatic functionality of your CPU. — chtz, Feb 03 '19 at 18:29
Also thanks for that, it's greatly appreciated. Sorry if my comment sounded kinda terse. English is not my native language. I was trying to say "You're right, I'm also trying to prevent subnormal floats, that's spot on!". Actually, I was looking for ways to enable it in a cross-platform way right now. — bayindirh, Feb 03 '19 at 18:33
Looking at your godbolt-demo, the `.select()` version is automatically vectorized by GCC, but not clang. — ggael, Feb 03 '19 at 21:51

score 1 · Answer 2 · answered Feb 03 '19 at 14:37

1

Eigen has a method called UnaryExpr which applies a given function pointer to every coefficient in a matrix (it has sparse and array variants too).

Will test its performance and update this answer accordingly.

answered Feb 03 '19 at 14:37

bayindirh

425
6
21

Rounding numbers below a certain threshold to zero in Eigen

2 Answers2

Linked