The shortest way to write what you want, is
void foo(Eigen::VectorXf& inout, float threshold)
{
inout = (threshold < inout.array().abs()).select(inout, 0.0f);
}
However, neither comparisons nor the select
method get vectorized by Eigen (as of now).
If speed is essential, you need to either write some manual SIMD code, or write a custom functor which supports a packet
method (this uses internal functionality of Eigen, so it is not guaranteed to be stable!):
template<typename Scalar> struct threshold_op {
Scalar threshold;
threshold_op(const Scalar& value) : threshold(value) {}
EIGEN_DEVICE_FUNC EIGEN_STRONG_INLINE const Scalar operator() (const Scalar& a) const{
return threshold < std::abs(a) ? a : Scalar(0); }
template<typename Packet>
EIGEN_DEVICE_FUNC EIGEN_STRONG_INLINE const Packet packetOp(const Packet& a) const {
using namespace Eigen::internal;
return pand(pcmp_lt(pset1<Packet>(threshold),pabs(a)), a);
}
};
namespace Eigen { namespace internal {
template<typename Scalar>
struct functor_traits<threshold_op<Scalar> >
{ enum {
Cost = 3*NumTraits<Scalar>::AddCost,
PacketAccess = packet_traits<Scalar>::HasAbs };
};
}}
This can then be passed to unaryExpr
:
inout = inout.unaryExpr(threshold_op<float>(threshold));
Godbolt-Demo (should work with SSE/AVX/AVX512/NEON/...): https://godbolt.org/z/bslATI
It might actually be that the only reason for your slowdown are subnormal numbers. In that case, a simple
_MM_SET_FLUSH_ZERO_MODE(_MM_FLUSH_ZERO_ON);
should do the trick (cf: Why does changing 0.1f to 0 slow down performance by 10x?)