A basic implementation (before applying optimizations such as the ones described by Peter in the comments) might work like this:
    static unsafe bool ContainsUshort(Span<ushort> data, ushort val)
    {
        int vecSize = Vector<ushort>.Count;
        var value = new Vector<ushort>(val);   // val broadcast to every lane
        int i;
        // Pinning the span directly is also valid for an empty span
        // (ptr is then null and the loop below is skipped).
        fixed (ushort* ptr = data)
        {
            int limit = data.Length - vecSize;
            for (i = 0; i <= limit; i += vecSize)
            {
                var d = Unsafe.ReadUnaligned<Vector<ushort>>(ptr + i);
                if (Vector.EqualsAny(d, value))
                    return true;
            }
        }
        // Scalar loop for the tail that does not fill a whole vector
        for (; i < data.Length; i++)
        {
            if (data[i] == val)
                return true;
        }
        return false;
    }
This requires the System.Runtime.CompilerServices.Unsafe package for the unaligned read; without it, creating a vector from a span (or an array) is much less efficient.

By the way, the EqualsAny intrinsic is implemented with (v)ptest rather than (v)pmovmskb. ptest typically costs more µops, so it is comparatively more important to minimize its impact, for example by unrolling the loop so that one test covers two compares. There is no direct access to ptest or pmovmskb through Vector<T> (the newer platform-specific System.Runtime.Intrinsics.X86 API avoids this limitation), so AFAIK the final "vector to condition" step still has to be done with Vector.EqualsAny against a vector filled with 0xFFFF, which is a little silly. Nevertheless, the unrolled loop was a little faster on my machine (tested such that the return value is false, so the slightly earlier exit of the non-unrolled version didn't come into play):
    var allSet = new Vector<ushort>(0xFFFF);
    int limit = data.Length - vecSize * 2;
    for (i = 0; i <= limit; i += vecSize * 2)
    {
        var d0 = Unsafe.ReadUnaligned<Vector<ushort>>(ptr + i);
        var d1 = Unsafe.ReadUnaligned<Vector<ushort>>(ptr + i + vecSize);
        // Combine both compare masks so a single test covers two vectors
        var eq = Vector.Equals(d0, value) | Vector.Equals(d1, value);
        // A lane of eq is 0xFFFF exactly where d0 or d1 matched val
        if (Vector.EqualsAny(eq, allSet))
            return true;
    }
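For comparison, here is a rough sketch of how the same search could look with the platform-specific System.Runtime.Intrinsics.X86 API mentioned above, where pmovmskb is directly available as Sse2.MoveMask. It assumes .NET Core 3.0 or later and a project compiled with unsafe code enabled; the class name X86Search is made up for illustration, and I have not benchmarked this variant.

```csharp
using System;
using System.Runtime.Intrinsics;
using System.Runtime.Intrinsics.X86;

static class X86Search // hypothetical name, for illustration only
{
    public static unsafe bool ContainsUshort(ReadOnlySpan<ushort> data, ushort val)
    {
        int i = 0;
        if (Sse2.IsSupported)
        {
            Vector128<ushort> value = Vector128.Create(val); // val in all 8 lanes
            int vecSize = Vector128<ushort>.Count;
            int limit = data.Length - vecSize;
            fixed (ushort* ptr = data)
            {
                for (; i <= limit; i += vecSize)
                {
                    Vector128<ushort> d = Sse2.LoadVector128(ptr + i);   // unaligned load
                    Vector128<ushort> eq = Sse2.CompareEqual(d, value);  // pcmpeqw
                    if (Sse2.MoveMask(eq.AsByte()) != 0)                 // pmovmskb
                        return true;
                }
            }
        }
        // Scalar loop: handles the tail, or everything when SSE2 is unavailable
        for (; i < data.Length; i++)
        {
            if (data[i] == val)
                return true;
        }
        return false;
    }
}
```

The Sse2.IsSupported check keeps the method usable on hardware without SSE2, at the cost of falling back to the plain scalar loop there. The same unrolling idea applies here too: OR two compare results together and call MoveMask once.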