22

Does anyone know an open-source C++ x86 SIMD intrinsics library?

Intel supplies exactly what I need in their integrated performance primitives library, but I can't use that because of the copyrights all over the place.

EDIT

I already know the intrinsics provided by the compilers. What I need is a convenient interface to use them.

  • What functions exactly do you need? – Jeremiah Willcock Feb 10 '11 at 03:48
  • SSE1/2 with the possibility to upgrade to SSE3/4/AVX in the future. IMO a well-maintained library would have added support for all of them already. –  Feb 10 '11 at 03:59
  • 2
    SSE2 and object-oriented? They sound too unrelated to me. – YeenFei Feb 10 '11 at 05:34
  • 1
    IPP is now included with Intel's ICC compiler and there are no royalties or other licensing restrictions when you use IPP routines in your deliverables. What "copyright issues" are you experiencing, exactly? – Paul R Feb 10 '11 at 08:52
  • @Paul R.: Oh yeah. Somehow it costs $200. –  Feb 10 '11 at 09:21
  • @YeenFei: well, maybe I expressed myself incorrectly. I don't want to have a bunch of functions that operate on one type or another, I want the functionality to be enclosed within the definition of a particular class. –  Feb 10 '11 at 09:23
  • 2
    @jobs34yp: ICC is **free** for non-commercial use on Linux. And if you are aiming at commercial use then the cost of the compiler is negligible compared to the benefits that you will gain on performance-critical code. – Paul R Feb 10 '11 at 11:26

8 Answers

25

Take a look at libsimdpp, a header-only C++ SIMD wrapper library.

The library supports several instruction sets via a single interface: SSE2, SSE3, SSSE3, SSE4.1, AVX, AVX2, AVX512F, XOP, FMA3/4, NEON, NEONv2, Altivec. Clang, GCC, MSVC and ICC are all supported.

Any differences between instruction sets are resolved by implementing the missing instructions as a combination of supported ones. As a bonus, it's possible to compile the same code for several instruction sets, link the resulting object files to a single executable and use a convenient dynamic dispatch mechanism to run the implementation most tailored to the current processor.
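
As a rough illustration of the dispatch idea, a hand-rolled version might pick an implementation once and reuse it on every call. This is only a sketch and not libsimdpp's actual API: `sum`, `sum_scalar`, `sum_sse2` and `select_sum` are hypothetical names, and the runtime check assumes GCC or Clang on x86 for `__builtin_cpu_supports`. On other targets it simply falls back to the scalar path. In a real build, each variant would normally live in its own translation unit compiled with the matching instruction-set flags.

```cpp
#include <cassert>
#include <cstddef>
#if defined(__SSE2__)
#include <emmintrin.h>
#endif

// Hypothetical names throughout; this is not libsimdpp's API.
using sum_fn = float (*)(const float*, std::size_t);

// Portable fallback that works on any target.
static float sum_scalar(const float* p, std::size_t n) {
    float s = 0.0f;
    for (std::size_t i = 0; i < n; ++i) s += p[i];
    return s;
}

#if defined(__SSE2__)
// Variant compiled with SSE2 enabled: four lanes per iteration, scalar tail.
static float sum_sse2(const float* p, std::size_t n) {
    __m128 acc = _mm_setzero_ps();
    std::size_t i = 0;
    for (; i + 4 <= n; i += 4)
        acc = _mm_add_ps(acc, _mm_loadu_ps(p + i));
    float lanes[4];
    _mm_storeu_ps(lanes, acc);
    float s = (lanes[0] + lanes[1]) + (lanes[2] + lanes[3]);
    for (; i < n; ++i) s += p[i];
    return s;
}
#endif

// Chooses the best implementation the running CPU supports.
static sum_fn select_sum() {
#if defined(__SSE2__) && (defined(__GNUC__) || defined(__clang__))
    __builtin_cpu_init();
    if (__builtin_cpu_supports("sse2")) return sum_sse2;
#endif
    return sum_scalar;
}

// Public entry point: the selection runs once, on the first call.
float sum(const float* p, std::size_t n) {
    assert(p != nullptr || n == 0);
    static const sum_fn impl = select_sum();
    return impl(p, n);
}
```

Callers only ever see `sum`, so adding an AVX variant later would mean adding one more function and one more branch in `select_sum`.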

p12
user12
  • 1
    Is libsimdpp still supported and developed? – Walter Sep 26 '14 at 10:37
  • I don't know, but we went with version 1.0 because 2.0 was still in 'beta' and were completely disappointed. Not only was there no documentation, there was also no way of loading values into a register (the functions with names suggesting it weren't doing so) and code with float registers wouldn't even compile. I strongly advise against using it. – Sebastian Graf May 23 '15 at 16:34
  • 2
    @Sebastian You should look into 2.0 version which is now in release-candidate stage. There should be very few bugs, if any, as all supported configurations are continuously tested. The documentation is also significantly improved. Disclaimer: I'm the author of the library. – p12 Mar 17 '16 at 00:41
  • @Sebastian: As for the issues that you have seen, have you reported them? I assume the library has been used incorrectly, as it has been used in significantly large projects without issues. Having said that, the actual 1.0 release could indeed have bugs, though mostly compiler-specific, as only limited testing has been done across all permutations of configurations that are supported and the development quickly moved on. – p12 Mar 17 '16 at 00:42
  • Well, no, I didn't report them. Looking back, at the time of writing that comment I must have been really frustrated and what I wrote was probably not entirely fair. We are however done with our project (semester-long homework) and I haven't looked into libsimdpp since then. – Sebastian Graf Mar 17 '16 at 22:14
12

There are several libraries that have emerged in recent years to abstract explicit SIMD programming. The most important ones:

The most important thing to look for is to have a usable set of types that correctly abstract the best available SIMD registers and instructions for a given target. And, obviously, full portability to systems without SIMD support.
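
To make "a usable set of types" with a non-SIMD fallback a bit more concrete, here is a minimal sketch of what such a type could look like. `float4` and `FLOAT4_HAS_SSE2` are hypothetical names invented for this illustration and are not taken from any of the libraries discussed here. The class uses SSE intrinsics when the compiler targets SSE2 and plain scalar code otherwise, so the calling code stays the same on both.

```cpp
#include <cassert>
#if defined(__SSE2__) || defined(_M_X64)
#  include <emmintrin.h>
#  define FLOAT4_HAS_SSE2 1
#else
#  define FLOAT4_HAS_SSE2 0
#endif

// Hypothetical wrapper type; lanes are ordered x, y, z, w.
class float4 {
public:
    float4(float x, float y, float z, float w) {
        const float lanes[4] = {x, y, z, w};
        load(lanes);
    }

    // Reads four floats from p; no alignment is required.
    void load(const float* p) {
#if FLOAT4_HAS_SSE2
        v_ = _mm_loadu_ps(p);
#else
        for (int i = 0; i < 4; ++i) v_[i] = p[i];
#endif
    }

    // Writes four floats to p; no alignment is required.
    void store(float* p) const {
#if FLOAT4_HAS_SSE2
        _mm_storeu_ps(p, v_);
#else
        for (int i = 0; i < 4; ++i) p[i] = v_[i];
#endif
    }

    // Lane access goes through memory, so keep it out of hot loops.
    float operator[](int i) const {
        assert(i >= 0 && i < 4);
        float lanes[4];
        store(lanes);
        return lanes[i];
    }

    // Lane-wise addition.
    friend float4 operator+(float4 a, const float4& b) {
#if FLOAT4_HAS_SSE2
        a.v_ = _mm_add_ps(a.v_, b.v_);
#else
        for (int i = 0; i < 4; ++i) a.v_[i] += b.v_[i];
#endif
        return a;
    }

    // Lane-wise multiplication.
    friend float4 operator*(float4 a, const float4& b) {
#if FLOAT4_HAS_SSE2
        a.v_ = _mm_mul_ps(a.v_, b.v_);
#else
        for (int i = 0; i < 4; ++i) a.v_[i] *= b.v_[i];
#endif
        return a;
    }

private:
#if FLOAT4_HAS_SSE2
    __m128 v_;      // one SSE register
#else
    float v_[4];    // scalar fallback storage
#endif
};
```

With this in place, an expression such as `a * b + a` compiles to packed instructions on SSE2 targets and to ordinary loops elsewhere. A full library would add wider registers, more element types and more operations on top of the same pattern.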

Vir
  • 2
    The Vectorclass library is not under permissive license -- only GPL or commercial. – p12 Mar 17 '16 at 00:50
6

I wrote a GLSL-style library that compiles to near-perfect assembly code.

A very common operation - cross product:

vec4 cross(const vec4 &a, const vec4 &b)
{
    return a.yzxw * b.zxyw - a.zxyw * b.yzxw;
}

would be converted to this assembly code using glsl-sse2:

_Z5crossRK4vec4S1_:
    movaps    (%rsi), %xmm1
    movaps    (%rdx), %xmm2
    pshufd    $201, %xmm1, %xmm5
    pshufd    $210, %xmm2, %xmm0
    pshufd    $210, %xmm1, %xmm4
    pshufd    $201, %xmm2, %xmm3
    mulps     %xmm0, %xmm5
    mulps     %xmm3, %xmm4
    subps     %xmm4, %xmm5
    movaps    %xmm5, (%rdi)
    ret
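
For comparison, the same computation written by hand with raw SSE intrinsics could look roughly like the sketch below. It is not code from glsl-sse2, and `cross` and `cross3` are names chosen for this illustration. It assumes an x86 target and uses `_mm_shuffle_ps` (`shufps`) rather than the `pshufd` seen in the listing, but the shuffle immediates are the same: 201 selects `yzxw` and 210 selects `zxyw`.

```cpp
#include <cassert>
#include <xmmintrin.h>

// Lanes are (x, y, z, w) from low to high, as in the swizzle names above.
inline __m128 cross(__m128 a, __m128 b) {
    // _MM_SHUFFLE(3, 0, 2, 1) == 201 gives yzxw; _MM_SHUFFLE(3, 1, 0, 2) == 210 gives zxyw.
    const __m128 a_yzxw = _mm_shuffle_ps(a, a, _MM_SHUFFLE(3, 0, 2, 1));
    const __m128 a_zxyw = _mm_shuffle_ps(a, a, _MM_SHUFFLE(3, 1, 0, 2));
    const __m128 b_yzxw = _mm_shuffle_ps(b, b, _MM_SHUFFLE(3, 0, 2, 1));
    const __m128 b_zxyw = _mm_shuffle_ps(b, b, _MM_SHUFFLE(3, 1, 0, 2));
    // a.yzxw * b.zxyw - a.zxyw * b.yzxw; the w lane cancels to zero.
    return _mm_sub_ps(_mm_mul_ps(a_yzxw, b_zxyw), _mm_mul_ps(a_zxyw, b_yzxw));
}

// Convenience wrapper over plain arrays of four floats (unaligned access).
inline void cross3(const float* a, const float* b, float* out) {
    assert(a != nullptr && b != nullptr && out != nullptr);
    _mm_storeu_ps(out, cross(_mm_loadu_ps(a), _mm_loadu_ps(b)));
}
```

The swizzle syntax hides exactly this shuffle bookkeeping, which is the main readability gain of a wrapper library over bare intrinsics.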

Please note that the library isn't perfect yet and most likely has undiscovered bugs, as it is still new.

LiraNuna
5

Have a look at AMD's SSEPlus project; it might be what you're after.

Necrolis
  • There's no mention of anything called SSEPlus there anymore. – Violet Giraffe Feb 23 '13 at 12:14
  • @VioletGiraffe: AMD has this annoying habit of breaking links, updated the link to the sourceforge page – Necrolis Feb 23 '13 at 15:36
  • Last update was in 2009, is it stable? – Violet Giraffe Feb 23 '13 at 17:17
  • 2
    @VioletGiraffe: technically the last update was in 2008, as it was only ever meant to include SSE, and thus lacks newer sets like FMA, XOP and VMX. In terms of stability, it's basically just a massive set of wrappers, so it should be very stable. – Necrolis Feb 24 '13 at 07:42
3

Microsoft has just released its new "DirectXMath" library. It includes support for SSE2 and NEON intrinsics. Documentation looks decent too.

The DirectXMath API provides SIMD-friendly C++ types and functions for common linear algebra and graphics math operations common to DirectX applications. The library provides optimized versions for Windows 32-bit (x86), Windows 64-bit (x64), and Windows on ARM through SSE2 and ARM-NEON intrinsics support in the Visual Studio compiler.

sleep
2

Vc is another C++ library that implements vector classes and allows writing vectorized code that is independent from the actual instruction set that is used.

Robert Rüger
1

You might want to look at macstl - although it was originally developed for the Mac (and PowerPC) it now works on Linux and x86 too.

Also, if you're working with images then look at OpenCV - this has SSE-optimised routines for many common image processing tasks and has C and C++ APIs.

Paul R
0

Which compiler? The Visual Studio C++ compiler supports a set of SSE, SSE2 and MMX intrinsic functions.

selbie