I'm currently writing an application in C# that could benefit a great deal from SSE, as a relatively small piece of code accounts for 90-95% of the execution time. The code itself is also a perfect fit for SSE (it's matrix- and vector-based), so I went ahead and started using Mono.Simd. Although this made a significant difference in execution time, it still isn't enough. The problem with Mono.Simd is that it only exposes fairly old SSE instructions (mainly from SSE1 and SSE2, I believe). As a result, a dot product (scalar/inner product), for example, takes 3 instructions, while with SSE4 it can be done in a single instruction (DPPS, from SSE4.1), and since SSE4 was introduced around 2006-2007, I think one can safely assume that every modern computer has it by now. A number of other operations aren't included at all (taking the absolute value of every element, for example, which also needs a clumsy workaround).
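To make the difference concrete, here is a minimal sketch of the operations I mean, written with C++ intrinsics rather than Mono.Simd. The function names (`dot_hadd`, `dot_sse41`, `abs_ps`) are made up for illustration. I'm assuming the 3-instruction version is a multiply followed by two horizontal adds, and the `__attribute__((target(...)))` lines assume GCC or Clang on x86.

```cpp
#include <xmmintrin.h>  // SSE:    _mm_mul_ps, _mm_andnot_ps, _mm_cvtss_f32
#include <pmmintrin.h>  // SSE3:   _mm_hadd_ps
#include <smmintrin.h>  // SSE4.1: _mm_dp_ps

// Dot product without SSE4: one multiply plus two horizontal adds.
__attribute__((target("sse3")))
float dot_hadd(__m128 a, __m128 b) {
    __m128 m = _mm_mul_ps(a, b);   // element-wise products
    m = _mm_hadd_ps(m, m);         // (m0+m1, m2+m3, m0+m1, m2+m3)
    m = _mm_hadd_ps(m, m);         // every lane now holds the full sum
    return _mm_cvtss_f32(m);       // extract lane 0
}

// Dot product with SSE4.1: a single DPPS instruction.
// Mask 0xF1: multiply all four lanes (high nibble), store the sum in lane 0 (low nibble).
__attribute__((target("sse4.1")))
float dot_sse41(__m128 a, __m128 b) {
    return _mm_cvtss_f32(_mm_dp_ps(a, b, 0xF1));
}

// Absolute value workaround: there is no packed-float abs instruction,
// so clear the sign bit of every lane with a mask.
__m128 abs_ps(__m128 v) {
    const __m128 sign = _mm_set1_ps(-0.0f);  // only the sign bit is set
    return _mm_andnot_ps(sign, v);           // (~sign) & v
}
```

Both dot product versions should give the same result for the same inputs; the SSE4.1 one just gets there in one instruction instead of three.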
My question is: are there any other libraries I can call from within my C# code to make use of SSE/SIMD? As far as I know C# has no inline assembly, but it can call native code (for example through P/Invoke), so I could also use C++ code, even though this incurs a small performance hit. If anyone knows of a relatively easy-to-use C++ library with these functions, that would be acceptable too, I guess.
Thanks in advance for any help.