
So, here is what I am trying to accomplish. In my C++ project, which has to be compiled with Microsoft Visual Studio 2015 or above, I need some code to have different versions depending on the newest SIMD instruction set available in the user's CPU, among: SSE, SSE2, SSE3, SSSE3, SSE4.1, SSE4.2, AVX, AVX2 and AVX512.

Since what I am looking for at this point is compile-time CPU dispatching, my first guess was that it could be easily accomplished using compiler macros. However, to my astonishment, it has been quite hard to find information on how to achieve such CPU dispatching with macros in VS2015.

For instance, the earlier question "Detect the availability of SSE/SSE2 instruction set in Visual Studio" explains how to detect SSE and SSE2 for x86 code, but not for x64 code. It does, however, reference this Microsoft document: http://msdn.microsoft.com/en-us/library/b0084kay.aspx

That document only explains how to detect whether SSE, SSE2, AVX and AVX2 are enabled in the compiler, which is not the same as whether they are supported by the CPU. It also says nothing about the other instruction sets, such as SSE3, SSSE3, SSE4.1, SSE4.2 and AVX512.

So my question is: how can I detect via macros whether the user's CPU supports those instruction sets, as other compilers allow, but with Microsoft Visual Studio 2015?

Community
user123443563
  • 1
    Unless you're only going to run on the same machine you compile on, the set of supported CPU features is not a compile-time constant, and thus can't be a macro. But if you are, just do whatever the MSVC-equivalent of gcc's `-march=native` is, and look at the usual target-feature macros like `#ifdef __AVX__` – Peter Cordes Nov 23 '16 at 06:20
  • 1
    @PeterCordes But that is exactly the problem. There are no macros besides `__AVX__` and `__AVX2__`. The whole point of my question is precisely to investigate how people have achieved that *because* Visual Studio seems to lack such macros. – user123443563 Nov 23 '16 at 06:24
  • It should also define `__SSE2__`, `__SSSE3__`, and so on. I don't know MSVC, but with gcc you can use `-dM` to have the preprocessor dump all the macro definitions. (e.g. `echo | gcc -march=haswell -E - -dM | less`). `__AVX__` implies all the Intel SSE extensions up to and including `__SSE4_2__`. It would be weird if MSVC didn't bother to still define macros for them, though. – Peter Cordes Nov 23 '16 at 06:33
  • But anyway, just to confirm, you are compiling binaries that only have to run on the CPU that compiled them? So you only need compile-time CPU dispatching, not run-time? If so, then at worst manually enable everything your CPU supports, and use the usual macros. All other compilers only define macros according to what's enabled, just like you describe. (But like I said, gcc's `-march=native` enables everything the host supports, and also enables `-mtune=native`, to tune for it as well as just enabling use of instruction-set extensions it supports. e.g. don't bother with REP RET for Intel) – Peter Cordes Nov 23 '16 at 06:34
  • 1
    @PeterCordes As weird as it may be, that seems to be the case. The list of macros for MSVC only includes macros for AVX and AVX2, and, as linked in the question, an indirect macro for SSE2. That's the origin of my question: it's very hard to find information on this. About your question, I need both. For this part, I need compile-time dispatching. But for other parts, I need runtime dispatching, in which case I am using `__cpuid` as per Intel's manual and it works well for the most part. – user123443563 Nov 23 '16 at 06:38
  • I suggest you simplify your question by removing most of the stuff about runtime dispatching. It reads a lot like you think you can use macros as an alternative to runtime CPU dispatching using CPUID, and get equivalent functionality. Also the way you talk about "what the CPU supports" vs. what's enabled at compile time sounds weird to me. – Peter Cordes Nov 23 '16 at 06:42
  • @PeterCordes Oh, you were right. The last part was mixing in a different issue that I have with runtime dispatching, for which I should ask a separate question in the future if I can't find a proper workaround. So, I did simplify the question. About the part "what CPU supports" vs. "what's enabled at compile time": Microsoft's macros for VS detect whether AVX or AVX2 are enabled in the options. For them to be enabled, of course the CPU supports them. But the CPU supporting them does not mean they will be enabled, so what is being detected can be misleading. Still, that's a less important bit – user123443563 Nov 23 '16 at 06:57
  • 1
    *For them to be enabled, of course the CPU supports them*. The problem here is saying "the CPU", as if there were only one. If you're running the compiler on an old computer, building a binary that only needs to run on a new computer, you would enable AVX. The resulting binary won't run on the computer that built it, but that computer can create it just fine. In build-system and compiler terminology, we talk about "target" vs. "host". So enabling AVX is a target option, and macros can detect what's baseline for your target (i.e. enabled for use by the compiler). – Peter Cordes Nov 23 '16 at 07:00
  • re: updated question: *how to detect SSE and SSE2 for x86 code, but not for x64 code*. SSE2 is baseline for x86-64. You don't need to detect SSE2 for x86-64, because it's *always* enabled. However, the `__SSE2__` macro should still be defined, so you can just write `#ifdef __SSE2__` instead of `#if defined(__SSE2__) || defined(__x86_64__)`. I thought that was portable to MSVC, but you seem to be saying it isn't. – Peter Cordes Nov 23 '16 at 07:03
  • Visual Studio 2015 projects (vcxproj) are MsBuild files. You could write a custom task (https://msdn.microsoft.com/en-us/library/t9883dzc.aspx) that outputs defines/environment variable with everything you need, before the build phase. – Simon Mourier Nov 23 '16 at 07:37
  • 2
    "Microsoft's macros for VS detect whether AVX or AVX2 are enabled in the options. For them to be enabled, of course the CPU supports them." No, that's a mistake. You can compile AVX code on a non-AVX CPU. "Compiling AVX code" just means that the compiler emits byte strings which encode AVX operations. Compiling doesn't involve executing the emitted byte strings. – MSalters Nov 23 '16 at 09:36
  • Well, SSE2 is built into the AMD64 instruction set, so if you compile for 64 bit, you can assume that's the lowest available vector extension. – MarcusJ May 26 '18 at 06:52

1 Answer


The problem you're facing is that Visual Studio historically is intended for software vendors. The idea that you compile your own software simply isn't in Microsoft's DNA.

The practical result is that Microsoft hardly cares about the processor of the build machine. That's unlikely to be the processor used to run the software.

On the upside, this also means that Microsoft doesn't suffer from the perennial Linux problem that the build system libraries are assumed to be present on the target machine. Building on Windows 10 for Windows 7 just works.

The compiler also doesn't let you enable instruction sets only up to SSE4.1, for example. On x64 you can only use `/arch:AVX`, `/arch:AVX2`, or nothing. Also, those options only define `__AVX__` and `__AVX2__`, not the usual macros like `__SSSE3__` that gcc/clang/icc define to indicate target support for the earlier instruction sets implied by AVX.
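Since MSVC defines so few of these macros, one workaround is to derive your own level macro from whatever the compiler does provide, in the spirit of the `INSTRSET` macro in Agner Fog's vectorclass library. The following is only a minimal sketch: `SIMD_LEVEL` and `kSimdPath` are hypothetical names invented here, the numbering is arbitrary, and `__AVX512F__` is assumed to be defined only by MSVC versions newer than VS2015 that offer an AVX-512 `/arch` option. On MSVC, levels 3 to 6 are never selected, because the corresponding macros do not exist there.

```cpp
// Hypothetical SIMD_LEVEL macro (name and numbering are assumptions of this sketch):
// 0 = none, 1 = SSE, 2 = SSE2, 3 = SSE3, 4 = SSSE3,
// 5 = SSE4.1, 6 = SSE4.2, 7 = AVX, 8 = AVX2, 9 = AVX-512F
#if defined(__AVX512F__)
#  define SIMD_LEVEL 9
#elif defined(__AVX2__)
#  define SIMD_LEVEL 8
#elif defined(__AVX__)
#  define SIMD_LEVEL 7
#elif defined(__SSE4_2__)   // gcc/clang/icc only; MSVC never defines this
#  define SIMD_LEVEL 6
#elif defined(__SSE4_1__)   // gcc/clang/icc only
#  define SIMD_LEVEL 5
#elif defined(__SSSE3__)    // gcc/clang/icc only
#  define SIMD_LEVEL 4
#elif defined(__SSE3__)     // gcc/clang/icc only
#  define SIMD_LEVEL 3
#elif defined(__SSE2__) || defined(_M_X64) || (defined(_M_IX86_FP) && _M_IX86_FP >= 2)
// SSE2 is baseline on x64; on 32-bit MSVC, _M_IX86_FP == 2 means /arch:SSE2 or higher.
#  define SIMD_LEVEL 2
#elif defined(__SSE__) || (defined(_M_IX86_FP) && _M_IX86_FP >= 1)
#  define SIMD_LEVEL 1
#else
#  define SIMD_LEVEL 0
#endif

// Example of compile-time dispatch on the derived level.
#if SIMD_LEVEL >= 7
static const char* const kSimdPath = "avx";
#elif SIMD_LEVEL >= 2
static const char* const kSimdPath = "sse2";
#else
static const char* const kSimdPath = "scalar";
#endif
```

Like the built-in macros, this only reflects what was enabled for the target at compile time; it says nothing about the CPU the binary eventually runs on.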

Peter Cordes
MSalters
  • Can you make any sense of the OP's claims that MSVC [defines `__AVX__` but not for example `__SSSE3__`](http://stackoverflow.com/questions/40757107/detecting-simd-instruction-sets-to-be-used-with-c-macros-in-visual-studio-2015/40759182#comment68739878_40757107)? That seems highly unlikely, unless the same info is present in differently-named macros. Also, I'm not sure if this question is asking about that, or how to actually do the equivalent of `gcc -march=native` (which you're answering by saying visual studio doesn't support that). – Peter Cordes Nov 23 '16 at 08:31
  • 1
    @PeterCordes: No, it's correct. There is an `__AVX__` macro, but no `__SSSE3__` macro. That's logical: the former matches the `/arch:AVX` command-line option, but there's no `/arch:SSSE3` option. (There's an `/arch:SSE2` option, but not for x64) – MSalters Nov 23 '16 at 09:32
  • Ok, that explains why Agner Fog's vectorclass library uses its own [INSTRSET](https://github.com/pcordes/vectorclass/blob/master/instrset.h#L28) macro; I hadn't looked at the comment about MSVC. I had hoped that macros like `__SSE3__` were portable. :( Also, I'm surprised that there's no way to enable the compiler to auto-vectorize with up to SSSE3 but no higher, for example. I guess the compiler doesn't stop you from using intrinsics for instruction-sets you haven't enabled (like gcc/clang do), if there's no way to enable them at all. – Peter Cordes Nov 23 '16 at 09:58
  • 1
    @PeterCordes http://stackoverflow.com/questions/18563978/sse-sse2-is-enabled-control-in-visual-studio/18570487#18570487 – Z boson Nov 23 '16 at 11:21
  • 1
    I am not sure why MSVC has `/arch:AVX2` and defines `__AVX2__`. I have been wondering why they do this, because they don't define anything for SSE4.1, for example. `/arch:AVX` makes sense in order to specify VEX encoding, but why `/arch:AVX2`? GCC optimizes the code based on the instruction set you specify, but MSVC does not do this for SSE, except in some cases such as auto-vectorization, where it actually builds in code to detect SSE4.1 so that the code works for SSE2 and is optimized for SSE4.1 as well. With GCC you would specify `-msse4.1` and the code would be optimized for that but would not work on SSE2-only CPUs. – Z boson Nov 23 '16 at 11:26
  • 1
    Here is the example where MSVC auto-detects SSE4.2 and uses it with auto-vectorization http://stackoverflow.com/a/25495173/2542702 – Z boson Nov 23 '16 at 11:42
  • @Zboson: Note that it's a _runtime_ detection. The compiler doesn't do a compile-time detection of SSE4.2. – MSalters Apr 24 '18 at 12:30
  • @MSalters, that's what I meant by `auto-detects`. MSVC implicitly checks at runtime. GCC can check at compile time with `-march=native` or at runtime with `__builtin_cpu_supports`. However, `__builtin_cpu_supports` [does not check AVX properly](https://stackoverflow.com/a/48764832/2542702). – Z boson Apr 25 '18 at 07:41
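
For the runtime side discussed in these comments, a hand-written check might look roughly like the sketch below. It is not code from the question or the answer: `CpuFeatures`, `cpuid_query`, `read_xcr0` and `detect_cpu_features` are hypothetical names, and the `SKETCH_CPUID_*` macros exist only to select between MSVC's `<intrin.h>` and GCC/Clang's `<cpuid.h>`. The bit positions follow the CPUID and XGETBV descriptions in Intel's manual. AVX is reported only if the OS has also enabled saving the YMM state, which is the part a plain CPUID bit test misses.

```cpp
#include <cstdint>

#if defined(_MSC_VER) && (defined(_M_IX86) || defined(_M_X64))
#  include <intrin.h>   // __cpuidex, _xgetbv
#  define SKETCH_CPUID_MSVC 1
#elif defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
#  include <cpuid.h>    // __cpuid_count (GCC/Clang)
#  define SKETCH_CPUID_GNU 1
#endif

// Hypothetical result type for this sketch; everything defaults to "not supported".
struct CpuFeatures {
    bool sse2 = false, sse3 = false, ssse3 = false;
    bool sse41 = false, sse42 = false, avx = false, avx2 = false;
};

// Runs CPUID for the given leaf/subleaf; r = {EAX, EBX, ECX, EDX}.
// On non-x86 targets it leaves all registers at zero.
inline void cpuid_query(std::uint32_t leaf, std::uint32_t subleaf, std::uint32_t (&r)[4]) {
    r[0] = r[1] = r[2] = r[3] = 0;
#if defined(SKETCH_CPUID_MSVC)
    int regs[4];
    __cpuidex(regs, static_cast<int>(leaf), static_cast<int>(subleaf));
    for (int i = 0; i < 4; ++i) r[i] = static_cast<std::uint32_t>(regs[i]);
#elif defined(SKETCH_CPUID_GNU)
    __cpuid_count(leaf, subleaf, r[0], r[1], r[2], r[3]);
#else
    (void)leaf; (void)subleaf;
#endif
}

// Reads XCR0; only call this after CPUID has reported OSXSAVE.
inline std::uint64_t read_xcr0() {
#if defined(SKETCH_CPUID_MSVC)
    return _xgetbv(0);
#elif defined(SKETCH_CPUID_GNU)
    std::uint32_t lo, hi;
    __asm__ volatile("xgetbv" : "=a"(lo), "=d"(hi) : "c"(0));
    return (static_cast<std::uint64_t>(hi) << 32) | lo;
#else
    return 0;
#endif
}

inline CpuFeatures detect_cpu_features() {
    CpuFeatures f;
    std::uint32_t r[4];
    auto bit = [](std::uint32_t reg, int n) { return ((reg >> n) & 1u) != 0; };

    cpuid_query(0, 0, r);
    const std::uint32_t max_leaf = r[0];
    if (max_leaf < 1) return f;

    cpuid_query(1, 0, r);
    const std::uint32_t ecx = r[2], edx = r[3];
    f.sse2  = bit(edx, 26);
    f.sse3  = bit(ecx, 0);
    f.ssse3 = bit(ecx, 9);
    f.sse41 = bit(ecx, 19);
    f.sse42 = bit(ecx, 20);

    // AVX needs the CPU bit, OSXSAVE, and XCR0 bits 1 and 2 (XMM and YMM state).
    const bool osxsave = bit(ecx, 27);
    const bool os_avx = osxsave && (read_xcr0() & 0x6u) == 0x6u;
    f.avx = os_avx && bit(ecx, 28);

    if (max_leaf >= 7) {
        cpuid_query(7, 0, r);
        f.avx2 = f.avx && bit(r[1], 5);   // leaf 7, EBX bit 5
    }
    return f;
}
```

A caller would typically run `detect_cpu_features()` once at startup and choose function pointers based on the result, rather than querying CPUID in hot code.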