
I am having trouble understanding the purpose of the -xhost flag used with icc.

On the intel website, it states:

xHost, QxHost

Tells the compiler to generate instructions for the highest instruction set available on the compilation host processor.

I am not sure what is meant by "highest instruction set".

Also, I see something about SIMD here. If -xhost can speed up your code, why would someone choose not to use this flag?

Peter Cordes
5Pack
    It's pretty much the same thing as GCC/clang `-march=native`, see [What are my available march/mtune options?](https://stackoverflow.com/q/53156919) / [GCC: how is march different from mtune?](https://stackoverflow.com/q/10559275) . Also related [What exactly do the gcc compiler switches (-mavx -mavx2 -mavx512f) do?](https://stackoverflow.com/a/71234534) which mentions the ICC / MSVC vs. clang/GCC difference in terms of (not) needing to enable ISA extension to use intrinsics for them. – Peter Cordes Mar 15 '22 at 22:08

2 Answers


The -xhost flag tells the compiler to generate code optimized for the capabilities of your current CPU (that is, the one in the computer you're using to do the compilation).

By "highest instruction set", it means that the compiler will automatically turn on the code-generation flags corresponding to the highest instruction set supported by your CPU. So, if your CPU only supports SSE2, then that's all that will be turned on. If it supports AVX2, then that option will be turned on. Whatever the highest instruction set extension that your CPU supports, the compiler will generate code targeting that instruction set extension.
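To make "highest instruction set" concrete, here is a minimal sketch of the selection logic described above. It is not how ICC implements -xhost. The `cpu_features` struct, the `isa_level` enum, and both function names are hypothetical, and treating the extensions as one strictly ordered ladder is a simplification. The host detection relies on the GCC/Clang builtin `__builtin_cpu_supports`, so it is compiled only for x86 builds with those compilers and reports no features elsewhere.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical summary of what a CPU reports; not an ICC data structure. */
struct cpu_features {
    bool sse2;
    bool sse4_2;
    bool avx;
    bool avx2;
    bool avx512f;
};

/* Levels ordered from oldest to newest (a simplification for illustration). */
enum isa_level {
    ISA_BASELINE,
    ISA_SSE2,
    ISA_SSE4_2,
    ISA_AVX,
    ISA_AVX2,
    ISA_AVX512F
};

/* Return the newest level the CPU reports, checking from the top down. */
enum isa_level highest_isa(const struct cpu_features *f)
{
    assert(f != NULL);
    if (f->avx512f) return ISA_AVX512F;
    if (f->avx2)    return ISA_AVX2;
    if (f->avx)     return ISA_AVX;
    if (f->sse4_2)  return ISA_SSE4_2;
    if (f->sse2)    return ISA_SSE2;
    return ISA_BASELINE;
}

/* Fill in the record for the machine this code is running on.
 * Assumption: GCC or Clang on x86; other builds get an all-false record. */
struct cpu_features detect_host_features(void)
{
    struct cpu_features f = { false, false, false, false, false };
#if defined(__GNUC__) && !defined(__INTEL_COMPILER) && \
    (defined(__x86_64__) || defined(__i386__))
    __builtin_cpu_init();
    f.sse2    = __builtin_cpu_supports("sse2") != 0;
    f.sse4_2  = __builtin_cpu_supports("sse4.2") != 0;
    f.avx     = __builtin_cpu_supports("avx") != 0;
    f.avx2    = __builtin_cpu_supports("avx2") != 0;
    f.avx512f = __builtin_cpu_supports("avx512f") != 0;
#endif
    return f;
}
```

Calling `highest_isa` on the result of `detect_host_features()` gives the level that a host-targeting flag would roughly correspond to on that machine.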

This option is generally used when you want to build code to run on the same computer where you're building it. For example, when building a scientific algorithm that you'll run on the same computer, or when compiling your own Linux kernel.

Technically speaking, the generated binaries will run on any computer that supports at least the same instruction set extensions as the build computer, which is why the documentation talks about "the highest instruction set available on the compilation host processor".

As Peter Cordes already noted in a comment, ICC's -xhost flag is essentially equivalent to GCC and Clang's -march=native flag. Both tell the compiler to automatically turn on all options that match what the host CPU is capable of, generating a binary optimized for the host CPU that will still run on other CPUs as long as they have equal or higher capabilities.

You can do exactly the same thing that -xhost is going to do by looking up the specifications for your computer's CPU and adding the corresponding code-gen options to the compiler command line. -xhost just does it for you, looking up what your host CPU supports and enabling those flags automatically, without you having to do the legwork. So, it is a convenience feature; nothing more, nothing less.

The -xhost flag can, indeed, speed up your code by taking advantage of certain instruction set extensions, but it can also result in a binary that won't work at all (on a different computer that doesn't support the same instruction set extensions as your build computer). Maybe that's not a problem for you; in that case, you'd definitely turn on the -xhost flag. But, in many cases, we software developers are building binaries for other people to run, and in that case, we have to be a bit more careful about exactly which CPUs we want to exclude.
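The compatibility risk can be shown with a small sketch. The function names are hypothetical. It relies on the `__AVX2__` macro, which compilers predefine when AVX2 code generation is enabled, and uses AVX2 as a stand-in for whatever extension the build host happens to have. The caller supplies the runtime answer to "does this CPU have AVX2?".

```c
#include <stdbool.h>

/* True if this translation unit was compiled with AVX2 code generation
 * enabled, for example through a host-targeting flag on an AVX2 machine. */
bool build_requires_avx2(void)
{
#ifdef __AVX2__
    return true;
#else
    return false;
#endif
}

/* A build that assumes AVX2 is only safe on CPUs that have it; a build
 * that does not assume it runs on either kind of CPU. */
bool can_run_here(bool cpu_has_avx2)
{
    return cpu_has_avx2 || !build_requires_avx2();
}
```

A check like this is only useful if the code performing it does not itself depend on the newer instructions, so in practice it would have to live in a file compiled for the baseline target.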

It is also worth noting that Intel's compiler can actually generate a single executable with dynamic dispatching support that allows you to support two different architectures. See Sergey L.'s answer to a related question for more details.
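For intuition, here is a hand-written sketch of what runtime dispatching means. It does not show the code ICC generates, and all names are hypothetical. Both variants are plain C so the example runs anywhere. In a real build, the "wide" variant would be the one compiled for the newer instruction set, and the boolean would come from a CPU feature query at startup.

```c
#include <stdbool.h>
#include <stddef.h>

typedef long (*sum_fn)(const int *data, size_t n);

/* Variant that is safe on every CPU the program supports. */
static long sum_baseline(const int *data, size_t n)
{
    long total = 0;
    for (size_t i = 0; i < n; i++)
        total += data[i];
    return total;
}

/* Stand-in for a variant built for a newer instruction set. It handles
 * four elements per iteration, then finishes any leftover elements. */
static long sum_wide(const int *data, size_t n)
{
    long total = 0;
    size_t i = 0;
    for (; i + 4 <= n; i += 4)
        total += (long)data[i] + data[i + 1] + data[i + 2] + data[i + 3];
    for (; i < n; i++)
        total += data[i];
    return total;
}

/* Choose an implementation once, based on what the running CPU offers. */
sum_fn select_sum(bool cpu_has_newer_isa)
{
    return cpu_has_newer_isa ? sum_wide : sum_baseline;
}
```

The program calls `select_sum` once and keeps the returned pointer, so the same executable gives the same results on older and newer CPUs while using the faster path where it is available.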

Cody Gray - on strike
  • *You can do exactly the same thing ...* - does ICC `-xhost` not imply `-mtune=native`? Manually turning on `-mavx2 -mfma -mbmi2 -mcx16` and so on wouldn't set tune options. Unless you meant looking up the appropriate option like `-march=icelake-server` if that's what your host is. (ICC does accept `-march=` and `-mtune=` for compatibility with GCC, with I think the same effect.) – Peter Cordes Mar 16 '22 at 02:11
  • @PeterCordes Hmm, that's a good point... Don't know the answer off the top of my head. As far as I know, `-[Q]xHost` is equivalent to specifying `-march=native`, without specifying any tuning, but I am not 100% certain of that. [The documentation](https://www.intel.com/content/www/us/en/develop/documentation/cpp-compiler-developer-guide-and-reference/top/compiler-reference/compiler-options/compiler-option-details/code-generation-options/xhost-qxhost.html) I found doesn't clarify this, either. But, yes, I was thinking of something like `-march=`. – Cody Gray - on strike Mar 16 '22 at 04:24
  • ICC is a bit confusing because, as you said, it accepts both GCC-style *and* MSVC-style compiler options for compatibility, and those tend to have slightly different semantics. MSVC doesn't have anything equivalent to `-mtune`, but the docs I'm looking at say that ICC supports `/tune:processor` as an additional option on Windows on top of its MSVC-compatible options. – Cody Gray - on strike Mar 16 '22 at 04:25

To add to the answer by @Cody Gray: sometimes you don't want to use the -xhost flag. On a supercomputer cluster you often do your compilation on a "login node" and your code executes on a "compute node". These two can have slightly (or sometimes: very) different architectures. So, when compiling on the login node, you tell the compiler explicitly which architecture to target, and you don't use the -xhost flag, which might produce a binary that cannot run on the compute node.

Victor Eijkhout
  • Thanks for adding some additional information! It seems odd that you would compile on the "login node" instead of on the "compute node". Isn't compilation computation? Wouldn't you want it running as efficiently as possible? (I say this with limited to no experience using a supercomputer cluster, of course; just noodling about the semantic meaning of the terms.) – Cody Gray - on strike Mar 16 '22 at 04:26
  • @CodyGray Compute nodes are accessed through a batch mechanism. They are in great demand so may not be immediately available. The login nodes are for setting up your job, which includes moving data about, setting up batch scripts, compiling, and doing minimal post-processing. That said, if you are installing a gigantic library that takes half an hour to compile, you should still do it on a compute node. – Victor Eijkhout Mar 16 '22 at 10:34