1

I'm a hobbyist who likes to write and run my own programs in Go. As Xeon Phi processors get older they're also becoming extremely cheap, so cheap that I can build a dual-socket machine from 2015/16 for under $1000.

I'm trying to find out if I can run Go programs on these. From what I've seen, this thread says they won't run (and suggests trying gccgo), apparently because the code partially relies on the x87 ISA. Confusingly, the Go release notes say x87 support is being dropped in 1.16, implying it was supported in the past. I've seen in other threads that all programs will run on the compatibility layer, but that it's an extremely slow layer with access to only a small portion of the CPU's cache.

I feel like I'm moving farther and farther out of my element; I was wondering if someone who's used Xeon Phi knows if it will run Go code? Or just in general, after booting up Ubuntu (or FreeBSD, something that I've seen done and is listed in motherboard specs) what sort of things aren't going to work and what will?

I appreciate any and all help!

Jonathan Hall
haxonek
  • Normal FP math should be using scalar SSE instructions, not x87. I don't know Go, but unless it has a "long double" type or equivalent, 64-bit code should have no reason to use the x87 FPU. Of course, KNL is x86-64 including an x87 FPU, so it would run, but with pretty bad performance because its x87 FPU isn't fully pipelined for all instructions, e.g. one fmul per 2 cycles. – Peter Cordes Jan 20 '21 at 01:33
  • @PeterCordes Upon re-reading, I think I typed that up improperly. In the first link someone explains: "No, I think you should compile and expect it to *not* work. The Knight's Corner processor is based on an x86-64 foundation, yes, but it in fact has its own floating-point instruction set: no x87, no AVX, no SSE, no MMX... Oh, and then you can throw all that away when Knight's Landing (KNL) comes out. It uses AVX-512F as its floating-point instruction set." But even this might be wrong, because I can't tell if they're talking about the processors or the co-processors. – haxonek Jan 20 '21 at 02:05
  • Umm, so what? You would handle that during compilation... me = cornfused. If you wanted to emulate x86 floating point, e.g. x87, then that is also possible but slower... – Jay Jan 20 '21 at 02:06
  • Oh, you're talking about early Xeon Phi, based on KNC (Knight's **Corner**). Yeah, that's different: it only has its own variant / precursor of AVX-512 in a core based on P5-Pentium, and maybe no other FP hardware. Your title and tags say Knight's **Landing**, which is Silvermont + AVX + AVX2 + AVX-512, including all of baseline x86-64 like normal Silvermont does (up to and including SSE4.2, as well as x87). Agner Fog's instruction tables have KNL timings for x87 instructions like `fmul`, so that's absolute proof it works. https://www.agner.org/optimize/instruction_tables.pdf – Peter Cordes Jan 20 '21 at 02:06
  • I'm looking at purchasing a Knight's Landing processor; I just didn't know whether this problem would carry over, or whether AVX-512F would be a problem. – haxonek Jan 20 '21 at 02:11
  • Then you're fine, like I said. KNL *supports* AVX-512F (compatible with Skylake-AVX512 for the [subsets of AVX512 they both have](https://en.wikipedia.org/wiki/AVX-512#CPUs_with_AVX-512)), but it also supports x87, SSE, and AVX so it can run code that hasn't been compiled to use AVX-512. See https://en.wikipedia.org/wiki/Xeon_Phi. Only KNC leaves out the "legacy" ways of doing FP math. – Peter Cordes Jan 20 '21 at 02:13

2 Answers

4

You're basing your Knight's Landing worries on this quote about Knight's Corner:

The Knight's Corner processor is based on an x86-64 foundation, yes, but it in fact has its own floating-point instruction set—no x87, no AVX, no SSE, no MMX... Oh, and then you can throw all that away when Knight's Landing (KNL) comes out.

By "throw all that away", they mean all the worries and incompatibilities. KNL is based on Silvermont and is fully x86-64 compatible (including x87, SSE, and SSE2 for both standard ways of doing FP math). It also supports AVX-512F, AVX-512ER, and a few other AVX-512 extensions, along with AVX and AVX2 and SSE up to SSE4.2. That makes it a lot like a Skylake-server CPU, except with a different set of AVX-512 extensions.

The point of this is exactly to solve the problem you're worried about: any legacy binary can run on KNL. To get good performance out of it, you want to be running code vectorized with AVX-512 in the loops that do the heavy lifting, but all the surrounding code and other programs in the rest of the Linux distro or whatever can be ordinary bog-standard code that uses x87 and/or SSE.
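
If you want to confirm this on a machine you have booted, one option is to read the feature flags Linux reports in /proc/cpuinfo. The following is only a minimal sketch, assuming a Linux host; the helper names `cpuFlags` and `missingFlags` and the particular list of flags checked are my own choices for illustration, not anything from Go's toolchain.

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

// cpuFlags parses /proc/cpuinfo-style text and returns the feature flags
// from the first "flags" line as a set. (Hypothetical helper.)
func cpuFlags(cpuinfo string) map[string]bool {
	flags := make(map[string]bool)
	for _, line := range strings.Split(cpuinfo, "\n") {
		key, value, found := strings.Cut(line, ":")
		if !found || strings.TrimSpace(key) != "flags" {
			continue
		}
		for _, f := range strings.Fields(value) {
			flags[f] = true
		}
		break // every logical CPU repeats the same line; one is enough
	}
	return flags
}

// missingFlags returns the entries of want that are absent from have,
// preserving their order. (Hypothetical helper.)
func missingFlags(have map[string]bool, want []string) []string {
	var missing []string
	for _, w := range want {
		if !have[w] {
			missing = append(missing, w)
		}
	}
	return missing
}

func main() {
	data, err := os.ReadFile("/proc/cpuinfo")
	if err != nil {
		fmt.Println("could not read /proc/cpuinfo:", err)
		return
	}
	// Assumed selection: x87 FPU, the SSE2 baseline, and two newer extensions.
	want := []string{"fpu", "sse2", "avx2", "avx512f"}
	fmt.Println("missing flags:", missingFlags(cpuFlags(string(data)), want))
}
```

An empty list means the kernel sees all of the requested features; on a KNC coprocessor you would expect the legacy ones to be reported as missing.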


Knight's Corner (first-gen commercial Xeon Phi) has its own variant / precursor of AVX-512 in a core based on P5-Pentium, and no other FP hardware.

Knight's Landing (second-gen commercial Xeon Phi) is based on Silvermont, with AVX-512, and is the first that can act as a "host" processor (bootable) instead of just a coprocessor.

This "host" mode is another reason for including enough hardware to decode and execute x87 and SSE: if you're running a whole system on KNL, you're much more likely to want to execute some legacy binaries for non-perf-sensitive tasks, not only binaries compiled specifically for it.

Its x87 performance is not great, though: about one scalar fmul per 2 clocks (https://agner.org/optimize), vs. 2-per-clock SSE mulsd (0.5c reciprocal throughput). Other SSE/AVX math has the same 0.5c throughput, including AVX-512 vfmadd132ps zmm, which does 16 single-precision fused multiply-add operations in one instruction.
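
To put those throughput figures side by side, here is a small back-of-the-envelope calculation. It is a sketch only: the function names are made up for illustration, the numbers are the per-core peak figures quoted above, and counting one FMA lane as two floating-point operations is a convention I'm assuming, not something measured.

```go
package main

import "fmt"

// perClock converts a reciprocal throughput (clocks per instruction)
// into instructions per clock.
func perClock(recipThroughput float64) float64 {
	return 1 / recipThroughput
}

// flopsPerClock multiplies instructions per clock by the number of vector
// lanes and the floating-point operations counted per lane.
func flopsPerClock(recipThroughput float64, lanes, flopsPerLane int) float64 {
	return perClock(recipThroughput) * float64(lanes) * float64(flopsPerLane)
}

func main() {
	x87 := flopsPerClock(2.0, 1, 1)  // fmul: one scalar multiply per 2 clocks
	sse := flopsPerClock(0.5, 1, 1)  // mulsd: two scalar multiplies per clock
	fma := flopsPerClock(0.5, 16, 2) // vfmadd132ps zmm: 16 lanes, multiply + add

	fmt.Printf("x87 fmul:        %.1f FLOP/clock\n", x87)
	fmt.Printf("SSE mulsd:       %.1f FLOP/clock\n", sse)
	fmt.Printf("AVX-512 FMA:     %.1f FLOP/clock\n", fma)
	fmt.Printf("mulsd vs fmul:   %.0fx\n", sse/x87)
}
```

Under those assumptions the scalar SSE path is 4x the x87 rate (2 vs. 0.5 per clock), and the 512-bit FMA peaks at 64 single-precision FLOP per clock per core. Note that fmul and mulsd here are scalar operations while the FMA figure is for single precision, so the last comparison is not like for like.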

So hopefully Go's compiler doesn't use x87 much. The normal way to do scalar math in 64-bit mode (that C compilers and their math libraries use) is SSE, in XMM registers. x86-64 C compilers only use x87 for types like long double.
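
As far as I know, the standard Go compiler on amd64 does the same thing: SSE2 is part of the x86-64 baseline, and float64 math is compiled to scalar SSE instructions. The x87 mode dropped in Go 1.16 was, as I understand the release notes, the `GO386=387` setting of the 32-bit port, so it should not affect 64-bit builds. If you want to check for yourself, a tiny test program like the following sketch is enough; the function name `scale` is just an example.

```go
package main

import "fmt"

// scale multiplies two float64 values. The noinline directive keeps it as a
// separate function so it is easy to find in the assembly listing.
//
//go:noinline
func scale(x, y float64) float64 {
	return x * y
}

func main() {
	fmt.Println(scale(1.5, 2.0))
}
```

Building it with `go build -gcflags=-S` prints the compiler's assembly listing; in the body of `scale` I would expect to see a `MULSD` on X registers rather than any x87 instruction, but treat that as something to verify on your own toolchain version.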

Peter Cordes
0

Yes:

Xeon Phi is a series of x86 manycore processors designed and made by Intel. It is intended for use in supercomputers, servers, and high-end workstations. Its architecture allows use of standard programming languages and application programming interfaces (APIs) such as... See also https://en.wikipedia.org/wiki/Xeon_Phi

If you can compile Go on an x86 processor, then you will be able to compile it on that specific x86 processor, which is manufactured by Intel.

Xeon is not Itanium :)

On such systems you would also be able to compile Go; you would just need to provide a suitable C compiler...

What makes you think you would otherwise not be able to compile Go on, say, an Atari or perhaps an Arduino?

If you can elaborate on that perhaps I can improve my terrible answer further.

Jay
  • Hmm, the more I look into it, the more I think I might be confusing the co-processors with the regular processor. That said, I think there's some sort of issue with how floating-point math is done (AVX-512F?), and I'm not sure if this is going to be completely prohibitive, ruin performance, or just be a non-issue. – haxonek Jan 20 '21 at 02:09
  • Where there is a will there is a way. You asked about compiling, not performance; I am sure you will make it so :) – Jay Jan 20 '21 at 02:12