
I ran into this problem while compiling my project with Clang. I want to use the intrinsic function _m_prefetchw, so I included x86intrin.h, but the build never reaches the _m_prefetchw definition. I checked Clang's x86intrin.h header and found that __PRFCHW__, which is needed for prfchwintrin.h to be included, is not defined in my build, even though my PC supports PREFETCHW (I ran coreinfo to check this).

Does anyone know why __PRFCHW__ isn't defined even though PREFETCHW is supported?

Code example:

#include <x86intrin.h>

int main(){
    int i = 10;
    _m_prefetchw(&i);
    return 0;
}

When I build this, I get the linker error: error LNK2019: unresolved external symbol _m_prefetchw referenced in function main

I dug into my clang include header files and found this in x86intrin.h:

#if !defined(_MSC_VER) || __has_feature(modules) || defined(__PRFCHW__)
#include <prfchwintrin.h>
#endif

And _m_prefetchw is defined in the prfchwintrin.h file.

My processor is Intel Xeon E5-2690, Clang version is 9.0.1.

Peter Cordes
super-user
  • 3
    Please reduce your code to [mre] (including compiler version and compile options) and post that here. – chtz Jul 23 '20 at 08:42
  • 3
    The comment by @chtz is very relevant, as is your clang version. Now, have you checked (with `objdump`) whether your binary contains the prefetch instruction? Do you pass `-march` flags to `clang`? Can you compile an inline assembly snippet with this instruction? Finally, will `__builtin_prefetch` work for your needs? – TSG Jul 23 '20 at 08:51
  • Thanks I updated the question, please tell me if you have any more comments. – super-user Jul 23 '20 at 12:31
  • I didn't pass the -march flag, because I want to compile for my machine's arch. I want to know why _m_prefetchw is not defined. – super-user Jul 23 '20 at 12:33
  • 2
    If you want to compile for your machine's own architecture, use `-march=native`. By default clang will compile for whatever architecture was selected when it was built, which is usually some "lowest common denominator" model that isn't assumed to support this instruction. – Nate Eldredge Jul 23 '20 at 12:53
  • Thanks for the info, I'm still getting the same error even with the flag set. With -march=broadwell it does compile, but I want to know why it's not compiling for my machine's processor. – super-user Jul 23 '20 at 12:56
  • 3
    Ah, interesting. Right, older Intel CPUs can run it as a NOP, but only Broadwell and later actually advertise the CPU feature in CPUID, and only Broadwell and later actually do anything besides a NOP. (AMD CPUs support it properly). So I assume your CPU is a Haswell or older. [What is the effect of second argument in \_builtin\_prefetch()?](https://stackoverflow.com/q/40513280). So `-march=native` only includes `-mprfchw` if it will actually have an effect. Related: [Windows 10 64-bit requirements: Does my CPU support PrefetchW?](https://superuser.com/posts/comments/1645066) – Peter Cordes Jul 23 '20 at 13:08
  • What actual CPU model do you have? – Peter Cordes Jul 23 '20 at 13:13
  • I did mention it in the question "although I do have PREFETCHW supported by my PC (I ran coreinfo to know this)." and by the coreinfo details my CPU does support PREFETCHW. I have Intel Xeon E5-2690 processor. – super-user Jul 23 '20 at 13:17
  • Right, just noticed that at the bottom. Earlier I only saw the coreinfo details which is apparently only checking for "won't fault", not for the CPUID feature bit and/or actually having a performance effect. i.e. I could tell the coreinfo result must be useless for this purpose. – Peter Cordes Jul 23 '20 at 13:33

1 Answer


Manually use -mprfchw to tell the compiler to let you use _m_prefetchw even when compiling for a -march= where prefetchw is only a NOP.

-march=native only includes -mprfchw if it will actually have an effect. See What is the effect of second argument in _builtin_prefetch()? for more details on how compilers "think about" availability of prefetch instructions and CPUID.
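As a rough illustration of the point above, here is a minimal sketch of a write-prefetch helper that builds whether or not PRFCHW is enabled. The names prefetch_for_write and prfchw_enabled_at_build are hypothetical, and the fallback assumes a GCC or Clang style compiler that provides __builtin_prefetch; it is not taken from the question's project.

```c
/* Sketch only: prefetch_for_write and prfchw_enabled_at_build are
   hypothetical helper names, not part of any library. */
#if defined(__PRFCHW__)
#include <x86intrin.h>   /* provides _m_prefetchw when PRFCHW is enabled */
#endif

/* Returns 1 if this file was compiled with PRFCHW enabled
   (for example with -mprfchw or -march=broadwell), 0 otherwise. */
int prfchw_enabled_at_build(void)
{
#if defined(__PRFCHW__)
    return 1;
#else
    return 0;
#endif
}

/* Hint to the CPU that *p is about to be written. */
void prefetch_for_write(void *p)
{
#if defined(__PRFCHW__)
    _m_prefetchw(p);
#else
    /* Assumes a GCC/Clang builtin: rw = 1 (write), locality = 3 (high). */
    __builtin_prefetch(p, 1, 3);
#endif
}
```

Compiling with -mprfchw should take the first branch; without it, the builtin leaves the choice of instruction to the compiler. Either way the call is only a hint and does not change what the program computes.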


Your E5-2690 is a Sandy Bridge, older than Broadwell, which introduced (on the Intel side) real support for PREFETCHW.

Any non-ancient Intel CPU can run prefetchw as a NOP (http://ref.x86asm.net/coder64.html#gen_note_NOP_0F0D), but only Broadwell and later advertise the feature in CPUID, and only Broadwell and later do anything different from a NOP. (AMD CPUs have supported it as an actual prefetch into Exclusive state ever since 3DNow! introduced it.)
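To see what the CPU itself advertises, rather than whether the instruction merely runs without faulting, one option is to read the CPUID feature bit directly. The sketch below assumes the GCC/Clang <cpuid.h> helper __get_cpuid; the function names are hypothetical. The PREFETCHW (3DNowPrefetch) flag is ECX bit 8 of CPUID leaf 0x80000001.

```c
#if defined(__x86_64__) || defined(__i386__)
#include <cpuid.h>   /* GCC/Clang helper header for the CPUID instruction */
#endif

/* PREFETCHW is reported in CPUID leaf 0x80000001, ECX bit 8. */
#define PREFETCHW_ECX_BIT 8u

/* Hypothetical helper: extract the PREFETCHW flag from an ECX value. */
int ecx_has_prefetchw(unsigned int ecx)
{
    return (int)((ecx >> PREFETCHW_ECX_BIT) & 1u);
}

/* Hypothetical helper: returns 1 if the CPU advertises PREFETCHW,
   0 if it does not, and -1 if the leaf cannot be queried
   (non-x86 target or extended leaf not supported). */
int cpu_advertises_prefetchw(void)
{
#if defined(__x86_64__) || defined(__i386__)
    unsigned int eax = 0, ebx = 0, ecx = 0, edx = 0;
    if (!__get_cpuid(0x80000001u, &eax, &ebx, &ecx, &edx))
        return -1;
    return ecx_has_prefetchw(ecx);
#else
    return -1;
#endif
}
```

Going by the explanation above, a pre-Broadwell Intel CPU would be expected to report 0 here even though it executes the instruction as a NOP, which would match -march=native leaving __PRFCHW__ undefined.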

Running as a NOP instead of faulting is apparently necessary for installing 64-bit Windows, so a lot of discussion about "supporting" PREFETCHW revolves around not faulting, rather than its CPUID bit and actually doing anything. For example, comments on Windows 10 64-bit requirements: Does my CPU support PrefetchW? discuss this difference in "support" (as in won't fault) vs. "support" as in actually does something.

This forum thread mentions that P4 Nocona faults on prefetchw, and thus can't install Windows 8.1. But Core2 and later do have "won't fault" support.

Peter Cordes