I need to compare two dgemm implementations at the assembly level. I noticed that the more efficient one uses registers named 'ymm0', 'ymm1', etc. What are these registers? Would using them change the efficiency of the code?
-
They're 256-bit SIMD vector registers. https://en.wikipedia.org/wiki/Advanced_Vector_Extensions. – Peter Cordes Apr 16 '20 at 03:14
-
See Agner Fog's asm optimization guide (https://agner.org/optimize/), specifically the SSE/AVX SIMD chapter, Intel's [x86 manuals](https://software.intel.com/en-us/articles/intel-sdm), and https://stackoverflow.com/tags/avx/info – Peter Cordes Apr 16 '20 at 03:42
-
Put simply: ymm registers are bigger than most other registers. With a single instruction, you can process 32 bytes of data instead of (more commonly) 32 bits. There are tricks, caveats, and limitations, but that's probably the heart of the speed difference. – David Wohlferd Apr 16 '20 at 05:45
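
To see where ymm registers come from, here is a minimal sketch of a dgemm-style kernel in plain C. The function names `axpy_row` and `dgemm_naive` are hypothetical and are not taken from either of the files in the question. A ymm register holds four doubles (256 / 64 = 4), so a compiler that is allowed to use AVX can typically handle four iterations of the inner loop with each vector instruction. Assuming GCC, compiling with something like `gcc -O3 -mavx2 -S` and searching the output for `ymm` should show this; without the AVX flag you would more likely see `xmm` (128-bit) registers. The exact instructions depend on the compiler, its version, and the flags.

```c
#include <assert.h>
#include <stddef.h>

/* c[j] += alpha * b[j] for j in [0, n).
 * This is the inner loop a compiler may vectorize with ymm registers.
 * 'restrict' promises that b and c do not overlap, which makes
 * vectorization easier for the compiler. */
void axpy_row(size_t n, double alpha,
              const double *restrict b, double *restrict c)
{
    assert(b != NULL && c != NULL);
    for (size_t j = 0; j < n; j++)
        c[j] += alpha * b[j];
}

/* C += A * B for square n x n row-major matrices (hypothetical helper).
 * The i-k-j loop order keeps the innermost loop walking contiguous memory. */
void dgemm_naive(size_t n, const double *A, const double *B, double *C)
{
    for (size_t i = 0; i < n; i++)
        for (size_t k = 0; k < n; k++)
            axpy_row(n, A[i * n + k], &B[k * n], &C[i * n]);
}
```

Comparing the assembly of this code built with and without AVX enabled is a small-scale version of the comparison in the question: the arithmetic is the same, but the AVX build can do it in fewer instructions.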