Questions tagged [iaca]

IACA (the Intel Architecture Code Analyzer) is a static analysis tool made by Intel to help programmers schedule instructions optimally for modern Intel architecture processors, starting with Nehalem.

8 questions
62
votes
4 answers

Micro fusion and addressing modes

I have found something unexpected (to me) using the Intel® Architecture Code Analyzer (IACA). The following instruction using [base+index] addressing addps xmm1, xmmword ptr [rsi+rax*1] does not micro-fuse according to IACA. However, if I use…
Z boson
  • 32,619
  • 11
  • 123
  • 226
58
votes
1 answer

What is IACA and how do I use it?

I've found this interesting and powerful tool called IACA (the Intel Architecture Code Analyzer), but I have trouble understanding it. What can I do with it, what are its limitations and how can I: Use it to analyze code in C or C++? Use it to…
Iwillnotexist Idonotexist
  • 13,297
  • 4
  • 43
  • 66
6
votes
1 answer

Intel IACA analyzer alters assembly?

I wanted to run some code through the IACA analyzer to see how many uops it was using. I started with a simple function to see if it was working. Unfortunately, when I insert the macros IACA says to use, the resulting assembly was very different, and…
Froglegs
  • 1,095
  • 1
  • 11
  • 21
4
votes
1 answer

Intel Broadwell uop fusion for AVX load/store instructions

I'm trying to identify a performance baseline for memory-bound vectorized loops. I'm doing this on an Intel Broadwell chip with AVX2 instructions in a 32-byte-aligned environment. A baseline loop uses 8 YMM registers at a time to load from one…
Gavin Portwood
  • 1,217
  • 8
  • 9
3
votes
1 answer

Using IACA with non-assembly routine

I've been playing around with IACA (Intel's static code analyser). It works fine when testing with assembly snippets where I can input the magic marker bytes manually, like this: procedure TSlice.BitSwap(a, b: integer); asm //RCX = self //edx =…
Johan
  • 74,508
  • 24
  • 191
  • 319
2
votes
2 answers

Performance difference between two seemingly equivalent assembly codes

tl;dr: I have two functionally equivalent pieces of C code that I compile with Clang (the fact that it's C code doesn't matter much; only the assembly is interesting, I think), and IACA tells me that one should be faster, but I don't understand why, and my…
Dada
  • 6,313
  • 7
  • 24
  • 43
1
vote
0 answers

Why does my program with IACA markers compile but not when I compile to assembly first?

I'm trying to do some code profiling with Intel's IACA. I've used this Stack Overflow question to set up the markers. The problem I'm having is that if I use gcc and compile straight from the source to the binary, I'm fine. The program compiles and…
Patrick
  • 51
  • 6
0
votes
0 answers

How to generate IACA analysis report for a C program?

I would like to analyze the effect, if any, #pragma GCC unroll n has on a simple for-loop summation program in C. From my research, I learned of the IACA tool and have downloaded it but I am having a hard time getting an analysis report as it is…
Jin
  • 13
  • 2