I'm wondering whether it is possible to determine, for a specific processor model, how long an instruction takes to execute, and how much this value can vary due to hardware or architectural properties such as pipelining or the processor's state (e.g. the preceding or following instructions).
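A minimal sketch of what such a measurement could look like (assumptions not in the question: x86-64, GCC or Clang inline asm; `imul`, the iteration count, and the use of `__rdtsc()` are illustrative choices): chain N copies of an instruction so each one depends on the previous result, and divide the elapsed TSC ticks by N to approximate that instruction's latency.

```c
/* Hypothetical microbenchmark: N back-to-back imuls, each depending on
 * the previous result, timed with the TSC. */
#include <stdint.h>
#include <stdio.h>
#include <x86intrin.h>   /* __rdtsc() */

int main(void) {
    const int N = 100000000;
    uint64_t x = 1;

    uint64_t start = __rdtsc();
    for (int i = 0; i < N; i++) {
        /* "+r" ties each imul's input to the previous output, so the
         * loop is serialized on imul's latency. */
        __asm__ volatile("imul $3, %0" : "+r"(x));
    }
    uint64_t end = __rdtsc();

    /* Caveat: rdtsc counts reference cycles, not core clock cycles,
     * so frequency scaling skews the per-instruction figure. */
    printf("~%.2f reference cycles per imul (x = %llu)\n",
           (double)(end - start) / N, (unsigned long long)x);
    return 0;
}
```

On many recent x86 cores this lands near imul's documented 3-cycle latency, scaled by the ratio of core to TSC frequency.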
- Basically zero (except for mem access), if you're talking about its actual latency as part of a long dependency chain. Depending on your definition, you might count imperfect scheduling against the throughput of a uop, but that's really a matter of scheduling, not of taking a different time when dispatched to an execution port. See https://uops.info/, https://agner.org/optimize/, [What considerations go into predicting latency for operations on modern superscalar processors and how can I calculate them by hand?](https://stackoverflow.com/q/51607391) and [this answer](https://stackoverflow.com/a/44980899). – Peter Cordes Mar 29 '21 at 20:32
- The key point is that there isn't a single cost number you add up for each instruction; the cost of each instruction has multiple dimensions (latency, ports, and front-end cost), so the actual bottleneck for a loop could be one of a few different things. – Peter Cordes Mar 29 '21 at 20:38
- It also depends on what you mean by 'time'. Do you mean core clock cycles, or do you mean wall-clock time? – Andreas Abel Mar 30 '21 at 14:13
- @AndreasAbel Real, wall-clock time – 2080 Mar 30 '21 at 14:43
- Then the timing variance also depends on things like turbo boost or frequency scaling. – Andreas Abel Mar 30 '21 at 15:07
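To illustrate Peter Cordes's comments above, here is a hedged sketch (same assumptions as the first one: x86-64, GCC/Clang, `__rdtsc()`) contrasting a serial dependency chain, which is bound by latency, with several independent chains, which are bound by throughput. The same `imul` instruction yields two different per-instruction costs, which is exactly the "multiple dimensions" point.

```c
/* Hypothetical contrast: the "cost" of imul depends on whether its
 * inputs form one dependency chain or several independent ones. */
#include <stdint.h>
#include <stdio.h>
#include <x86intrin.h>

#define N 100000000

static double per_iter(uint64_t start, uint64_t end) {
    return (double)(end - start) / N;
}

int main(void) {
    uint64_t a = 1, b = 1, c = 1, d = 1;
    uint64_t t0, t1;

    /* Latency-bound: every imul waits for the previous one's result. */
    t0 = __rdtsc();
    for (int i = 0; i < N; i++)
        __asm__ volatile("imul $3, %0" : "+r"(a));
    t1 = __rdtsc();
    printf("serial chain:    ~%.2f cycles/imul\n", per_iter(t0, t1));

    /* Throughput-bound: four independent chains overlap in the
     * out-of-order pipeline, so the per-imul cost approaches the
     * instruction's reciprocal throughput instead of its latency. */
    t0 = __rdtsc();
    for (int i = 0; i < N; i++) {
        __asm__ volatile("imul $3, %0" : "+r"(a));
        __asm__ volatile("imul $3, %0" : "+r"(b));
        __asm__ volatile("imul $3, %0" : "+r"(c));
        __asm__ volatile("imul $3, %0" : "+r"(d));
    }
    t1 = __rdtsc();
    printf("4 indep. chains: ~%.2f cycles/imul\n", per_iter(t0, t1) / 4);
    return 0;
}
```

With four independent chains the per-imul figure drops toward imul's reciprocal throughput (about 1 per cycle on many recent cores), far below its ~3-cycle latency.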
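And to illustrate Andreas Abel's point about wall-clock time: a Linux-only sketch (`perf_event_open` and `PERF_COUNT_HW_CPU_CYCLES` are real kernel APIs, but this particular setup is one possible configuration, not from the thread) that measures the same chain in core clock cycles and in nanoseconds. The cycles-per-imul figure should stay roughly stable while the nanoseconds figure moves with turbo boost and frequency scaling.

```c
/* Hypothetical comparison of core cycles vs. wall-clock time for the
 * same dependency chain (Linux only). */
#define _GNU_SOURCE
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <time.h>
#include <unistd.h>
#include <sys/syscall.h>
#include <linux/perf_event.h>

static int perf_open(void) {
    struct perf_event_attr attr;
    memset(&attr, 0, sizeof(attr));
    attr.type = PERF_TYPE_HARDWARE;
    attr.size = sizeof(attr);
    attr.config = PERF_COUNT_HW_CPU_CYCLES;  /* core cycles, not TSC */
    attr.exclude_kernel = 1;
    attr.exclude_hv = 1;
    return (int)syscall(SYS_perf_event_open, &attr, 0, -1, -1, 0);
}

int main(void) {
    const int N = 100000000;
    uint64_t x = 1, cyc0, cyc1;
    struct timespec t0, t1;
    int fd = perf_open();
    if (fd < 0) { perror("perf_event_open"); return 1; }

    read(fd, &cyc0, sizeof(cyc0));           /* counter value is a u64 */
    clock_gettime(CLOCK_MONOTONIC, &t0);
    for (int i = 0; i < N; i++)
        __asm__ volatile("imul $3, %0" : "+r"(x));
    clock_gettime(CLOCK_MONOTONIC, &t1);
    read(fd, &cyc1, sizeof(cyc1));

    double ns = (t1.tv_sec - t0.tv_sec) * 1e9 + (t1.tv_nsec - t0.tv_nsec);
    printf("~%.2f core cycles/imul, ~%.2f ns/imul (x = %llu)\n",
           (double)(cyc1 - cyc0) / N, ns / N, (unsigned long long)x);
    return 0;
}
```

If `perf_event_open` fails with a permission error, check `/proc/sys/kernel/perf_event_paranoid`; unprivileged access to hardware counters is often restricted.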