Performance: OOP vs processor cache and pipeline

Question

I recently watched a video explaining that allocating data sequentially is very good for performance. It also showed how an "OOP approach" can slow applications down. To me this seems similar in some way to this brilliant question. With this in mind, I concluded that calling a virtual method requires 2 indirections (which breaks caching if the function accesses object fields?) and one jump (which breaks pipelining?). Calling a non-virtual method requires 1 indirection (which still breaks caching?).
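To make the indirections being asked about concrete, here is a minimal C++ sketch. All type and function names (`Shape`, `Square`, `ManualSquare`, `ManualVTable`, `call_virtual`, `call_manual`, `call_direct`) are hypothetical and exist only for illustration. The hand-written vtable mirrors what typical implementations do; the C++ standard does not mandate this layout, so treat it as an assumption.

```cpp
#include <cassert>

// Hypothetical types for illustration only.
struct Shape {
    virtual ~Shape() = default;
    virtual int area() const = 0;   // resolved at run time via the vtable
    int id() const { return id_; }  // non-virtual: call target fixed at compile time
    int id_ = 7;
};

struct Square : Shape {
    explicit Square(int side) : side_(side) {}
    int area() const override { return side_ * side_; }
    int side_;
};

// Hand-written analogue of a typical vtable-based layout (an assumption,
// not something the language guarantees).
struct ManualSquare;
struct ManualVTable {
    int (*area)(const ManualSquare*);
};
struct ManualSquare {
    const ManualVTable* vptr;  // pointer stored inside the object
    int side;
};

int manual_square_area(const ManualSquare* s) { return s->side * s->side; }
const ManualVTable kSquareVTable{&manual_square_area};

// Virtual call: the compiler emits the loads spelled out in call_manual.
int call_virtual(const Shape& s) { return s.area(); }

int call_manual(const ManualSquare& s) {
    const ManualVTable* vt = s.vptr;  // load 1: vtable pointer from the object
    auto fn = vt->area;               // load 2: function pointer from the table
    return fn(&s);                    // indirect call through that pointer
}

// Non-virtual call: the target is known statically, so the call is direct.
// The only memory access is reading the field through the object pointer.
int call_direct(const Shape& s) { return s.id(); }
```

Under these assumptions, the "1 indirection" of a non-virtual call is just the field access through `this`, the same access a free function taking a pointer to a struct would perform.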

Nowadays, OOP is used extremely widely. Are there any attempts to optimize method calls (virtual and non-virtual) so that they are friendlier to the processor cache and pipeline? Perhaps runtimes (.NET, JVM, LLVM) can do some optimizations? Or perhaps modern processors are smart enough that we do not have to worry about that?

Branch prediction doesn't have anything to do with using the processor caches efficiently. A compacting garbage collector is a good way to keep the cache happy. Modern processors are already optimized to cache indirect call targets. — Hans Passant, Jun 02 '14 at 09:49
Indirection through a pointer that is located right beside the data you are operating on (i.e., the object)... — DevSolar, Jun 02 '14 at 09:54
@HansPassant, sorry for not stating it clearly enough. Branch prediction has to do with pipelining, which is why I mentioned it. Thanks for the idea of a compacting GC. Is there any "official" info or statistics on processors optimizing indirect call targets? — ironic, Jun 02 '14 at 10:13
While a program is actually running, a virtual call site might always end up calling the same method. A JIT optimizer like the one in HotSpot can then generate non-virtual call code at runtime, with cheap guards to bail out of the optimization if it becomes invalid. A non-virtual call is no different from a normal function call and can be optimized in the same way. — Esailija, Jun 02 '14 at 10:14
@Esailija, but that would mean that the optimizer would have to track function calls to know which functions are candidates for such an optimization. Is there an 'acknowledged' strategy for doing that? — ironic, Jun 02 '14 at 10:19
@ironic yes, it's called [inline caching](http://en.wikipedia.org/wiki/Inline_caching) which is also the backbone of dynamic language performance. See also https://wikis.oracle.com/display/HotSpotInternals/PerformanceTechniques#PerformanceTechniques-Methods — Esailija, Jun 02 '14 at 10:21
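The guarded call described in the comments above can be imitated by hand in C++. This is only a rough sketch of the idea: the names (`Animal`, `Dog`, `Bird`, `legs_virtual`, `legs_guarded`) are hypothetical, and using `typeid` as the guard is an assumption made for readability. A real JIT would typically compare a class pointer in the object header and pick the expected type from runtime profiling rather than having it hard-coded.

```cpp
#include <cassert>
#include <typeinfo>

// Hypothetical class hierarchy for illustration only.
struct Animal {
    virtual ~Animal() = default;
    virtual int legs() const = 0;
};
struct Dog final : Animal {
    int legs() const override { return 4; }
};
struct Bird final : Animal {
    int legs() const override { return 2; }
};

// The call site as written: always dispatched through the vtable.
int legs_virtual(const Animal& a) { return a.legs(); }

// Hand-written analogue of a monomorphic guarded call. It assumes that
// profiling showed this call site almost always receives a Dog.
int legs_guarded(const Animal& a) {
    if (typeid(a) == typeid(Dog)) {
        // Guard passed: the qualified call Dog::legs() is bound statically,
        // so the compiler is free to inline it.
        return static_cast<const Dog&>(a).Dog::legs();
    }
    // Guard failed: fall back to ordinary virtual dispatch.
    return a.legs();
}
```

Both functions return the same result for every input; the guarded version only changes how the common case is reached. For example, `legs_guarded(Dog{})` takes the fast path, while `legs_guarded(Bird{})` falls through to the virtual call.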
I understand that putting 'opinion-generating' questions on hold is needed to avoid flame wars. There are really not many opinions on this question so far, but holding it prevents someone from posting a nice answer. I still hope that there is a person who has done a statistical investigation of this subject. Also, as a 'cry from my heart': there are tons of extremely general questions here on SO, so why is this one bad for you??!! — ironic, Jun 02 '14 at 10:30
Yes, I was disappointed about the closure too (especially about "opinion based", which is ridiculously wrong). I had written my answer already :| — Esailija, Jun 02 '14 at 10:33
Is this answer about OOP in general, or virtualized OO languages only? Because I see a definite bias in the comments so far... — DevSolar, Jun 02 '14 at 11:21
My universal response to this kind of question is: 1) everything depends on the proportion of time spent doing the activity in question, and 2) it only makes sense to ask this in comparison to an alternative that does the same job. More specifically, in my experience, unless OOP is used with great discipline, it encourages people to make excessively complex designs, which for that reason suffer in performance. — Mike Dunlavey, Jun 02 '14 at 11:41
