My question is basically stated completely in the title, but let me elaborate.
Question:
Perhaps worth rephrasing: how complicated or simple does a virtual method have to be before the mechanism becomes a considerable overhead? Are there any rules of thumb for this? E.g., if the method takes 10 minutes, uses I/O, has complex `if` statements, does memory operations, etc., it's not a problem. But if you write `virtual double get_r() { return sqrt(x*x + y*y); }` and call it in a tight loop, you will have trouble.
I hope the question is not too general, as I seek general but concrete technical answers: either it's hard/impossible to tell, or something like "a virtual call costs roughly this much time/cycles, the math costs this, I/O costs that". Maybe some technical people know general numbers to compare, or have done such an analysis and can share their conclusions. Embarrassingly, I don't know how to do those fancy asm analyses =/.
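For concreteness, this is the kind of loop I have in mind, sketched as a naive micro-benchmark (the `Shape`/`Point2D` classes, iteration count, and timing approach are all made up for illustration, not taken from my real code):

```cpp
#include <chrono>
#include <cmath>
#include <cstdio>

struct Shape {
    virtual ~Shape() = default;
    virtual double get_r() const = 0;
};

struct Point2D : Shape {
    double x = 1.0, y = 2.0;
    double get_r() const override { return std::sqrt(x * x + y * y); }
};

int main() {
    Point2D p;
    Shape* s = &p;  // call through a base pointer to get dynamic dispatch
    const long n = 100000000;

    auto t0 = std::chrono::steady_clock::now();
    double sum = 0.0;
    for (long i = 0; i < n; ++i)
        sum += s->get_r();  // the virtual call in a tight loop
    auto t1 = std::chrono::steady_clock::now();

    // NOTE: an optimizer that can see the dynamic type here may devirtualize
    // the call, which is exactly why I find such measurements hard to trust.
    std::printf("sum=%g, %lld ms\n", sum,
                (long long)std::chrono::duration_cast<std::chrono::milliseconds>(t1 - t0).count());
}
```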
I would also like to give some rationale behind it, as well as my use-case.
I think I have seen more than a few questions where people refrain from using virtuals like open fire in a forest during a drought, for the sake of performance, and as many individuals asking them, "Are you absolutely sure that virtual overhead is really an issue in your case?"
In my recent work I ran into a problem which, I believe, can be placed on either side of that river.
Also bear in mind that I'm not asking how to improve the implementation of the interface; I believe I know how to do that. I'm asking whether it's possible to tell when to do it, or which design to choose right off the bat.
Use-case:
I run some simulations. I have a class which basically provides a run environment: a base class, and more than one derived class that define different workflows. The base collects common logic and assigns I/O sources and sinks. Derivatives define particular workflows, more or less by implementing `RunEnv::run()`. I think this is a valid design. Now let's imagine that the objects subject to the workflow can be placed in 2D or in 3D. The workflows are common/interchangeable in both cases, so the objects we are working on can have a common interface, albeit one with very simple methods like `Object::get_r()`. On top of that, let's have some stat logger defined for the environment.
Originally I wanted to provide some code snippets, but it ended up being 5 classes with 2-4 methods each, i.e. a wall of code. I can post it on request, but it would double the length of this question.
Key points are: `RunEnv::run()` is the main loop, usually very long-running (5 min to 5 h). It provides basic time instrumentation and calls `RunEnv::process_iteration()` and `RunEnv::log_stats()`. All are virtual. The rationale: I can derive from `RunEnv` and redesign `run()`, for example for different stop conditions. I can redesign `process_iteration()`, for example to use multi-threading if I have to process a pool of objects, or to process them in various ways. Also, different workflows will want to log different stats. `RunEnv::log_stats()` is just a call that writes already-computed stats of interest to a `std::ostream`. I guess using virtuals here has no real impact.
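A heavily condensed sketch of what I mean (real bodies and members elided; `stop_condition()` is a hypothetical hook I added just so the sketch is self-contained):

```cpp
#include <iostream>

class RunEnv {
public:
    virtual ~RunEnv() = default;

    // The main loop with basic time instrumentation;
    // overridable e.g. for different stop conditions.
    virtual void run() {
        while (!stop_condition()) {
            process_iteration();
            log_stats(std::cout);
        }
    }

protected:
    virtual bool stop_condition() const = 0;       // hypothetical, for the sketch
    virtual void process_iteration() = 0;          // e.g. multi-threaded in one derivative
    virtual void log_stats(std::ostream& os) = 0;  // workflow-specific stats output
};
```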
Now let's say each iteration works by calculating the distance of objects to the origin, so we have `double Obj::get_r();` as the interface. There are `Obj` implementations for the 2D and 3D cases. In both cases the getter is simple math with 2-3 multiplications and additions.
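Roughly like this (class names invented for the sketch):

```cpp
#include <cmath>

struct Obj {
    virtual ~Obj() = default;
    virtual double get_r() const = 0;  // distance to the origin
};

struct Obj2D : Obj {
    double x, y;
    double get_r() const override { return std::sqrt(x*x + y*y); }
};

struct Obj3D : Obj {
    double x, y, z;
    double get_r() const override { return std::sqrt(x*x + y*y + z*z); }
};
```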
I also experimented with different memory handling. E.g., sometimes the coordinate data was stored in private member variables and sometimes in a shared pool, so even `get_x()` could be made virtual, with implementations like `get_x() { return x; }` or `get_x() { return pool[my_num*dim + x_offset]; }`. Now imagine calculating something with `get_r() { return sqrt(get_x()*get_x() + get_y()*get_y()); }`. I suspect virtuality here would kill performance.
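In other words, the case I'm worried about looks roughly like this (`pool`, `my_num`, `dim`, and the offsets are placeholders for whatever the shared storage really looks like):

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

struct Obj {
    virtual ~Obj() = default;
    virtual double get_x() const = 0;
    virtual double get_y() const = 0;
    // Two virtual calls per invocation of get_r().
    double get_r() const { return std::sqrt(get_x()*get_x() + get_y()*get_y()); }
};

// Coordinates stored in the object itself.
struct LocalObj : Obj {
    double x = 0.0, y = 0.0;
    double get_x() const override { return x; }
    double get_y() const override { return y; }
};

// Coordinates stored in a shared pool.
struct PooledObj : Obj {
    std::vector<double>* pool;  // shared storage, owned elsewhere
    std::size_t my_num;         // this object's index in the pool
    static constexpr std::size_t dim = 2, x_offset = 0, y_offset = 1;
    double get_x() const override { return (*pool)[my_num * dim + x_offset]; }
    double get_y() const override { return (*pool)[my_num * dim + y_offset]; }
};
```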