Let's say we have a "master" class with a method called "Bulk" that performs N iterations over a virtual method.
Many classes may override this virtual method, but only once each (there is no further overriding down the hierarchy). For performance reasons we have to minimize the cost of the call/vtable resolution as much as we can. (Example: 10+ Gb network packet generation.)
One of my ideas to solve this was to make the Bulk method virtual and "somehow" force it to be recompiled for each derived class, so that we would do only one vtable lookup instead of N and also get some improvements from inlining/SSE/etc. However, reading the ASM, all I get is a single generic "Bulk" method that again goes through the vtable N times.
Do you know any way to force that method to be recompiled (without copy-pasting its code into each derived class, of course), or any other way to reduce the calls and vtable lookups? I thought similar requirements would come up frequently, but I did not find anything...
Example code to play around with:
master.hpp
#pragma once
#include <string>
class master
{
public:
    // Calls once() n times and sums the results.
    virtual unsigned Bulk(unsigned n)
    {
        unsigned ret = 0;
        for (unsigned i = 0; i < n; ++i)
            ret += once();
        return ret;
    }
    virtual unsigned once() = 0;
};
derived1.hpp
#pragma once
#include "master.hpp"
class derived1 final: public master
{
    virtual inline unsigned once() final { return 7; }
};
derived2.hpp
#pragma once
#include "master.hpp"
class derived2 final: public master
{
    virtual inline unsigned once() final { return 5; }
};
main.cpp
#include "derived1.hpp"
#include "derived2.hpp"
#include <iostream>
using namespace std;
int main()
{
    derived1 d1;
    derived2 d2;
    cout << d1.Bulk(144) << endl;
    cout << d2.Bulk(144) << endl;
    return 0;
}
Compile command I'm using: g++ main.cpp -S -O3 --std=gnu++17
Compiled Bulk Loop:
movq 0(%rbp), %rax
movq %rbp, %rdi
call *8(%rax)
addl %eax, %r12d
subl $1, %ebx
jne .L2