Yes, this is common. For example, bignums multiplication can be done in several ways:
- naive
O(N^2)
- Karatsuba
O((N^1.585)
- Schönhage–Strassen
O(N*log(N)*(log(log(N))))
- and there are also more advanced slightly faster algorithms now
So based on input variables used, the bitwidth fastest version is used (I use this for performance-critical "low-level" math operations on bignums all the time, because they are used as a building block for higher operations and without it it would not work with optimal speed on "whole/practical" number ranges).
However, the thresholds depends on computing hardware architecture, used compiler and sometimes even code usage, so they might differ on a per computer basis, so to ensure best performance, the thresholds are sometimes measured at program startup or configuration phase instead of using hardcoded values.
This is usually used on functions that has a huge variety of input sizes and also not on trivial functions, because the initial if
statements that selects between algorithms is also performance hit (branch). In some "rare" cases I think you can do this branchlessly. For example, if the input size is also the input parameter of a template or maybe even a macro, for example, like in here (C++):
double matrix<1>::det() { return a[0][0]; }
double matrix<2>::det() { return (a[0][0]*a[1][1])-(a[0][1]*a[1][0]); }
template <DWORD N> double matrix<N>::det()
{
double d=0.0; int j;
matrix<N-1> m;
for (j=0;j<N;j++)
{
m=submatrix(0,j);
if (int(j&1)==0) d+=a[0][j]*m.det();
else d-=a[0][j]*m.det();
}
return d;
}
As you can see, there are three different methods for the determinant based on input size, but no branches for the selection. However, only hardcoded thresholds can be used.
You can also achieve this with function pointers, for example, (C++):
int algoritm1(int x){ return 10; }
int algoritm2(int x){ return 20; }
int algoritm3(int x){ return 30; }
int (*functions[4])(int x)={ algoritm1,algoritm2,algoritm3,algoritm3 };
int function(int x)
{
return functions[(x>>10)&3](x);
}
So for x
up to 1024 using algoritm1, up to 2048 using algorithm2 and for the rest, algorithm3 without any branches too. Here you can have dynamic thresholds (not that flexible, but still usable), so you can sample your x range by some power of 2 (like I did 2^10=1024) and just use duplicates. So, for example, if rounded thresholds are 3*1024
and 5*1024
, you can do this:
int algoritm1(int x){ return 10; }
int algoritm2(int x){ return 20; }
int algoritm3(int x){ return 30; }
int (*functions[8])(int x)=
{
algoritm1,
algoritm1,
algoritm1,
algoritm2,
algoritm2,
algoritm3,
algoritm3,
algoritm3
};
int function(int x)
{
return functions[(x>>10)&7](x);
}
So you can create the function pointers array based on measured threshold at runtime too with this approach...