I have read online (here and here) that inheritance doesn't affect the performance of a class. I became curious about this because I am writing a matrix module for a render engine, and the speed of this module is very important to me.
After writing the following classes:
- Base: general matrix class
- Derived from the base: square implementation
- Derived from derived: 3-dim and 4-dim implementations of the square matrix
I decided to test them and ran into performance issues with instantiation.
So the main questions are:
- What's the reason for these performance issues in my case, and why may they happen in general?
- Should I forget about inheritance in such cases?
This is what these classes look like in general:
template <class t>
class Matrix
{
protected:
    /** Dimensions; w/h and n/m are two names for the same pair of values */
    union {
        struct
        {
            unsigned int w, h;
        };
        struct
        {
            unsigned int n, m;
        };
    };
    /** Changes flow of accessing `v` array members */
    bool transposed;
    /** Matrix values array */
    t* v;
public:
    ~Matrix() {
        delete[] v;
    }
    Matrix() : v{}, transposed(false) {}
    // Copy
    Matrix(const Matrix<t>& m) : w(m.w), h(m.h), transposed(m.transposed) {
        v = new t[m.w * m.h];
        for (unsigned i = 0; i < m.g_length(); i++)
            v[i] = m.g_v()[i];
    }
    // Constructor from array
    Matrix(unsigned _w, unsigned _h, t _v[], bool _transposed = false) : w(_w), h(_h), transposed(_transposed) {
        v = new t[_w * _h];
        for (unsigned i = 0; i < _w * _h; i++)
            v[i] = _v[i];
    }
    /** Gets matrix array */
    inline t* g_v() const { return v; }
    /** Gets matrix values array size */
    inline unsigned g_length() const { return w * h; }
    // Other constructors, operators, and methods.
};
template<class t>
class SquareMatrix : public Matrix<t> {
public:
    SquareMatrix() : Matrix<t>() {}
    SquareMatrix(const Matrix<t>& m) : Matrix<t>(m) {}
    // _transpose defaults to false so the two-argument calls in the tests compile
    SquareMatrix(unsigned _s, t _v[], bool _transpose = false) : Matrix<t>(_s, _s, _v, _transpose) {}
    // Others...
};
template<class t>
class Matrix4 : public SquareMatrix<t> {
public:
    Matrix4() : SquareMatrix<t>() {}
    Matrix4(const Matrix<t>& m) : SquareMatrix<t>(m) {}
    // _transpose defaults to false so the one-argument calls in the tests compile
    Matrix4(t _v[16], bool _transpose = false) : SquareMatrix<t>(4, _v, _transpose) {}
    // Others...
};
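One quick sanity check on a hierarchy like this is whether deriving adds any per-object data. The sketch below uses simplified stand-in classes (`BaseSketch`, `SquareSketch`, `Square4Sketch` and `same_footprint` are hypothetical names, not part of matrix.h) and compares object sizes. It assumes, as in the classes shown, no virtual functions and no new data members in the derived classes; on mainstream compilers the sizes then match, although the standard does not strictly guarantee it.

```cpp
#include <type_traits>

// Simplified stand-ins for the classes above (hypothetical names).
template <class T>
struct BaseSketch {
    unsigned w = 0, h = 0;
    bool transposed = false;
    T* v = nullptr;
};

// Derived classes add constructors only in the original; here they add nothing.
template <class T>
struct SquareSketch : BaseSketch<T> {};

template <class T>
struct Square4Sketch : SquareSketch<T> {};

// True when Derived really derives from Base and occupies the same number of bytes.
template <class Base, class Derived>
constexpr bool same_footprint() {
    return std::is_base_of<Base, Derived>::value && sizeof(Base) == sizeof(Derived);
}
```

If `same_footprint<BaseSketch<long double>, Square4Sketch<long double>>()` is true, the objects themselves are no larger, so any timing difference would have to come from somewhere other than object size.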
To run the tests I used this helper:
#include <chrono>
#include <fstream>
#include <functional>

// Times one call of `callback` and appends the duration in microseconds, followed by `delim`, to `f`
void test(std::ofstream& f, char delim, std::function<void(void)> callback) {
    auto t1 = std::chrono::high_resolution_clock::now();
    callback();
    auto t2 = std::chrono::high_resolution_clock::now();
    f << std::chrono::duration_cast<std::chrono::microseconds>(t2 - t1).count() << delim;
    //std::cout << "test took " << std::chrono::duration_cast<std::chrono::microseconds>(t2 - t1).count() << " microseconds\n";
}
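For comparison, here is a minimal sketch of an alternative timing helper. The names `time_us` and `average_us` are hypothetical and not part of the original code. It assumes two things that may or may not matter for the numbers reported here: it uses `std::chrono::steady_clock`, which is guaranteed to be monotonic, and it takes the callable as a template parameter so that no `std::function` call sits inside the measured region.

```cpp
#include <chrono>
#include <cstdint>
#include <utility>

// Hypothetical helper: times a single call of `work` and returns microseconds.
template <class Callable>
std::int64_t time_us(Callable&& work) {
    const auto t1 = std::chrono::steady_clock::now();
    std::forward<Callable>(work)();
    const auto t2 = std::chrono::steady_clock::now();
    return std::chrono::duration_cast<std::chrono::microseconds>(t2 - t1).count();
}

// Hypothetical helper: calls `work` `runs` times and returns the mean duration
// in microseconds (0.0 when `runs` is not positive).
template <class Callable>
double average_us(int runs, Callable&& work) {
    std::int64_t total = 0;
    for (int k = 0; k < runs; ++k)
        total += time_us(work);
    return runs > 0 ? static_cast<double>(total) / runs : 0.0;
}
```

A call such as `average_us(100, [] { /* code under test */ })` would return the average directly instead of writing each sample to a file.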
Performance problems
With a single class initialization there are no problems: it takes under 5 microseconds almost every time for every class. But when I scaled up the number of initializations, several problems appeared.
I ran every test 100 times, with arrays of length 500.
1. Class initialization with the default constructor
I only tested the initialization of arrays.
The results were (avg time in microseconds):
- Matrix 25.19
- SquareMatrix 40.37 (37.60% loss)
- Matrix4 58.06 (30.47% loss from SquareMatrix)
Here we can already see a large difference.
Here's the code
int main(int argc, char** argv)
{
    std::ofstream f("test.csv");
    f << "Matrix\t" << "SquareMatrix\t" << "Matrix4\n";
    for (int k = 0; k < 100; k++) {
        test(f, '\t', []() {
            Matrix<long double>* a = new Matrix<long double>[500];
        });
        test(f, '\t', []() {
            SquareMatrix<long double>* a = new SquareMatrix<long double>[500];
        });
        test(f, '\n', []() {
            Matrix4<long double>* a = new Matrix4<long double>[500];
        });
    }
    f.close();
    return 0;
}
2. Class initialization with default constructor and filling
I tested initializing arrays of class instances and then filling them with custom matrices.
The results (avg time in microseconds):
- Matrix 402.8
- SquareMatrix 475 (15.20% loss)
- Matrix4 593.86 (20.01% loss from SquareMatrix)
Code
int main(int argc, char** argv)
{
    long double arr[16] = {
        1, 2, 3, 4,
        5, 6, 7, 8,
        9, 10, 11, 12,
        13, 14, 15, 16
    };
    std::ofstream f("test.csv");
    f << "Matrix\t" << "SquareMatrix\t" << "Matrix4\n";
    for (int k = 0; k < 100; k++) {
        test(f, '\t', [&arr]() {
            Matrix<long double>* a = new Matrix<long double>[500];
            for (int i = 0; i < 500; i++)
                a[i] = Matrix<long double>(4, 4, arr);
        });
        test(f, '\t', [&arr]() {
            SquareMatrix<long double>* a = new SquareMatrix<long double>[500];
            for (int i = 0; i < 500; i++)
                a[i] = SquareMatrix<long double>(4, arr);
        });
        test(f, '\n', [&arr]() {
            Matrix4<long double>* a = new Matrix4<long double>[500];
            for (int i = 0; i < 500; i++)
                a[i] = Matrix4<long double>(arr);
        });
    }
    f.close();
    return 0;
}
3. Filling vector with class instances
I pushed custom matrices onto a vector with push_back.
The results (avg time in microseconds):
- Matrix 4498.1
- SquareMatrix 4693.93 (4.17% loss)
- Matrix4 4960.12 (5.37% loss from SquareMatrix)
Code
#include <vector>

int main(int argc, char** argv)
{
    long double arr[16] = {
        1, 2, 3, 4,
        5, 6, 7, 8,
        9, 10, 11, 12,
        13, 14, 15, 16
    };
    std::ofstream f("test.csv");
    f << "Matrix\t" << "SquareMatrix\t" << "Matrix4\n";
    for (int k = 0; k < 100; k++) {
        test(f, '\t', [&arr]() {
            std::vector<Matrix<long double>> a;
            for (int i = 0; i < 500; i++)
                a.push_back(Matrix<long double>(4, 4, arr));
        });
        test(f, '\t', [&arr]() {
            std::vector<SquareMatrix<long double>> a;
            for (int i = 0; i < 500; i++)
                a.push_back(SquareMatrix<long double>(4, arr));
        });
        test(f, '\n', [&arr]() {
            std::vector<Matrix4<long double>> a;
            for (int i = 0; i < 500; i++)
                a.push_back(Matrix4<long double>(arr));
        });
    }
    f.close();
    return 0;
}
If you need the full source code, you can look here at matrix.h and matrix.cpp.