
I used Eigen to compute two matrix products: the first is A = (B*C).eval() and the second is D = (E*F).eval(). B, C, E and F all have the same size (1500 * 1500) but hold different values. The first product takes about 200 ms while the second takes about 6000 ms, and I have no idea why.

#include <iostream>
#include <stdio.h>   // getchar
#include <time.h>    // clock, clock_t, CLOCKS_PER_SEC

#include "Eigen/Dense"

int main() {
    clock_t start, stop;

    Eigen::MatrixXf mat_a(1200, 1500);
    Eigen::MatrixXf mat_b(1500, 1500);
    // Result of (1200 x 1500) * (1500 x 1500); Eigen would resize it on assignment anyway.
    Eigen::MatrixXf mat_r(1200, 1500);
    int i, j;
    float c = 0;
    for (i = 0; i < 1200; i++) {
        for (j = 0; j < 1500; j++) {
            mat_a(i, j) = (float)(c/3 * 1.0e-40);
            //if (i % 2 == 0 && j % 2 == 0) mat_a(i, j);
            c++;
        }
    }
    //std::cout << mat_a.row(0) << std::endl;
    c = 100;
    for (i = 0; i < 1500; i++) {
        for (j = 0; j < 1500; j++) {
            mat_b(i, j) = (float)(c/3 * 0.5e-10);
            c++;
        }
    }
    //std::cout << mat_b.row(0) << std::endl;

    start = clock();
    mat_r = mat_a * mat_b;
    stop = clock();
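    // clock() counts ticks of CPU time; convert to milliseconds before printing.
    std::cout << 1000.0 * (stop - start) / CLOCKS_PER_SEC << " ms" << std::endl;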
    getchar();
    return 0;
}

As the example code above shows, the slowdown depends on the values in the matrices: when mat_a holds values around 1e-40 and mat_b holds values around 1e-10, the problem occurs consistently.
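A quick way to check the magnitudes involved is to classify a few representative values. The sketch below is only an illustration: `is_subnormal` is a hypothetical helper name, and it relies on `std::fpclassify` from `<cmath>`. Values around 1e-40 lie below the smallest normal `float` (about 1.18e-38), while values around 1e-10 do not.

```cpp
#include <cmath>   // std::fpclassify, FP_SUBNORMAL

// Hypothetical helper: returns true when x is a subnormal (denormal) float,
// i.e. nonzero but smaller in magnitude than the smallest normal float.
bool is_subnormal(float x) {
    return std::fpclassify(x) == FP_SUBNORMAL;
}
```

Calling `is_subnormal(mat_a(0, 1))` after filling the matrices, or `is_subnormal(x * y)` on a product of one entry from each matrix, shows whether the inputs or the intermediate products fall into that range.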

Is there anyone who can explain it?

  • Please provide a bit more detail: it could be that E or F contains NaN/Inf values killing performance, or that your way of benchmarking is flawed. – ggael Jun 28 '17 at 08:54
  • I just added some code to reproduce this problem. By the way, there are no NaN/Inf values. – Jingyong Hou Jun 29 '17 at 02:40

1 Answer


This is because your matrices contain denormal (subnormal) numbers, which are slow for the CPU to handle. You should make sure that you are using reasonable units so that such values can be treated as zeros, and then enable the flush-to-zero (FTZ) and denormals-are-zero (DAZ) flags, for instance through the fast-math mode of your compiler or at runtime; see this SO question.
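For the runtime route, a minimal sketch on x86-64 could look like the following. It assumes the SSE control-register macros `_MM_SET_FLUSH_ZERO_MODE` and `_MM_SET_DENORMALS_ZERO_MODE` from `<xmmintrin.h>` and `<pmmintrin.h>`; the function name `enable_ftz_daz` and the macro `HAVE_FTZ_DAZ` are made up for this illustration, and on other architectures the function simply does nothing.

```cpp
#if defined(__x86_64__) || defined(_M_X64)
#include <xmmintrin.h>   // _MM_SET_FLUSH_ZERO_MODE
#include <pmmintrin.h>   // _MM_SET_DENORMALS_ZERO_MODE
#define HAVE_FTZ_DAZ 1   // hypothetical macro, used only in this sketch
#endif

// Hypothetical helper: asks the CPU to flush subnormal results to zero (FTZ)
// and to treat subnormal inputs as zero (DAZ) for SSE/AVX arithmetic.
// Returns true if the flags were set, false on unsupported targets.
bool enable_ftz_daz() {
#ifdef HAVE_FTZ_DAZ
    _MM_SET_FLUSH_ZERO_MODE(_MM_FLUSH_ZERO_ON);
    _MM_SET_DENORMALS_ZERO_MODE(_MM_DENORMALS_ZERO_ON);
    return true;
#else
    return false;
#endif
}
```

Call `enable_ftz_daz()` once at the start of `main`, before the timed `mat_a * mat_b`. These flags live in a per-thread control register, so a multi-threaded build would need the call in every worker thread as well. Results that would have been subnormal become exactly zero, which is only acceptable if such tiny values carry no meaning in your units.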

ggael