I have converted this Python einsum expression

    psi_p = np.einsum('ij...,j...->i...', exp_p, psi_p)

to C++ like this:
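
    int io = 0;  // running index into the flattened 3D operator m_opK
    for (i = 0; i < 4; i++) {
        ikauxop = i * nd;      // start of row i in m_auxop
        for (j = 0; j < 4; j++) {
            jkpsi = nd * j;    // start of row j in m_wf
            for (k = 0; k < m_N; k++) {
                m_auxop[ikauxop + k] += m_opK[io++] * m_wf[jkpsi + k];
            }
        }
    }
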
But the Python version is 2 times faster than the C++ one.
m_auxop and m_wf are 2D arrays flattened to 1D, and m_opK is a 3D array flattened to 1D. How can I speed this up in C++?
The array element type is std::complex, and I tried both flattened and non-flattened arrays and got the same timing.
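
In case it helps to reproduce the problem, here is a minimal self-contained sketch of the same loop. It rests on a few assumptions that are not stated above: the element type is `std::complex<double>`, `nd` equals `m_N`, and all arrays are flattened in row-major (NumPy default) order. The names `apply_op`, `op`, `wf`, `n` and `cplx` are hypothetical and only used for this illustration.

```cpp
#include <cassert>
#include <complex>
#include <cstddef>
#include <vector>

// Assumption: double precision; the question only says std::complex.
using cplx = std::complex<double>;

// Computes out[i][k] = sum_j op[i][j][k] * wf[j][k] for i, j in 0..3,
// i.e. np.einsum('ij...,j...->i...', op, wf) with one trailing axis of length n.
// op holds 4*4*n elements, wf and the result hold 4*n elements, all row-major.
std::vector<cplx> apply_op(const std::vector<cplx>& op,
                           const std::vector<cplx>& wf,
                           std::size_t n) {
    assert(op.size() == 16 * n && wf.size() == 4 * n);
    std::vector<cplx> out(4 * n, cplx(0.0, 0.0));  // accumulator starts at zero
    std::size_t io = 0;  // running index into op, visits (i, j, k) in order
    for (std::size_t i = 0; i < 4; ++i) {
        for (std::size_t j = 0; j < 4; ++j) {
            for (std::size_t k = 0; k < n; ++k) {
                out[i * n + k] += op[io++] * wf[j * n + k];
            }
        }
    }
    return out;
}
```

Unlike my original loop, this version writes into a fresh zero-initialized buffer, so it does not depend on `m_auxop` having been cleared beforehand.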