Suppose I have two 5000 x 1000 matrices, A and B. Will Octave compute trace(A*B') efficiently, i.e. using only the 5000 inner products that land on the diagonal, rather than all 5000*5000 inner products, most of which are never used?
And what if the argument to trace is more complicated, e.g. trace(A*B' + C*D')? Does that change anything?
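
To make the question concrete, here is a minimal sketch of the two ways of getting the same number. The small sizes and the variable names t_naive, t_rows, t2_naive and t2_rows are just placeholders for illustration, and it says nothing about what Octave does internally with the trace(...) calls; it only shows the row-wise form I would like the computation to be equivalent to.

```octave
% Small stand-ins for the 5000 x 1000 matrices in the question.
A = rand(50, 10);
B = rand(50, 10);
C = rand(50, 10);
D = rand(50, 10);

% Naive form: written as a full 50 x 50 product whose diagonal is then summed.
t_naive = trace(A*B');

% Row-wise form: entry (i,i) of A*B' is the inner product of row i of A
% with row i of B, so summing the elementwise product gives the trace
% without ever forming the 50 x 50 intermediate.
t_rows = sum(sum(A .* B));

% Same idea for the more complicated argument, since trace is linear.
t2_naive = trace(A*B' + C*D');
t2_rows  = sum(sum(A .* B)) + sum(sum(C .* D));
```

The two forms should agree up to floating-point rounding; the question is whether the trace(...) versions cost as little as the row-wise ones.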