I'm performing a decently complex contraction on some 3- and 4-dimensional tensors using numpy's einsum.
My actual code is
np.einsum('oij,imj,mjkn,lnk,plk->op',phi,B,Suu,B,phi)
This does what I want it to do.
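For context, here is a minimal self-contained version of the call (the shapes are made up for illustration; my real phi, B and Suu come from elsewhere). Note that B and phi each appear twice with different index labels, so their axes have to line up accordingly (i = l, m = n, j = k, o = p):

```python
import numpy as np

# Hypothetical small shapes, chosen only so the contraction is consistent.
o, i, j, m, n = 6, 3, 4, 5, 5
rng = np.random.default_rng(0)
phi = rng.standard_normal((o, i, j))      # used as 'oij' and (transposed roles) 'plk'
B   = rng.standard_normal((i, m, j))      # used as 'imj' and 'lnk'
Suu = rng.standard_normal((m, j, j, n))   # 'mjkn' with k the same size as j

result = np.einsum('oij,imj,mjkn,lnk,plk->op', phi, B, Suu, B, phi)
print(result.shape)  # (o, p)
```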
Using einsum_path, the result is:
>>> path = np.einsum_path('oij,imj,mjkn,lnk,plk->op',phi,B,Suu,B,phi)
>>> print(path[0])
['einsum_path', (0, 1), (0, 3), (0, 1), (0, 1)]
>>> print(path[1])
Complete contraction: oij,imj,mjkn,lnk,plk->op
Naive scaling: 8
Optimized scaling: 5
Naive FLOP count: 2.668e+07
Optimized FLOP count: 1.340e+05
Theoretical speedup: 199.136
Largest intermediate: 7.700e+02 elements
--------------------------------------------------------------------------
scaling current remaining
--------------------------------------------------------------------------
4 imj,oij->moj mjkn,lnk,plk,moj->op
5 moj,mjkn->nok lnk,plk,nok->op
4 plk,lnk->npk nok,npk->op
4 npk,nok->op op->op
This indicates a theoretical speedup of about 200x.
How can I use this result to speed up my code? How do I "implement" what einsum_path is telling me?
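My current guess, from reading the np.einsum docs, is that the path list gets passed back in through the optimize keyword, but I'm not sure whether that is all there is to it. A sketch of what I mean (shapes made up, as above):

```python
import numpy as np

# Hypothetical small shapes, just to make this self-contained.
o, i, j, m, n = 6, 3, 4, 5, 5
rng = np.random.default_rng(1)
phi = rng.standard_normal((o, i, j))
B   = rng.standard_normal((i, m, j))
Suu = rng.standard_normal((m, j, j, n))

subs = 'oij,imj,mjkn,lnk,plk->op'

# Precompute the contraction order once...
path = np.einsum_path(subs, phi, B, Suu, B, phi, optimize='optimal')[0]

# ...then reuse it on every call by handing it to optimize=.
fast = np.einsum(subs, phi, B, Suu, B, phi, optimize=path)

# Alternatively, optimize=True lets einsum find a (greedy) path itself.
also_fast = np.einsum(subs, phi, B, Suu, B, phi, optimize=True)
```

Is passing optimize=path the intended way to "implement" the einsum_path result, or does the speedup require restructuring the computation by hand?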