edit: The orders-of-growth in this answer are needed in addition to the accepted answer in order to run CSE or matrix-chain multiplication
Interestingly, a compression algorithm may be what you want: a compression algorithm seeks to reduce the size of what it's compressing, and if the only way it can do that is substitution, you can trace it and obtain the necessary subcomponents for your algorithm. This may not give nice results though for small inputs.
What subsets of your operations are commutative will be an important consideration in choosing such an algorithm. [edit: OP says no operations are commutative in his/her situation]
We can also define an optimal solution, if we ignore effects such as caching:
input: [some product of matrices to compute]
given that multiplying two NxN matrices is O(N^2.376)
given we can visualize the product as follows:
[[AxB][BxC][CxD][DxE]...]
we must for example perform O(max(A,B,C)^2.376) or so operations in order to combine
[AxB][BxC] -> [AxC]
The max(...) is an estimate based on how fast it is to multiply two square matrices;
a better estimate of cost(A,B,C) for multiplying an AxB * BxC matrix can be gotten
from actually looking at the algorithm, or running benchmarks if you don't know the
algorithm used.
However note that multiplying the same matrix with itself, i.e. calculating
a power, can be much more efficient, and we also need to take that into account.
At worst, it takes log_2(power) multiplies each of O(N^2.376), but this could be
made more efficient by diagonalizing the matrix first.
There is the question about whether a greedy approach is feasible for not: whether one SHOULD compress repeating substrings at each step. This may not be the case, e.g.
aaaaabaab
compressing 'aa' results in ccabcb and compressing 'aab' is now impossible
However I have a hunch that, if we try all orders of compressing substrings, we will probably not run into this issue too often.
Thus having written down what we want (the costs) and considered possibly issues, we already have a brute-force algorithm which can do this, and it will run for very small numbers of matrices:
# pseudocode
def compress(problem, substring)
x = new Problem(problem)
x.string.replaceall(substring, newsymbol)
x.subcomputations += Subcomputation(newsymbol=substring)
def bestCompression(problem)
candidateCompressions = [compress(problem,substring) for each substring in problem.string]
# etc., recursively return problem with minimum cost
# dynamic programming may help make this more efficient, but one must watch
# out for the note above, how it may be hard to be greedy
Note: according to another answer by Asgeir, this is known as the Matrix Chain Multiplication optimization problem. Nick Fortescue notes this is also known more generally as http://en.wikipedia.org/wiki/Common_subexpression_elimination -- thus one could find any generic CSE or Matrix-Chain-Multiplication algorithm/library from the literature, and plug in the cost orders-of-magnitude I mentioned earlier (you will need those nomatter which solution you use). Note that the cost of the above calculations (multiplication, exponentiation, etc.) assume that they are being done efficiently with state-of-the-art algorithms; if this is not the case, replace the exponents with appropriate values which correspond to the way the operations will be carried out.