You can write a much faster brute-force implementation.
Firstly, you can use a matrix multiplication rather than many dot product by working on chunks of permutations. Matrix multiplication kernel are heavily optimized and thus run much faster than many dot product.
Moreover, you can partially pre-compute the permutations to speed up the computation even more by splitting permutations in two parts. The idea is to first build an index which contains all the permutation consisting in picking 5 element among the 12. Then, the idea is to find all the permutation of an array of 7 item (the indices and not the values themselves). Finally, all the permutations can be build from the two index.
Note that further optimizations are possible when the two above optimizations are applied together: one can compute the matrix multiplication more efficiently if one part of the permutation is constant.
The resulting algorithm is complex but much more efficient then the original one. Here is the code:
def computeOptim(A):
mini = 1000000
permValues = np.array([2433, 2057, 1935, 1927, 1870, 1841, 1818, 1770, 1680, 1497, 1435, 1289])
# Precompute partial permutations: high and low part of all the permutations.
loPerms = np.array(list(itertools.permutations(range(7))))
hiPerms = np.array(list(itertools.permutations(range(12), 5)))
# Iterate over chunks (of 7!=5040 permutations)
for hiPerm in hiPerms:
# Find the remaining index to include in the low-part permutations
loPermIndices = np.array(list(set(range(12))-set(hiPerm)))
# Find all the possible low-part permutations for the current
# high-part permutation by reindexing the values.
curLoPerms = loPermIndices[loPerms]
# Compute the chunks of possible x values
loPermValues = permValues[curLoPerms]
hiPermValues = permValues[hiPerm]
# A matrix multiplcation is used to compute many dot product in a row.
# Compute effciently B = A @ X with X the matrix containing all the permutations
hiB = A[:,:len(hiPermValues)] @ hiPermValues[None,:].T
loB = A[:,len(hiPermValues):] @ loPermValues.T
B = hiB + loB
multiCur = np.std(B, axis=0)
minPos = np.argmin(multiCur)
if multiCur[0,minPos] < mini:
mini = multiCur[0,minPos]
res = np.concatenate((hiPermValues, loPermValues[minPos]))
A = np.matrix('0,3,1,2,2,2,2,2,2,1,2,2;3,0,3,1,2,3,1,1,2,2,1,2;1,3,0,2,2,2,2,2,2,1,2,2;2,1,2,0,2,1,2,2,2,2,3,2;2,2,2,2,0,2,1,2,2,3,2,1;2,3,2,1,2,0,3,2,1,2,1,2;2,1,2,2,1,3,0,2,1,1,3,3;2,1,2,2,2,2,2,0,3,2,2,1;2,2,2,2,2,1,1,3,0,3,1,2;1,2,1,2,3,2,1,2,3,0,2,2;2,1,2,3,2,1,3,2,1,2,0,2;2,2,2,2,1,2,3,1,2,2,2,0')
computeOptim(A)
It succeed to find the optimal solution in 50 seconds on my machine while the original code would take roughly 5h30. As a result this code is about 400 faster.
The optimal solution found is:
mini = 291.80729942892106
res = [2433 1841 1289 1818 2057 1927 1770 1870 1497 1680 1935 1435]