0

Say I have multiple different vectors of the same length

Example:

1: [1, 2, 3, 4]
2: [5, 6, 7, 8]
3: [3, 8, 9, 10]
4: [6, 9, 12, 3]

And I want to figure out the optimal integer coefficients for these vectors such that the sum of the vectors is closest to a respective specified goal vector.

Goal Vector: [55,101,115,60]

Assuming the combination only involves adding arrays together (no subtraction), how would I go about doing this? Are there any Python libraries (numpy, scikit, etc.) that would help me do this? I suspect that it is a linear algebra solution.

Example Combination Answer: [3, 3, 3, 1, 2, 4, 1, 1, 1, 2, 3, 4] where each of the values are one of those arrays. (This is just a random example)

wuerfelfreak
  • 2,363
  • 1
  • 14
  • 29
rkatakam
  • 51
  • 6
  • 1
    I would frame this as a vector addition problem. My advice would be to start with the same problem in one dimension, then two... That should give you some intuition for the problem which will make the way to code this much clearer. I certainly wouldn't jump straight into trying to code this. It'll also help you better define the problem. For example, is only addition allowed, or can an "vector" be subtracted? – SimonR Jan 02 '21 at 04:04
  • @SimonR Thank you for the response. I am assuming that only addition is allowed in this case. – rkatakam Jan 03 '21 at 06:42

1 Answers1

1

You could write your problem as a system of linear-equations:

arr1[0] + b*arr2[0] + c*arr3[0] + d*arr4[0] = res[0]
a*arr1[1] + b*arr2[1] + c*arr3[1] + d*arr4[1] = res[1]
a*arr1[2] + b*arr2[2] + c*arr3[2] + d*arr4[2] = res[2]
a*arr1[3] + b*arr2[3] + c*arr3[3] + d*arr4[3] = res[3]
#For all positive a,b,c,d.

Which you could then solve, if there is an exact solution.

If there is no exact solution, there is a scipy method to calculate the non-negative least squares solution to a linear matrix equation called scipy.optimize.nnls.

from scipy import optimize
import numpy as np

arr1 = [1, 2, 3, 4]
arr2 =  [5, 6, 7, 8]
arr3 = [3, 8, 9, 10]
arr4 = [6, 9, 12, 3] 

res = [55,101,115,60]

a = np.array([
    [arr1[0], arr2[0], arr3[0], arr4[0]],
    [arr1[1], arr2[1], arr3[1], arr4[1]],
    [arr1[2], arr2[2], arr3[2], arr4[2]],
    [arr1[3], arr2[3], arr3[3], arr4[3]]
])

solution,_ = optimize.nnls(a,res)

print(solution)

print('Coefficients before Rounding', solution)
solution = solution.round()
print('Coefficients after Rounding', solution)
print('Resuls', [arr1[i]*solution[0] + arr2[i]*solution[1] + arr3[i]*solution[2] + arr4[i]*solution[3] for i in range(4)])

This would print

Coefficients before Rounding [0.         0.1915493  3.83943662 6.98826291]
Coefficients after Rounding [0. 0. 4. 7.]
Resuls [54.0, 95.0, 120.0, 61.0]

Pretty close, isn't it?

It could indeed happen that this is not the perfect solution. But as discussed in this thread "integer problems are not even simple to solve" (@seberg)

wuerfelfreak
  • 2,363
  • 1
  • 14
  • 29