0

Suppose I have Q vectors as follows:

A_1 = [x_1,x_2,...,x_m];
A_2 = [y_1,y_2,...,y_n];
.
.
.
A_Q = [z_1,z_2,...,z_q];

Basically, m,n,q are positive integers and not necessarily equivalent. Each vector element x,y,z is a number between (0,1). One possible combination of these vectors can be:

B_1 = [x_1,y_1,...,z_1];

In each combination, exactly one element of each vector should be included. Therefore, we can have m*n*...*q different combinations. Now, I want to find those combinations such that mean of their elements equals to a specific number W, i.e, mean(B_1)=W. Note that W is a number between (0,1). An inefficient way is to compute all permutations and then compare W with the mean of their elements. However, as Q,m,n,q increase, the computer (MATLAB) can't handle it, which is obvious.

Would you please help me find an efficient approach to solve this problem in a reasonable time?

One can say that there may not be a vector, where the mean of its elements equals to W. Yes, that's true. Then, is it possible to find combinations such that the mean of their elements is between (W-T,W+T), where T is between (0,1)?

Many thanks in advance.

user1952500
  • 6,611
  • 3
  • 24
  • 37
Armin
  • 11
  • 1
  • 1
  • 3
  • do you need at most one element from each vector in the result, or exactly one ? – gdelab Apr 07 '17 at 13:35
  • Iff your combinations consists of exactly one element from each vector, you can exclude all elements with values `> Q * T`. – Paul Brodersen Apr 07 '17 at 13:41
  • Also, are your elements values continuous or discrete? – Paul Brodersen Apr 07 '17 at 13:43
  • @gdelab, exactly one. I edited the text. – Armin Apr 07 '17 at 14:20
  • @Paul, if `Q=100`, and `T=0.01`, then, `Q*T=1`. But, there is no element greater than `1` since all elements are between `(0,1)`. Also, another point is how to choose `T`. Am I right? BTW, elements values are continuous. – Armin Apr 07 '17 at 14:25
  • what is the range of `m,n,q` we talking about ? – Some Guy Apr 07 '17 at 14:48
  • @SomeGuy, `m,n,q` are integer and `>1`. I mean they can have different arbitrary values over time in my problem. – Armin Apr 07 '17 at 14:51
  • 1
    "`m,n,q` are integer and `>1`" That doesn't help us understand the potential size of the problem. Have you tried [this approach](http://stackoverflow.com/a/21895344/1377097)? We can generate the combinations iteratively, but it's much slower. – beaker Apr 07 '17 at 15:12
  • You could improve a little bit from the brute force by starting from both sides: you get all (Q/2)-tuples from the Q/2 first vectors, compute their sums, you do the same with the last Q/2 vectors, and then you check in linear time (if they are sorted, which you can probably arrange), if not you can sort them then; linear in their size, so something like O(n^(Q/2) ) if some or all tuples work... It's still very long, but about square root of the simple brute force (which is already very much faster if the data is big) – gdelab Apr 07 '17 at 15:13
  • @beaker, Yes I tried before. It doesn't work if we have, for example, 100 vectors and each with 2 elements. Number of combinations is `2^100` and it's not possible to handle it. – Armin Apr 07 '17 at 15:28
  • Okay, that at least gives me *some* idea of the magnitudes. I can suggest an iterative approach that backtracks when it reaches impossible configurations, but I'm afraid it will be terribly slow. Let me try some things out and I'll get back to you. – beaker Apr 07 '17 at 15:34
  • Sorry, I tried stripping down my code as much as I could and I'm still getting execution times of several minutes per million combinations. I'm using Octave, so your timings may faster, but that's still a long time to wait. Is there any way you can avoid having to exhaustively generate all combinations? – beaker Apr 07 '17 at 19:00

0 Answers0