
I have a set I = {P1, P2, ..., Pm}, and n finite subsets of I, denoted by R1, R2, ..., Rn, as follows:

R1 = {P1, P2}

R2 = {P2, P4}

R3 = {P2, P3, P4}

R4 = {P1, P2, P4}

....

where Pi denotes an integer.

For each Ri, I want to compute the product of all its elements. My objective is to use as few multiplications and divisions as possible by sharing common parts among the Ri (i=1,2,...,n).

For example, if I can compute P2*P4 first, then this result can be used in computing the product of all elements for R3 and R4.

Note that division is also allowed. For example, I can compute A=P1*P2*P3*P4 first, then use A/P1 to compute the product of all elements for R3, and use A/P3 for R4.
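
To make the operation counting concrete, here is a minimal Python sketch of both kinds of sharing on the four example subsets above; the numeric values assigned to the Pi are arbitrary placeholders chosen only for illustration:

```python
# Illustrative only: arbitrary distinct prime values for the Pi.
P1, P2, P3, P4 = 2, 3, 5, 7

# Sharing a common part: P2*P4 serves R2, R3 and R4.
M = P2 * P4            # 1 multiplication; M is already the product for R2
prod_R1 = P1 * P2      # 1 multiplication
prod_R3 = M * P3       # 1 multiplication
prod_R4 = M * P1       # 1 multiplication  -> 4 operations for all four sets

# Sharing via division: build A = P1*P2*P3*P4 once, then divide.
A = prod_R4 * P3
assert A // P1 == prod_R3      # product for R3
assert A // P3 == prod_R4      # product for R4
```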

If I want to use the minimal number of multiplications and divisions to compute all the products for every subset of I, is it a set-cover problem? Is it NP-complete? BTW, could you give an integer linear programming formulation to describe it?

Any suggestions will be highly appreciated!

community edit: Additional assumptions:

  • divisions are allowed, same cost as multiplications
  • no repeated elements in a given set, e.g. R5 = {P1, P1, P1, P2} is disallowed
  • Yeah, it is allowed. For example, I can compute A=P1*P2*P3*P4 first, then use A/P1 to compute the products of all elements for R3, and use A/P3 for R4. – John Smith May 30 '12 at 16:34
  • Interesting problem. It seems like there ought to have been a standard solution worked out decades ago, but the closest I can find is articles on optimizing the order of matrix multiplications. Even without division, I don't think this is set covering (at least naively), because P1*P2 and P2*P4 together cover R4, but don't yield the right product. – Ted Hopp May 30 '12 at 16:38
  • I assume divisions count as a multiplication? – ninjagecko May 30 '12 at 16:45
  • Sure! Take each of them as an operation, requiring one flop. – John Smith May 30 '12 at 16:47
  • Can elements be repeated, e.g. `R5 = {P1, P1, P1, P2}`? – ninjagecko May 30 '12 at 16:49
  • No. In each set, we assume no duplicates. – John Smith May 30 '12 at 16:53
  • Do you consider the time to calculate the minimal operations as part of the cost, or do you not care (e.g. precomputable)? Are you assuming parallel or non-parallel hardware? – ninjagecko May 30 '12 at 16:58
  • Currently, this issue is based on non-parallel hardware. Firstly, regardless of time, I want to minimize the number of multiplications and divisions. As a next step, if possible, I will be interested in minimizing the time (e.g. by precomputation). They are two closely related issues. – John Smith May 30 '12 at 17:02
  • The problem is much more difficult if the Pi are not prime numbers. Are the Pi prime numbers? – Ricky Bobby May 30 '12 at 17:36

3 Answers


First, you don't need division

Division is not needed here; it will always be an extra calculation.

If you need to divide, it means you have already done the multiplication; by dividing you undo work you have already done.

For example if you calculate Pi*Pj*Pk by dividing Pi*Pj*Pk*Pn by Pn, it means you calculated Pi*Pj*Pk*Pn before and therefore you calculated Pi*Pj*Pk before.

(I am assuming all your numbers are prime numbers.)

My solution

I have an idea which doesn't take into account the possibility of dividing.

You can start by building a suffix tree from your Ri.

Then the number of multiplications is the number of edges in the tree.

Example :

Using your example, the suffix tree would be:

       P2
      / \
     P4 P1
    / \ 
   P3 P1

You get 4 multiplications:

M1=P1*P2

M2=P2*P4

M3=M2*P1

M4=M2*P3

Finding the minimum number of multiplications is equivalent to building the suffix tree with the minimum number of edges.
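
A minimal Python sketch of the edge-counting idea: insert every Ri into a trie after sorting its elements by a fixed global order. The "most frequent element first" order used here is only a greedy heuristic, not a method for finding the minimum-edge tree, but it reproduces the 4 multiplications of the example:

```python
from collections import Counter

def count_multiplications(subsets):
    """Insert each subset into a trie using a shared element order
    (most frequent element first, a greedy heuristic).  The first element
    on a branch is free (it is just a single Pi); every further new edge
    multiplies one more element into an existing partial product and
    costs one operation."""
    freq = Counter(p for s in subsets for p in s)
    order = lambda p: (-freq[p], p)     # shared order so common prefixes align
    trie, mults = {}, 0
    for s in subsets:
        node = trie
        for depth, p in enumerate(sorted(s, key=order)):
            if p not in node:
                node[p] = {}
                if depth > 0:           # extends an existing partial product
                    mults += 1
            node = node[p]
    return mults

Rs = [{"P1", "P2"}, {"P2", "P4"}, {"P2", "P3", "P4"}, {"P1", "P2", "P4"}]
print(count_multiplications(Rs))        # 4, matching M1..M4 above
```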

Hope this could help.

Ricky Bobby
  • I should explain -1: *"For example if you calculate Pi*Pj*Pk by dividing Pi*Pj*Pk*Pn by Pn, it means you calculated Pi*Pj*Pk*Pn before and therefore you calculated Pi*Pj*Pk before."* - This does not mean you calculated Pi*Pj*Pk before. It means you calculated a factor of `Pi*Pj*Pk*Pn` before. For example: `Pi*Pj` and `Pk*Pn`. This is not to say that divisions aren't necessary, just that the above is not a proof. – ninjagecko May 31 '12 at 15:27
  • @ninjagecko Thanks for your comment. I will try to update my proof. I am pretty sure divisions are not necessary here (unless you have a counterexample). And suffix trees seem like an interesting approach to me. – Ricky Bobby May 31 '12 at 16:03
  • @RickyBobby, does the assumption of prime numbers here mean that Pi is atomic, which cannot be decomposed into the product of two factors? – John Smith Jun 05 '12 at 00:44
  • @JohnSmith, yes, it means that Pi cannot be decomposed into a product of other elements Pk and Pj. For example, if R1 = {60} and R2 = {2,3,5}, then you can calculate the product of the elements in R2 with only one division, 60/2, instead of 2 multiplications, 2*3*5. Assuming that the numbers are not all prime adds a lot of complexity to your problem. – Ricky Bobby Jun 05 '12 at 14:52
  • @RickyBobby, how do you build a suffix tree with the minimum number of edges? Are there any refs? – John Smith Sep 06 '12 at 03:14

Consider a graph of your elements Ri, with no edges. We now allow ourselves to add edges as follows:

  • Add a directed edge Ra→Rb, annotated with the quotient Rb/Ra.

For example, we can draw an edge R1→R3, with a cost of multiplying by R3/R1 = P3*P4/P1.

Do this for all ordered pairs of nodes, so you have |R|² edges.

Now if you only used intermediate results, you could use a Minimum Spanning Tree algorithm to solve this. I believe the MST algorithms are very close to linear in the number of edges (up to a factor of the inverse Ackermann function, which grows more slowly than log(log(log(...log(n)...)))); there may even be randomized linear-time algorithms in the number of edges, e.g. http://www.akira.ruc.dk/~keld/teaching/algoritmedesign_f04/Artikler/07/Karger95.pdf Thus this basic algorithm will take about |R|² time.
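
Here is a sketch of what this "intermediate results only" strategy costs, under two assumptions that are mine rather than stated above: a virtual start node from which building Ri from scratch costs |Ri| - 1 multiplications, and an edge between two products that costs one operation per element of their symmetric difference. Prim's algorithm then gives the MST cost; on the five-set example discussed below it returns 11:

```python
import heapq

def intermediate_only_cost(subsets):
    """Cost of building one product from scratch and deriving every other R
    along MST edges.  Edge weight Ra-Rb = |Ra symmetric-difference Rb|
    (one multiply or divide per element); a virtual start node reaches Ri
    for |Ri| - 1 multiplications.  Prim's algorithm on the complete graph."""
    sets = [frozenset(s) for s in subsets]
    n = len(sets)                        # node n is the virtual start node

    def weight(a, b):
        if a == n or b == n:
            return len(sets[a if b == n else b]) - 1
        return len(sets[a] ^ sets[b])

    total, seen = 0, {n}
    heap = [(weight(n, v), v) for v in range(n)]
    heapq.heapify(heap)
    while len(seen) <= n:
        w, v = heapq.heappop(heap)
        if v in seen:
            continue
        seen.add(v)
        total += w
        for u in range(n):
            if u not in seen:
                heapq.heappush(heap, (weight(v, u), u))
    return total

Rs = [{"P2","P3","P4","P5"}, {"P1","P3","P4","P5"}, {"P1","P2","P4","P5"},
      {"P1","P2","P3","P5"}, {"P1","P2","P3","P4"}]
print(intermediate_only_cost(Rs))        # 11 for the five-set example below
```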

However, you deprive yourself of truly optimal answers if you only use intermediate results, because it's possible to use "extemporary" expressions that we pull out of thin air to achieve better performance. For example you might consider this scenario:

R1 = {P2, P3, P4, P5}
R2 = {P1, P3, P4, P5}
R3 = {P1, P2, P4, P5}
R4 = {P1, P2, P3, P5}
R5 = {P1, P2, P3, P4}

The optimal solution is to calculate P1*P2*P3*P4*P5, then divide by each Pi, resulting in 9 operations. Whereas the above method would do something like R1=P2*P3*P4*P5, then perform a multiplication and a division each time to go R1→R2, R2→R3, R3→R4, R4→R5, resulting in 11 operations. (If we did not have division, we would also have 9 operations: b=P1*P2, c=b*P3, r5=c*P4, r4=c*P5, d=P4*P5, r3=b*d, e=P3*d, r1=e*P2, r2=e*P1. Though I think it might be possible to create situations where division is necessary, I cannot seem to find one; if none exists, it might be the case that this is actually a simple polynomial-time problem.)

This method I will refer to as the "extemporary expression" method. If we choose to calculate an extemporary expression (a single sunk cost at the beginning), this will reduce the costs of all future calculations to traverse an edge if we choose to use it (because there might be a choice of how many extemporary expressions to use).

This brings us to a very minor sidenote:

Theorem:
  You should have at least one intermediate expression of each
  length up to the maximum size of any R.
Proof (induction):
  To build up a product of size N, you need to have a product
  of size N-1.

Noting this theorem, it turns out we were slightly wrong above. The optimal solution was to remember P1*P2*P3 and P1*P2*P3*P4 in the process of calculating P1*P2*P3*P4*P5. Then we can get R5 for free (and R4 with just one multiplication via another means, unfortunately the same cost as before), reducing the total cost from 9 to 8. This leads us to a wild guess that performing http://en.wikipedia.org/wiki/Floyd%E2%80%93Warshall_algorithm with arbitrary edges might also yield the optimal solution, after a very long time.
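
A small Python check of that improved schedule (arbitrary illustrative values for the Pi): 4 multiplications to build P1*P2*P3*P4*P5 while remembering its intermediates, plus one multiplication and three divisions, 8 operations in total:

```python
P1, P2, P3, P4, P5 = 2, 3, 5, 7, 11     # arbitrary illustrative values
t2 = P1 * P2                 # op 1
t3 = t2 * P3                 # op 2   (remember P1*P2*P3)
t4 = t3 * P4                 # op 3   (remember P1*P2*P3*P4 -> R5 for free)
A  = t4 * P5                 # op 4
r4 = t3 * P5                 # op 5   (R4 = {P1,P2,P3,P5})
r1 = A // P1                 # op 6
r2 = A // P2                 # op 7
r3 = A // P3                 # op 8
assert (r1, r2, r3, r4, t4) == (P2*P3*P4*P5, P1*P3*P4*P5,
                                P1*P2*P4*P5, P1*P2*P3*P5, P1*P2*P3*P4)
```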

Anyway, how do we incorporate such a system into our solution? We can't simply add a node, because that would mess up the MST algorithm. For each edge where multiplying or dividing by the extemporary expression E would not raise some P to more than a power p (e.g. for p=2 we allow intermediate extemporary expressions which create products like P1 * P4^2 / P3 but disallow something like P2^3), we perform that multiplication and division on the edge, and create two new edges. (We might do this more than once, or at future dates.) We also do this for all subsets of the edge, which we'd memoize like above. The cost of this method, if you use the MST algorithm, is that the number of edges increases heavily, so perhaps (|R| + #newedges)² = (|R|^|P|)² in the worst case, significantly increasing the time it takes to find an optimal answer.

It thus seems, as a guess, that the more general problem is NP-hard; please chime in if this is not the case. However, you can perhaps use a heuristic for guessing useful extemporary expressions. For example, if you see a "large" subset of the Rs with "a high density of common Ps", you might generate the trick node that is the product of all the common Ps. If you do this for all "large/dense clumps of common Ps" you see among subsets of your Rs, then run the MST, you might get slightly better answers, without having to do the more general search. You could even run an algorithm to help detect such clumps, such as a hierarchical clustering algorithm; a toy version of that idea is sketched below.
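
As a sketch of that clump-detection heuristic (my own toy version, not a full clustering algorithm): count how often pairs of Ps co-occur across the Rs and promote the most frequent pairs as candidate products to precompute.

```python
from collections import Counter
from itertools import combinations

def candidate_clumps(subsets, top=3):
    """Count co-occurring pairs of Ps across all Rs; frequent pairs are
    candidate extemporary expressions ('trick nodes') worth precomputing."""
    pairs = Counter()
    for s in subsets:
        pairs.update(combinations(sorted(s), 2))
    return pairs.most_common(top)

Rs = [{"P1", "P2"}, {"P2", "P4"}, {"P2", "P3", "P4"}, {"P1", "P2", "P4"}]
print(candidate_clumps(Rs))   # (('P2', 'P4'), 3) comes out on top
```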

(sidenote: You might also be able to apply math about lattices to this problem, since you can think of each set as a bit-vector, which together form the basis of a lattice.)

ninjagecko
  • I should comment on my answer: after some experimentation, I have been unable to find a situation where the greedy algorithm doesn't seem to give the optimal result, and without division. The greedy algorithm is as follows: [...] – ninjagecko Jun 03 '12 at 22:29
  • [...] Keep a cache of products. Build up new products as follows: consider all "paths" from each product in cache to each R. Consider how each term in the path will reduce your problem size (e.g. consider a*b->a*b*c->a*b*c*d: rewrite each R by replacing a*b*c*d->x0, a*b*c->x1, calc new cost, compare amongst all paths). Replace each R. This is an elaboration of what I referred to as a "heuristic" in my answer. – ninjagecko Jun 03 '12 at 22:29
  • Excellent suggestions! Just wondering, is the MST method near-optimal? – John Smith Jun 04 '12 at 16:14

Without divisions, this appears to be equivalent to the problem ENSEMBLE COMPUTATION as described in Garey & Johnson and hence NP-complete.

[PO9] ENSEMBLE COMPUTATION

INSTANCE: Collection C of subsets of a finite set A, positive integer J.

QUESTION: Is there a sequence S = (z_1 <- x_1 U y_1, z_2 <- x_2 U y_2, ..., z_j <- x_j U y_j) of j <= J union operations, where each x_i and y_i is either {a} for some a in A or z_k for some k < i, such that x_i and y_i are disjoint, 1 <= i <= j, and such that for every subset c in C there exists some z_i, 1 <= i <= j, that is identical to c?

Your set I corresponds to A, each R_i corresponds to an element of C, and multiplication corresponds to set union.
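
For the four example sets from the question, the correspondence can be spelled out directly; the z_i below mirror the four multiplications M1..M4 from the suffix-tree answer, so J = 4 gives a yes-instance (a sketch, with the Pi represented as strings):

```python
# The R_i from the question, as the collection C over A = {P1, P2, P3, P4}.
C = [{"P1", "P2"}, {"P2", "P4"}, {"P2", "P3", "P4"}, {"P1", "P2", "P4"}]

# A sequence of j = 4 disjoint unions (each operand is a singleton {a}
# or an earlier z_k), one per multiplication:
z1 = {"P1"} | {"P2"}          # P1*P2       -> R1
z2 = {"P2"} | {"P4"}          # P2*P4       -> R2
z3 = z2 | {"P3"}              # P2*P3*P4    -> R3
z4 = z2 | {"P1"}              # P1*P2*P4    -> R4
assert all(c in (z1, z2, z3, z4) for c in C)    # every R_i is produced
```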

mhum
  • What does J in ENSEMBLE COMPUTATION correspond to in my problem? – John Smith Jun 04 '12 at 16:17
  • J is the maximum number of allowed multiplications. Lower-case j is the number of actual multiplications. It is natural to turn optimization problems into decision problems by imposing a bound on the thing to be optimized, e.g.: turning "minimize the number of multiplications" into "can I do this in fewer than J multiplications?". – mhum Jun 04 '12 at 18:51
  • "R_i corresponds to an element of C"? I think it should be "the product of all elements in R_i corresponds to an element of C". Could u please confirm that ? Thanks! – John Smith Jun 06 '12 at 01:21
  • Yeah, that's more or less the same thing. The way the problem is set up, we don't necessarily need to differentiate between the set R_i and the product of all the elements in R_i since we could have just selected each P_i to be distinct primes. – mhum Jun 06 '12 at 02:31
  • Thanks. BTW, what do you think of the approximation ratio of ninjagecko's algorithm? – John Smith Jun 07 '12 at 02:48