4

I have four arrays (set1, set2,...) of 3 arrays. E.g.

set1 = [array([1, 0, 0]), array([-1, 0, 0]), array([0, 1, 0]), ...]

I need to find how many combinations of vectors sum to zero. The simple way to solve this problem is:

for b1 in set1:
  for b2 in set2:
    for b3 in set3:
      for b4 in set4:
        if all(b1 + b2 + b3 + b4 == 0):
          count = count + 1

However, this goes as O(n^4), and based on the 3sum algorithm, I assume I can do O(n^3) and speed is quite important. Any leads on how to do this quickly in python?

Divakar
  • 218,885
  • 19
  • 262
  • 358
yakzo
  • 337
  • 3
  • 8

5 Answers5

2

How about this?

from numpy import array
def createset(min, max):
    xr = lambda: xrange(min, max)
    return [ array([x, y, z]) for x in xr() for y in xr() for z in xr() ]

set1 = createset(-3, 3)
set2 = createset(-2, 1)
set3 = createset(-4, 5)
set4 = createset(0, 2)

lookup = {}
for x in set1:
    for y in set2:
        key = tuple(x + y)
        if key not in lookup:
            lookup[key] = 0
        lookup[key] += 1

count = 0
for x in set3:
    for y in set4:
        key = tuple(-1 * (x + y))
        if key in lookup:
            count += lookup[key]

print count

The idea is to generate all sums of the first two sets. Then, you generate the sums of the last two sets and see if there is a key in a lookup table such that their sums are 0.

Aldehir
  • 2,025
  • 13
  • 10
1

It won't change the actual time complexity, but you can probably speed up those loops by a factor of a few hundred by telling Python to compile it as C code using e.g. Cython: http://cython.org/. You could also parallelize, since the code you've written is embarrassingly parallel. A good C compiler would automatically take advantage of this, but Python won't.

An algorithm achieving much better time complexity (O[n^2 log N]) is sketched out here: https://cs.stackexchange.com/questions/2973/generalised-3sum-k-sum-problem. I would probably just implement the described algorithm in Python and put Cython around it.

EDIT:

For flattened arrays, you can also do the operation you've sketched out as follows:

sum2 = np.add.outer(A,B)
sum3 = np.add.outer(sum2,C)
sum4 = np.add.outer(sum3,D)

sum4[i,j,k,l] is now A[i]+B[j]+C[k]+D[l]. The number of zero entries is

len(sum4) - np.count_nonzero(sum4)
Community
  • 1
  • 1
AGML
  • 890
  • 6
  • 18
1

Use numpy's meshgrid function:

http://docs.scipy.org/doc/numpy/reference/generated/numpy.meshgrid.html

You'll need to reshape your initial sets into 1-D, but that's no loss for this purpose.

set1 = set1.flatten() // etc

Then call meshgrid(). It'll give you 4 of 4-D arrays, one for each of your sets. Then just add:

a,b,c,d = np.meshgrid(set1, set2, set3, set4)
total = a+b+c+d 

Finally, count up the number of 0's in the total array:

count = len(total) - np.count_nonzero(sum)
Chad Kennedy
  • 1,716
  • 13
  • 16
1

Assuming the inputs are lists of 1D arrays, as listed in the sample data provided in the question, it seems you can use broadcasting after row-stacking the input lists, like so -

import numpy as np

s1 = np.row_stack((set1))
s2 = np.row_stack((set2))
s3 = np.row_stack((set3))
s4 = np.row_stack((set4))

sums = s4[None,None,None,:,:] + s3[None,None,:,None,:] + s2[None,:,None,None,:] + s1[:,None,None,None,:]
count = (sums.reshape(-1,s1.shape[1])==0).all(1).sum()

Sample run -

In [319]: set1 = [np.array([1, 0, 0]), np.array([-1, 0, 0]), np.array([0, 1, 0])]
     ...: set2 = [np.array([-1, 0, 0]), np.array([-1, 1, 0])]
     ...: set3 = [np.array([1, 0, 0]), np.array([-1, 0, 0]), np.array([0, 1, 0])]
     ...: set4 = [np.array([1, 0, 0]), np.array([-1, 0, 0]), np.array([0, 1, 0]), np.array([0, 1, 0])]
     ...: 

In [320]: count = 0
     ...: for b1 in set1:
     ...:   for b2 in set2:
     ...:     for b3 in set3:
     ...:       for b4 in set4:
     ...:         if all(b1 + b2 + b3 + b4 == 0):
     ...:           count = count + 1
     ...:           

In [321]: count
Out[321]: 3

In [322]: s1 = np.row_stack((set1))
     ...: s2 = np.row_stack((set2))
     ...: s3 = np.row_stack((set3))
     ...: s4 = np.row_stack((set4))
     ...: 
     ...: sums = s4[None,None,None,:,:] + s3[None,None,:,None,:] + s2[None,:,None,None,:] + s1[:,None,None,None,:]
     ...: count2 = (sums.reshape(-1,s1.shape[1])==0).all(1).sum()
     ...: 

In [323]: count2
Out[323]: 3
Divakar
  • 218,885
  • 19
  • 262
  • 358
0

You can use itertools.product and a generator expression within sum function:

from itertools import combinations
sum(1 for i in produt(set1,set2,set3,set4) if sum(i)==0)

This would be faster than your code but still is O(n4) for more speed you can get the product with Numpy instead of itertools.

Community
  • 1
  • 1
Mazdak
  • 105,000
  • 18
  • 159
  • 188
  • Probably faster than looping, but still O(n^4) – yakzo Sep 18 '15 at 19:48
  • [1, -1] has sum 0 but isn't == [0, 0] – yakzo Sep 18 '15 at 20:10
  • Perhaps you mean from itertools import product; len([i for i in product(set1,set2,set3) if sum(i)==0]) to get the number of combinations that sum to 0. –  Sep 18 '15 at 20:12
  • @user3615787 You have added the elements together in your code, if you want all the elements be 0 it just would be 1 combination – Mazdak Sep 18 '15 at 20:13
  • @TrisNefzger There is no need to use a list to store the result and wasted the memory when we can use a generator expression within sum! – Mazdak Sep 18 '15 at 20:14
  • @Kasramvd Correct, then used all(). In reality, the sets are around length 30. – yakzo Sep 18 '15 at 20:14
  • 1
    Right, sum(1 for i in product(set1,set2,set3) if sum(i)==0) works too. The real points are that it is necessary to import product not combinations and also to spell product correctly. As is the code you gave errors out. –  Sep 18 '15 at 20:23