3

So far I have mylist = list(itertools.product(*a))

The problem with this is that it makes too many tuples. I want it not to make the tuple if the sum of all of tuple is > 4. eg

[(0, 0, 0, 0),
 (0, 0, 0, 1),
 (0, 0, 0, 2),
 (0, 0, 1, 0),
 (0, 0, 1, 1),
 (0, 0, 1, 2),
 (0, 1, 0, 0),
 (0, 1, 0, 1),
 (0, 1, 0, 2),
 (0, 1, 1, 0),
 (0, 1, 1, 1),
 (0, 1, 1, 2),
 (1, 0, 0, 0),
 (1, 0, 0, 1),
 (1, 0, 0, 2),
 (1, 0, 1, 0),
 (1, 0, 1, 1),
 (1, 0, 1, 2),
 (1, 1, 0, 0),
 (1, 1, 0, 1),
 (1, 1, 0, 2),
 (1, 1, 1, 0),
 (1, 1, 1, 1),
 (1, 1, 1, 2)]

It shouldn't make (1, 1, 1, 2) as it sums to 5; while in this example it's only one, in others it will be considerably more.

jonrsharpe
  • 115,751
  • 26
  • 228
  • 437
Alex
  • 1,064
  • 1
  • 11
  • 16
  • 2
    So why don't you `ifilter` the result according to your requirements? – jonrsharpe Jul 23 '15 at 15:47
  • if ifilter is too much overhead for you (i.e. you don't want most of the terms). You could also do this using a queue and a while loop, or if you know the number of terms then just use a series of for loops. But this will be pure python and therefore only really makes sense if the overhead of throwing out terms is v. high (as itertools is written in c and optimised) – user2539336 Jul 23 '15 at 16:23
  • If I understand ifilter correctly it would only remove the ones after they have been created. Due to computing time I need them to not be made. Please correct me if I am mistaken. – Alex Jul 23 '15 at 16:33
  • 1
    @AlexTreacher yes. Either you can recreate the [itertools.product](https://docs.python.org/2/library/itertools.html#itertools.product) using the equivalent code from doc, so apply your filter. – Mauro Baraldi Jul 23 '15 at 16:39
  • user2539336, do you have an example? I'm not sure how to go about doing that. – Alex Jul 23 '15 at 16:41
  • @AlexTreacher well without creating them, how can you know whether or not their contents sum to `<= 4`?! – jonrsharpe Jul 23 '15 at 17:21
  • well I guess if I could get the itertools to make it in order of increasing sum, then when it got higher that 4 i would stop. – Alex Jul 23 '15 at 18:47
  • In that case, sort `a` and use `takewhile`. – jonrsharpe Jul 23 '15 at 21:09

1 Answers1

1

If your dataset is large, you could probably use numpy here.

numpy.indices provides an equivalent of itertools.product that you can also filter efficiently,

import numpy as np

arr = np.indices((4, 4, 4, 4)).reshape(4,-1).T
mask = arr.sum(axis=1) < 5
res = arr[mask]
print(res)

#[[0 0 0 0]
# [0 0 0 1]
# [0 0 0 2]
# [0 0 0 3]
# [0 0 1 0]
#  ... 
# [3 0 0 1]
# [3 0 1 0]
# [3 1 0 0]]

Otherwise for small datasets, as mentioned in the comments, itertools.ifilter is pretty fast,

from itertools import product, ifilter
gen = product((0,1,2,3), repeat=4)
res = ifilter(lambda x: sum(x) < 4, gen)
res = list(res) # converting to list only at the end

In this particular case, both approaches give comparable performance.

If you need even better performance for this specific case, you can always write your optimized routine in C or Cython.

rth
  • 10,680
  • 7
  • 53
  • 77