how to avoid unnecessary computations in a sorted nested loop generator?

Question

I have a generator that implements evaluation of function f() over a grid of parameters:

def gen():
  for a in vec_a:
    for b in vec_b:
      for c in vec_c:
        for ...
          res = f(a, b, c, ...)
          yield (a, b, c, ...), res

vec_* are sorted such that f() is an increasing function of corresponding parameter given all others fixed. More precisely:

if (a2 >= a1) and (b2 >= b1) and (c2 >= c1) and ...:
  assert f(a2, b2, c2, ...) >= f(a1, b1, c1, ...)

Thus, for example, if f(a0, b0, c0, ...) == np.inf, then:

f(a, b0, c0, ...) == np.inf for every a >= a0
f(a0, b, c0, ...) == np.inf for every b >= b0
f(a0, b0, c, ...) == np.inf for every c >= c0

Now I want to write a generic generator that accepts gen and skips unnecessary evaluations of f according to the following rules:

if f(.) == np.inf at some point then I break the innermost loop
if the innermost loop was interrupted at the very first iteration, I should break the penultimate loop level
rule #3 applies to all other levels of the nested loop

Example: If I get np.inf at the very first iteration of the grid, I should skip the entire grid and not perform any more evaluations of f.

Example: If I have a grid [0,1,2] x [0,1,2] x [0,1,2] and f(0,1,0) == np.inf, then I jump to evaluation f(1, 0, 0).

How would I implement such a generator?

It looks like you are using [tag:numpy]. Is that true? That might make some solution possible. If so, would it be accurate to say that you'd want to iterate over all cells of the array? — Gloweye, Oct 22 '19 at 14:23
@Gloweye in the code above I only use np.inf, which is not essential. I could use None or any other way of returning useless value. However, I am ok using any library that would achieve my goal in an elegant and generic way. My goal is to use the sparsity structure that I have in the n-dimensional grid (given by the monotonicity of the function f wrt to its arguments) to accelerate iteration over the grid by skipping (possibly) many function invocations for which I know result is going to be np.inf. — Mikhail Shevelev, Oct 22 '19 at 14:33
Ok, that makes it clearer. I was considering an answer involving the `shape` attribute on a numpy array, but that's not gonna fly then. — Gloweye, Oct 22 '19 at 14:37
1) Is the grid square (are all the vec_tors the same length)? 2) Are you saying that if f(0, 4, 0) = INF then any call to f(x, 4, y) will be INF? — FiddleStix, Oct 22 '19 at 15:06

orlp · Accepted Answer · 2019-10-22T15:39:08.573

Use recursion to simplify the generator:

def shortcircuit_eval(*vecs, f, prefix=tuple()):
    if len(vecs) == 0:
        yield prefix, f(*prefix), True
        return

    first_vec, *rest_vecs = vecs
    for i, x in enumerate(first_vec):
        inner = shortcircuit_eval(rest_vecs, f=f, prefix=prefix + (x,))
        for args, res, all_inner_first_iter in inner:
            yield args, res, all_inner_first_iter and i == 0
            if res == np.inf and all_inner_first_iter:
                return

Then you can use it like shortcircuit_eval(vec_a, vec_b, vec_c, f=f). It does generate some auxiliary information as a third element of the tuples it yields, you can write a short wrapper to strip those if necessary.

Note that this is a direct implementation of your idea for an algorithm, but this isn't optimal. E.g. when iterating over [0..10]^3 if you find that [1, 5, 2] is infinite, you know that [3, 6, 3] is as well, but your algorithm will not skip over it. Write a new question asking for an optimal algorithm if you're interested :) This dominance graph should illustrate how much work you can actually save:

If any node is infinite you no longer have to compute the entire subgraph that has that node as an ancestor.

score 0 · Answer 2 · answered Oct 23 '19 at 08:55

I read about methods of breaking out of loops (see for example this and this), and finally came to the conclusion that this approach will probably lead to some confusing and badly readable code.

If you look at your grid, you can think of an INF element as the corner of a rectangle (or hyperrectangle, in the n-dimensional case), making all other elements inside the area also INF:

In this example, you would somehow have to remember that for a >= 3 you will have to continue running through the inner b loop with b in (0, 1) only.

Also note that you can possibly have more than one INF element, giving you several INF areas possibly overlapping each other.

So my suggestion is not to break the loops, but rather to decide for each element if you have to evaluate f() or if you can skip it, based on whether or not it is inside an INF area.

Example for three dimensions:

inf_list = []

def can_skip(a, b, c):
  for inf_element in inf_list:
    if a >= inf_element[0] and b >= inf_element[1] and c >= inf_element[2]:
      return True
  return False

def gen():
  for a in vec_a:
    for b in vec_b:
      for c in vec_c:
        if can_skip(a, b, c):
          # print("Skipping element ", a, b, c)
          yield (a, b, c), np.inf
        else:
          # print ("Calculating f for element", a, b, c)
          res = f(a, b, c)
          if res == np.inf:
            inf_list.append((a, b, c))
          yield (a, b, c), res

how to avoid unnecessary computations in a sorted nested loop generator?

2 Answers2