Let's say I have a function and I want the option of returning results or not. This is easy to code:

def foo(N, is_return=False):
    l = []
    for i in range(N):
        print(i)
        if is_return:
            l.append(i)
    if is_return:
        return l

But now let's say I want the function to be a generator. I would write something like this:

def foo_gen(N, is_return=False):
    for i in range(N):
        print(i)
        if is_return:
            yield i

So presumably when is_return is False then foo_gen is just a function with no return value and when is_return is True foo_gen is a generator, for which I would like there to be two different invocations:

In [1]: list(foo_gen(3, is_return=True))
0
1
2
Out[1]: [0, 1, 2]

for when it is a generator and you have to iterate through the yielded values, and:

In [2]: foo_gen(3)
0
1
2

For when it is not a generator, just has its side effect, and you don't have to iterate through it. However, this latter behavior doesn't work: the call just returns the generator without printing anything. You can iterate over it and simply receive nothing:

In [3]: list(foo_gen(3, is_return=False))
0
1
2
Out[3]: []

But this isn't as nice and is confusing for users of an API who aren't expecting to have to iterate through anything to make the side-effects occur.

Is there any way to get the behavior of In [2] from a function?

salotz
  • Did you see https://stackoverflow.com/questions/23381257/skipping-yield-in-python? – serv-inc Dec 06 '17 at 21:38
  • How about adding a `return None` after the loop? – Anton vBR Dec 06 '17 at 21:44
  • @serv-inc I hadn't, but that is a different problem because I am not looking to skip only particular elements. After skipping all of them, you would still have the problem of handling the empty iterator. – salotz Dec 06 '17 at 22:12

2 Answers

To do that, you would need to wrap foo_gen in another function which either returns the generator or iterates over it itself, like this:

def maybe_gen(N, is_return=False):
    real_gen = foo_gen(N)
    if is_return:
        # Caller wants the values: hand back the generator.
        return real_gen
    # Caller only wants the side effects: exhaust the generator here.
    for item in real_gen:
        pass

def foo_gen(N):
    for i in range(N):
        print(i)
        yield i

>>> list(maybe_gen(3, is_return=True))
0
1
2
[0, 1, 2]
>>> maybe_gen(3)
0
1
2
>>> 

The reason is that the occurrence of yield anywhere in the function body makes it a generator function. There's no way to have a function that decides at call time whether it's a generator function or not. Instead, you have to have a non-generator function that decides at runtime whether to return a generator or something else.

That said, doing this is most likely not a good idea. You can see that what maybe_gen does when is_return is False is completely trivial: it just iterates over the generator without doing anything with the values. This is especially silly since in this case the generator itself doesn't do anything except print.

It is better to have the function API be consistent: either always return a generator, or never do. A better idea would be to just have two functions: foo_gen, which is the generator, and print_gen (or something similar), which unconditionally runs it for the printing. If you want the generator, you call foo_gen. If you just want the printing, you call print_gen instead, rather than passing a "flag" argument to foo_gen.
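As a rough sketch of that split, assuming the toy loop from the question stands in for the real work (print_gen is only the placeholder name used above, not an existing API):

```python
def foo_gen(N):
    """Always a generator: print each value, then yield it."""
    for i in range(N):
        print(i)
        yield i


def print_gen(N):
    """Plain function: run foo_gen purely for its side effects."""
    for _ in foo_gen(N):
        pass  # discard the yielded values


# Callers who want the values iterate over the generator themselves.
values = list(foo_gen(3))

# Callers who only want the printing call the plain function; it returns None.
result = print_gen(3)
```

Each function has exactly one return behavior, so neither needs a flag.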

With regard to your comment at the end:

But this isn't as nice and is confusing for users of an API who aren't expecting to have to iterate through anything to make the side-effects occur.

If the API specifies that the function returns a generator, users should expect to have to iterate over it. If the API says it doesn't return a generator, users shouldn't expect to have to iterate over it. The API should just say one or the other, which will make it clear to users what to expect. What is far more confusing is to have an awkward API that tells users they have to pass a flag to determine whether they get a generator or not, because this complicates the expectations of the user.

BrenBarn
  • I completely agree with your concerns about this as an API choice. It is a design decision in my application to do it this way because this is a very high level method in which relatively inexperienced users will want it to DWIM and the `return_results` is sort of reserved for debugging and more advanced users. I'm also looking to keep the namespace as slim as possible. – salotz Dec 06 '17 at 22:18

So presumably when is_return is False then foo_gen is just a function with no return value and when is_return is True foo_gen is a generator

You have your assumptions wrong. is_return does not determine whether your function is a generator. The mere presence of a yield expression in the body determines that; whether or not the expression is reachable on a given call doesn't matter.
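You can check this with the standard inspect module. The snippet below is only an illustration using the foo_gen from the question; the variable names are made up for the example:

```python
import inspect


def foo_gen(N, is_return=False):
    for i in range(N):
        print(i)
        if is_return:
            yield i


# The function is classified by its body, not by the arguments passed to it.
is_gen_func = inspect.isgeneratorfunction(foo_gen)

# Calling it runs none of the body yet; it only builds a generator object.
gen = foo_gen(3, is_return=False)
is_gen_obj = inspect.isgenerator(gen)

# The prints happen only once the generator is consumed, and with
# is_return=False nothing is ever yielded.
values = list(gen)
```

Here is_gen_func and is_gen_obj are both True even though is_return is False, and values ends up as an empty list.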

So you probably want to stick to the first approach of returning a list, which in my opinion is less confusing and easier to maintain.

Moses Koledoye
  • Your suggestion is what I am actually going to do. I am, however, curious about the problem as posed anyhow. – salotz Dec 06 '17 at 22:09