29

Guido van Rossum, in his 2014 talk on Tulip/asyncio, shows this slide:

Tasks vs coroutines

  • Compare:

    • res = yield from some_coroutine(...)
    • res = yield from Task(some_coroutine(...))
  • Task can make progress without waiting for it

    • As long as you wait for something else
      • i.e. yield from

And I'm completely missing the point.

From my point of view both constructs are identical:

In the case of a bare coroutine: it gets scheduled, so a task is created anyway (because the scheduler operates on Tasks); the caller coroutine is then suspended until the callee is done, and then becomes free to continue execution.

In the case of a Task: all the same. A new task is scheduled and the caller coroutine waits for its completion.

What is the difference in the way the code is executed in the two cases, and what impact does it have that a developer should consider in practice?

P.S.
Links to authoritative sources (GvR, PEPs, docs, core dev notes) would be much appreciated.

stamaimer
Gill Bates

3 Answers

31

From the calling coroutine's side, yield from coroutine() feels like a function call (i.e. the caller regains control only when coroutine() finishes).

Task(coroutine()), on the other hand, feels more like creating a new thread. Task() returns almost instantly, and the caller very likely regains control before coroutine() finishes; a yield from on the task then waits for it to complete.

The difference between f() and th = threading.Thread(target=f, args=()); th.start(); th.join() is obvious, right?
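The analogy can be observed directly by recording the order of events. The following is only a sketch, written in modern syntax (`async def`/`await` and `asyncio.create_task`, which replaced `yield from` and constructing `Task(...)` by hand; Python 3.7+ assumed). The names `worker`, `demo`, and `events` are made up for illustration.

```python
import asyncio

events = []  # records the order in which things happen


async def worker(name):
    events.append(name + " start")
    await asyncio.sleep(0)  # give the event loop a chance to switch
    events.append(name + " end")


async def demo():
    # Awaiting a bare coroutine: like a function call, it starts running
    # right here and we resume only once it has finished.
    await worker("bare")
    events.append("after bare")

    # Wrapping it in a task: like starting a thread, it is only scheduled.
    task = asyncio.create_task(worker("task"))
    events.append("after create_task")  # the task has not started yet
    await task  # comparable to th.join()
    events.append("after task")


asyncio.run(demo())
print(events)
```

With the default event loop, "after create_task" is recorded before "task start": the task only begins to run once the caller suspends at an `await`.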

Andrew Svetlov
  • So the difference is in how execution will be scheduled by the scheduler? Bare coroutine gets higher "priority" and task gets lower? – Gill Bates Nov 23 '14 at 10:44
  • 2
    There are no priorities in asyncio at all. With a bare coroutine you have to use `yield from coro()` to run it; with a task, a construct like `async(coro())` will run the coroutine concurrently with the others. – Andrew Svetlov Nov 23 '14 at 12:31
  • You mean that there will be no context switch in the case of a bare coroutine? – Gill Bates Nov 23 '14 at 13:27
  • There is (or, more precisely, a coroutine may cause a context switch and usually does), but you have to explicitly execute it and wait for it to finish via `yield from`. For a task you only have to start it (`asyncio.async(coro())`); you are then free to wait until it finishes via `yield from`, or to just go ahead and do something else while the task keeps running on its own. Again, the difference is like the difference between a function call and creating a new OS thread. – Andrew Svetlov Nov 23 '14 at 13:35
  • As I understand you, there is no difference between `yield from some_coro()` and `yield from Task(some_coro())`, correct? Note that GvR showed exactly that on his slide, and he talked as if there IS some difference. – Gill Bates Nov 23 '14 at 13:41
  • From the **final result** perspective there is **no difference**, yes. The Task case is a bit slower, though. There **is** a difference in **how** the code is executed in the two cases. – Andrew Svetlov Nov 23 '14 at 13:47
  • So what is the difference? `yield from Task(some_coro(...))` will be executed in the background and `yield from some_coro()` will be executed... how? – Gill Bates Nov 23 '14 at 13:52
  • In the current execution context, obviously. If the coroutine was started by `loop.run_until_complete(coro())`, the context is the foreground. You can also use `yield from` inside a task. – Andrew Svetlov Nov 23 '14 at 13:59
  • Can you elaborate on what is the "foreground" and what is the "background"? A context switch happens in both cases, so both cases can be compared with starting a new thread, because they are both asynchronous. So it sounds like "foreground" is asynchronous execution with higher priority and "background" with lower. – Gill Bates Nov 27 '14 at 12:51
  • Oooh. All contexts have the same priority. Period. Just as the main thread in a Python program has the same priority as other threads created by the user. – Andrew Svetlov Nov 27 '14 at 21:47
  • I'm not claiming that priorities exist; I'm asking for some details about what "background" and "foreground" mean here. – Gill Bates Dec 02 '14 at 16:11
  • Is this statement correct: if you do `yield from bare_coro()`, `bare_coro` will be executed immediately after the caller coroutine, with no chance for other coroutines to run before it? Some code: https://gist.github.com/AndrewPashkin/844d17f2ab4f4c88be42 – Gill Bates Dec 03 '14 at 07:49
  • 1
    Yes, you are right. Technically, `yield from coro()` executes the coroutine immediately, whereas `async(coro())` schedules its execution via a `loop.call_soon()` call. – Andrew Svetlov Dec 03 '14 at 08:08
  • Is it an implementation detail, or is it in the specs? I can't find anything about that in the `asyncio` docs. Maybe it is an implicit spec, since `yield from bare_coro()` is actually just synchronous Python code, and when the scheduler calls `next()` on the caller coroutine, `bare_coroutine()` is invoked synchronously as regular code with no `asyncio` magic? – Gill Bates Dec 03 '14 at 10:05
  • Very interesting, by the way, that the Python 2.7 backport, `trollius`, works completely differently with [the same code](https://gist.github.com/AndrewPashkin/844d17f2ab4f4c88be42) – Gill Bates Dec 03 '14 at 11:52
  • 1
    Well, it is an implementation detail. Trollius doesn't use `yield from` and is not fully compatible with asyncio. – Andrew Svetlov Dec 03 '14 at 16:03
  • 1
    Custom event loops that are fully compatible with asyncio use `yield from` and reuse asyncio.Task, which calls `loop.call_soon()`. Systems that are not 100% compatible may invent their own contracts and have their own implementation details. – Andrew Svetlov Dec 03 '14 at 16:05
  • What about alternative Python implementations, like PyPy? If they implement 3.4 and strictly follow the official specs, should they reproduce that "synchronous execution" feature? – Gill Bates Dec 03 '14 at 16:09
  • 1
    Yes, if PyPy supports 3.4 (only 3.2 for now, AFAIK), it will work with coroutines the same way CPython 3.4 does. At least I think so, but I cannot guarantee it. I'm a CPython core developer, not a PyPy one. – Andrew Svetlov Dec 03 '14 at 16:38
  • What I mean is: does the Python _language specification_ assert that `yield from bare_coro()` **must** work like that in any implementation? – Gill Bates Dec 03 '14 at 19:20
  • 1
    Yes, PEP 380 explicitly states this. – Andrew Svetlov Dec 03 '14 at 20:19
  • That was it. Now I'm enlightened, thank you! I think you should update the answer with this info. – Gill Bates Dec 04 '14 at 10:56
17

The point of using asyncio.Task(coro()) is for cases where you don't want to explicitly wait for coro, but you want coro to be executed in the background while you wait for other tasks. That is what Guido's slide means by

[A] Task can make progress without waiting for it...as long as you wait for something else

Consider this example:

import asyncio

@asyncio.coroutine
def test1():
    print("in test1")


@asyncio.coroutine
def dummy():
    yield from asyncio.sleep(1)
    print("dummy ran")


@asyncio.coroutine
def main():
    test1()
    yield from dummy()

loop = asyncio.get_event_loop()
loop.run_until_complete(main())

Output:

dummy ran

As you can see, test1 was never actually executed, because we didn't explicitly call yield from on it.
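That calling a coroutine function only builds an object, without running its body, can be checked in isolation. This is a small sketch in modern syntax (`async def` instead of `@asyncio.coroutine`, which was removed in Python 3.11); the `ran` list and `ran_before` snapshot are illustrative names, not part of the example above.

```python
import asyncio
import inspect

ran = []


async def test1():
    ran.append("test1")


# Calling the function only builds a coroutine object; the body has not run.
coro = test1()
ran_before = list(ran)
print(inspect.iscoroutine(coro), ran_before)

# Something has to drive the coroutine (here asyncio.run) for the body to execute.
asyncio.run(coro)
print(ran)
```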

Now, if we use asyncio.async to wrap a Task instance around test1, the result is different:

import asyncio

@asyncio.coroutine
def test1():
    print("in test1")


@asyncio.coroutine
def dummy():
    yield from asyncio.sleep(1)
    print("dummy ran")


@asyncio.coroutine
def main():
    asyncio.async(test1())
    yield from dummy()

loop = asyncio.get_event_loop()
loop.run_until_complete(main())

Output:

in test1
dummy ran

So, there's really no practical reason for using yield from asyncio.async(coro()), since it's slower than yield from coro() without any benefit; it introduces the overhead of adding coro to the internal asyncio scheduler, but that's not needed, since using yield from guarantees that coro is going to execute, anyway. If you just want to call a coroutine and wait for it to finish, just yield from the coroutine directly.

Side note:

I'm using asyncio.async* instead of Task directly because the docs recommend it:

Don’t directly create Task instances: use the async() function or the BaseEventLoop.create_task() method.

* Note that as of Python 3.4.4, asyncio.async is deprecated in favor of asyncio.ensure_future.
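For readers on current Python versions, where `async` is a reserved keyword (since 3.7) and `asyncio.async(...)` can no longer be written, the second example might look roughly like the sketch below. It assumes `async def`/`await` and `asyncio.run`; collecting messages in a `log` list instead of printing is an illustrative change so the order can be inspected.

```python
import asyncio

log = []


async def test1():
    log.append("in test1")


async def dummy():
    await asyncio.sleep(0.1)
    log.append("dummy ran")


async def main():
    # Schedule test1 as a task; keep a reference so it is not garbage-collected.
    task = asyncio.ensure_future(test1())
    await dummy()  # while main waits here, the task gets to run
    log.append("task done: %s" % task.done())


asyncio.run(main())
print(log)
```

`asyncio.create_task(test1())` would work the same way here and is the usual choice when the argument is known to be a coroutine.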

dano
3

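As described in PEP 380, the accepted PEP that introduced yield from, the expression res = yield from f() generalizes the following loop. The loop forwards each value produced by f() to whoever is driving the outer generator; yield from additionally passes send() and throw() through to f() and evaluates to f()'s return value, which is what ends up in res:

for item in f():
    yield item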

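With this, things become clear: if f() is some_coroutine(), the coroutine's body is executed inline, as part of the calling coroutine. On the other hand, if f() is Task(some_coroutine()), Task.__init__ is executed instead. Calling some_coroutine() only creates a generator object, which is passed as the first argument to Task.__init__; its body is not run at that point. The task schedules it on the event loop, and yield from then waits on the task rather than driving the generator directly.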

Conclusion:

  • res = yield from some_coroutine() => the coroutine runs inline in the caller, and res receives its return value
  • res = yield from Task(some_coroutine()) => a new task is created, which stores the not-yet-started some_coroutine() generator object and runs it via the event loop; res receives the task's result once it completes.
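The relationship between yield from and the plain loop can be tried with ordinary generators, no event loop involved. A minimal sketch; `inner`, `with_yield_from`, and `with_for_loop` are hypothetical names chosen for illustration:

```python
def inner():
    yield 1
    yield 2
    return "done"  # becomes the value of the `yield from` expression


def with_yield_from():
    res = yield from inner()  # forwards 1 and 2, then res == "done"
    yield "result: " + res


def with_for_loop():
    # The simple loop forwards the same values but discards the return value.
    for item in inner():
        yield item


print(list(with_yield_from()))
print(list(with_for_loop()))
```

Both generators forward 1 and 2, but only the yield from version can see the value returned by inner().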
filmor
hdante