Consider this simple code:

import asyncio

async def main():
    f = asyncio.Future()
    await f

asyncio.run(main())

A coroutine (here main) can await a Future object. It is basically blocked until f either has a result or an exception set, or until it is cancelled.
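For comparison, here is a minimal variation in which something else completes the future, so that main() does finish. It is only a sketch: the 0.05-second delay and the value "done" are arbitrary choices, and loop.call_later() stands in for whatever would normally produce the result.

```python
import asyncio

async def main():
    loop = asyncio.get_running_loop()
    f = loop.create_future()
    # Ask the loop to call f.set_result("done") roughly 0.05 s from now.
    loop.call_later(0.05, f.set_result, "done")
    # main() is suspended here until the future has a result.
    return await f

result = asyncio.run(main())
print(result)
```

Without the call_later() line, nothing would ever complete f and the program would wait forever, as in the original snippet.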

Out of curiosity, I wanted to know how this waiting occurs. I checked the Python implementation in tasks.py, specifically Task.__step().

In its simplest form, for the happy path where the returned result is a Future, it is:

.
.
result = coro.send(None)
.
.
blocking = getattr(result, '_asyncio_future_blocking', None)
.
.
if blocking:   # So `result` is a Future with `_asyncio_future_blocking == True`
    result._asyncio_future_blocking = False
    result.add_done_callback(self.__wakeup, context=self._context)
    self._fut_waiter = result
    if self._must_cancel:
        if self._fut_waiter.cancel(msg=self._cancel_message):
            self._must_cancel = False

I got all my other answers from this section (for example, what happens when the result is a bare yield, or when a CancelledError is raised), except this one!

So it sets result._asyncio_future_blocking to False and adds a callback to the Future. But __wakeup only gets called when the Future is done, and it is not done yet. I can't see any self._loop.call_soon(self.__step). All I can say is that somebody must be watching self._fut_waiter. I don't know the rest of the story.

A comment in the Task class seems to confirm that:

    # An important invariant maintained while a Task not done:
    #
    # - Either _fut_waiter is None, and _step() is scheduled;
    # - or _fut_waiter is some Future, and _step() is *not* scheduled.
    #
    # The only transition from the latter to the former is through
    # _wakeup().  When _fut_waiter is not None, one of its callbacks
    # must be _wakeup().
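A small experiment, written from the outside rather than taken from the asyncio source, suggests that a done callback is not run inside set_result() itself but is handed to the loop and run on a later iteration. The names events and on_done are hypothetical and exist only for this sketch.

```python
import asyncio

events = []

def on_done(fut):
    # Runs only when the event loop gets around to calling it.
    events.append(("callback", fut.result()))

async def main():
    loop = asyncio.get_running_loop()
    f = loop.create_future()
    f.add_done_callback(on_done)
    f.set_result(1)                   # marks f as done and schedules on_done
    events.append("after set_result")
    await asyncio.sleep(0)            # give the loop one iteration
    events.append("after sleep(0)")

asyncio.run(main())
print(events)
```

If the callback ran synchronously, its entry would appear before "after set_result"; here it appears only once the loop has had a turn.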

Is it somehow registered with the select function? I would appreciate it if someone could tell me where these futures get checked. How do these futures communicate with the event loop? Where is the waiting area?

I watched David Beazley's lecture on AsyncIO and I understood how his event loop works, but in the AsyncIO framework everything is a Future rather than a file descriptor or socket. So what is the Future's role?

S.B
  • I can follow up to this point: when a task T awaits a future F that is not done, T stops doing its steps and registers its wakeup as a callback that will be activated when F is done. But why do you think any other action is needed, e.g. a call_soon()? T is sleeping and F lives its own life. – VPfB Aug 02 '22 at 06:27
  • @VPfB Thanks for the comment. That is exactly what I want to know, *who* finds out that the Future is done? These yielded Futures should be checked in a place, in a waiting area. *where* is that place? If I want to tell the same for David's event loop, He uses tuples `(action, fd)`, These file descriptors/sockets are checked with a `select` call. Event loop waits for this call at some point and whenever `fd`s are ready it runs their callback(which may or may not add them back to the event loop). It's a bit hard for me to follow this pattern with Futures. – S.B Aug 02 '22 at 07:06
  • A Future is done when somebody sets its result (or exception). That action triggers callbacks, one of which will wake the task waiting on it. What action leads to this end differs for each use case. Examples: a task finishes, new data arrives, a timeout occurs. And the `select` or `poll` in the event loop is not directly related to Futures. – VPfB Aug 02 '22 at 09:28
  • A Future or Future-like object is a subclass of a handle and handles are frequently checked in the event-loop. Take a look at this [awesome video](https://www.youtube.com/watch?v=E7Yn5biBZ58)/[series](https://www.youtube.com/playlist?list=PLhNSoGM2ik6SIkVGXWBwerucXjgP1rHmB). Look at minute 22 in the first link. – Thingamabobs Aug 05 '22 at 02:29
  • @Thingamabobs [`Future`](https://github.com/python/cpython/blob/57446f9e3321aedad8c1f7398f8b64d7ec54f2e7/Lib/asyncio/futures.py#L30) doesn't subclass anything, so I'm not sure what you mean by your first sentence. Also, as written in [the docs](https://docs.python.org/3/library/asyncio-future.html), "*Future* objects are used to bridge **low-level callback-based code** with high-level async/await code". Futures use callbacks, so how is this related to the event loop? – a_guest Aug 05 '22 at 13:15
  • @a_guest thank you for correcting me here. I was under the impression, don't know how I came up with this, that every awaitable would be a handle. This is not the case and I should have refreshed my memory before sending out false and half-baked information. – Thingamabobs Aug 05 '22 at 14:39
  • Without Futures we would write code like this: new data is requested; when it arrives, call handler1; when a timeout occurs, call handler2. This approach is not manageable except in the most trivial cases. With await but still without Futures we would need to create a specialized awaitable for every possible async operation. With Futures we have a general mechanism. No matter what operation is performed, when it finishes, it will set either the result or the exception on the corresponding Future. This will trigger callbacks that will wake up all waiters. – VPfB Aug 05 '22 at 20:19
  • @VPfB Thanks for the explanation. I knew this. Theoretically it's clear to me why we have Futures. There is a gap in my understanding, and I cannot find the answer when I read the source code. Let's say we are awaiting a Future and the result is not ready yet. As far as I can tell, [here in base_events.py](https://github.com/python/cpython/blob/main/Lib/asyncio/base_events.py#L1894) module, `self._ready.popleft()` is called and the ready callbacks get run. They are the Task's `__step()` methods, right? When a future is received, where is it rescheduled on the event loop? – S.B Aug 05 '22 at 20:35
  • @VPfB They have to be checked over and over until the result (or exception) is set on them. For "bare yield" or when exception occurs, I can easily say [from here] that they are getting rescheduled. But if you check the snippet I provide in my question, I can not say what happens to the received future. – S.B Aug 05 '22 at 20:55
  • My only guess is it happens inside the [Handle._run()](https://github.com/python/cpython/blob/main/Lib/asyncio/events.py#L80) – S.B Aug 05 '22 at 20:56
  • @S.B Futures are not stored in some registry, not polled or checked. They are not related to the event loop. They just exist as objects and that's why you cannot find them in the source code. Take for example the program from your question. It will just wait forever until you add another async task that will eventually call `f.set_result`. That will trigger callbacks and wake the waiting task(s) as I wrote earlier. – VPfB Aug 06 '22 at 05:30
  • @VPfB I think your statement isn't entirely true, since there is at least a list of all pending tasks. You can find it in the [documentation](https://docs.python.domainunion.de/3/library/asyncio-task.html#asyncio.all_tasks). Though I wonder if this is just a feature and irrelevant to awaiting a Future object. What my answer below lacks is where the coroutines check whether the futures have a result. It still feels like a donut and the whole point isn't filled yet. – Thingamabobs Aug 06 '22 at 06:12
  • @Thingamabobs Well, `asyncio` is responsible for scheduling Tasks and simply must have corresponding data structures for that (all tasks, their state, etc.). While it is true that Tasks are derived from Futures, I was talking about the Futures and I hope I was correct that they are not collected in some set or list and they are not checked "over and over". A long time ago, before `asyncio` was added to the standard library, I wrote my own simple asyncio-like library for my project. – VPfB Aug 06 '22 at 07:25
  • @VPfB thanks for your reply. I don't doubt that you know more than me about this topic. I guess the missing part is the `await` syntax, and [this](https://stackoverflow.com/a/48261042) makes me feel it is resolved. But I still don't feel enlightened, because now I wonder where the actual `yield` at the end of the chain happens, which leads back to the original question. *where is the waiting area?* – Thingamabobs Aug 06 '22 at 07:55
  • @Thingamabobs I'm afraid we're slowly going off-topic. The `await` is indeed like `yield from`, it builds a chain of `yield from`s. At the start of this chain is the asyncio (`Task.__step`) and at the end of the chain must be a `yield` in a coroutine. This `yield` sends a Future through that chain to the asyncio core telling it "I can't continue until this Future is done, please schedule another task". – VPfB Aug 06 '22 at 08:05
  • @VPfB to continue your last sentence, these `__step` callbacks are wrapped inside `Handle` objects, and these Handle objects are stored inside a `_ready` queue. When the event loop wants to run them, it first *pops* them from the queue. When the result is not ready, they are in PENDING state. So someone here should check them frequently to see *when* they are ready. Can we just say ok it's not ready let's run another callback? Then what about that yielded Future after the result is ready? – S.B Aug 06 '22 at 08:16
  • @S.B. When a Future becomes "done", the state of tasks waiting for it changes and asyncio will start to schedule them again. Until that moment there is no need to check those not ready-to-run tasks nor the Futures they are waiting for. Again, when the event making the Future "done" happens, the callbacks will take care of that. I have no more information or comments to add. – VPfB Aug 06 '22 at 08:42
  • Suppose we have Task1, Task2. when Task1 yields a Future and it's not done, event loop goes and runs Task2. **while it's running Task2**, the Task1 becomes ready. But who realizes that it's ready? Python is executing Task2's body at that time and has no idea about that Future. AsyncIO is running in a *single thread*. Yes if it finds out that it's ready it will schedule Future's callback. But in order to find out that it's ready, those Futures have to be checked somehow/somewhere when a full cycle of the callbacks executed. – S.B Aug 06 '22 at 08:53
  • @S.B I still believe it is implemented in the C code. For example, I use Windows 11 and therefore the [ProactorEventLoop](https://docs.python.org/3/library/asyncio-eventloop.html#asyncio.ProactorEventLoop) and in the further MSDN documentation they state that they use [OVERLAPPED](https://learn.microsoft.com/en-us/windows/win32/api/minwinbase/ns-minwinbase-overlapped) structures which take in the `hEvent` and thus can be customized. We may be looking for something on (kernel?-)level nobody really works with. – Thingamabobs Aug 06 '22 at 09:19
  • @Thingamabobs Yes, I believe the same – S.B Aug 06 '22 at 09:27
  • @S.B Regarding your comment about a Future becoming ready while executing Task2: this cannot happen unless Task2 itself directly marks the Future as done. No other code is run until Task2 relinquishes the CPU in an `await`. Then the event loop processes pending I/O (descriptors) and pending events (`_ready` queue) and selects the next task to run from the set of ready-to-run tasks. If the mentioned processing of pending I/O and events had the effect of marking the Future as ready, Task1 would be moved to the set of ready-to-run tasks and now gets a chance to run on the CPU. – VPfB Aug 06 '22 at 12:47
  • @S.B I'm afraid your question was already answered [here](https://stackoverflow.com/a/59780868/13629335). And my answer had enough information but lacked understanding. – Thingamabobs Aug 06 '22 at 15:45

1 Answer

First of all, I don't feel qualified to answer your question, and I do hope someone with more knowledge will weigh in on this topic. But since there are no other answers yet, I'll try my best to give a useful answer to this interesting question.

A Future object is an awaitable object. Awaitables such as coroutines are mostly scheduled as Tasks, e.g. through asyncio.gather, wait_for or wait. The Future class, however, is the lower-level parent of the Task class.
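As a quick check of that relationship, the following sketch wraps a coroutine with asyncio.ensure_future() and inspects what comes back. The coroutine work() and the variable names are hypothetical and only serve the illustration.

```python
import asyncio

async def work():
    return 1

async def main():
    # ensure_future() wraps the coroutine in a Task and schedules it.
    task = asyncio.ensure_future(work())
    is_task = isinstance(task, asyncio.Task)
    is_future = isinstance(task, asyncio.Future)
    value = await task
    return is_task, is_future, value

is_task, is_future, value = asyncio.run(main())
task_subclasses_future = issubclass(asyncio.Task, asyncio.Future)
print(is_task, is_future, value, task_subclasses_future)
```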

The key differences between Task and Future are that:

Coroutines can await on Future objects until they either have a result or an exception set, or until they are cancelled.

and

Tasks are used to run coroutines in event loops.

The above should answer your question about the Future's role. Your next questions are a little harder to answer, but it's important to note that coroutines await Future objects.

To dig further, it may help to first understand what coroutines are. Coroutines are generator-based objects, and the yield from and yield syntax, together with the send and throw methods, makes all of this happen.

throw(), send() methods for coroutines are used to push values and raise errors into Future-like objects.

[Source]
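To make the role of send() concrete, here is a sketch that drives a coroutine by hand, without running any event loop. It assumes that awaiting a pending asyncio Future hands the future itself back to whoever called send(); the event loop object is created only so that a future can be made, and it is never started.

```python
import asyncio

loop = asyncio.new_event_loop()   # never started; only used to create f
f = loop.create_future()

async def main():
    return await f

coro = main()
# First send(): the coroutine runs until `await f` and hands f back to us.
yielded = coro.send(None)
yielded_is_f = yielded is f

# Somebody (here: we ourselves) completes the future.
f.set_result(42)

# Second send(): the coroutine resumes and finishes; its return value
# travels out inside StopIteration.
value = None
try:
    coro.send(None)
except StopIteration as stop:
    value = stop.value

loop.close()
print(yielded_is_f, value)
```

Between the two send() calls the coroutine is simply a suspended object; nothing polls it.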

But none of this answers your question about how these futures communicate with the event loop. In fact, there is no scheduler, and the communication is achieved by

using yield from future and yield from task.
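The following is a heavily simplified sketch of that hand-off, not the real asyncio.Task. MiniTask, step and wakeup are hypothetical names; the sketch ignores cancellation, exceptions and contexts, and it assumes that a future's done callbacks are handed to the loop when the result is set.

```python
import asyncio

class MiniTask:
    """Hypothetical, heavily simplified stand-in for asyncio.Task."""

    def __init__(self, coro, loop):
        self.coro = coro
        self.loop = loop
        self.steps = 0
        self.result = None
        loop.call_soon(self.step)      # the first step is scheduled on the loop

    def step(self):
        self.steps += 1
        try:
            yielded = self.coro.send(None)
        except StopIteration as stop:
            self.result = stop.value   # the coroutine has finished
            self.loop.stop()
            return
        # The coroutine is stuck on a pending future. step() is NOT
        # rescheduled; the future will call wakeup() once it is done.
        yielded._asyncio_future_blocking = False
        yielded.add_done_callback(self.wakeup)

    def wakeup(self, fut):
        self.step()

loop = asyncio.new_event_loop()
f = loop.create_future()

async def main():
    return await f

task = MiniTask(main(), loop)
loop.call_later(0.05, f.set_result, "ready")  # stands in for real I/O or a timer
loop.run_forever()
loop.close()
print(task.result, task.steps)
```

While f is pending, neither the task nor the future sits in the loop's ready queue. The only thing the loop knows about is the timer, and when it fires, set_result() schedules wakeup, which performs the second and final step.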

Sources I use/used for learning asyncio:

Thingamabobs
  • I would like to add [this thread](https://stackoverflow.com/q/49005651/13944524) to your sources. Literally awesome answers. btw I saw both of the videos. They are fantastic. – S.B Aug 05 '22 at 20:50
  • @S.B thanks for making me aware of it. I sure will take a closer look. I hope this leads you to a point where you find your actual answer. Or someone else can contribute a more sufficient answer. I believe, but did not find a hint for it, that the `yields` happen in the C implementation. But since I don't know C, it does not make any sense for me to take a look at it. – Thingamabobs Aug 05 '22 at 20:55