
Imagine I have a set of functions like this:

import asyncio
import time

def func1():
    func2()

def func2():
    time.sleep(1)  # simulate I/O operation
    print('done')

I want these to be usable synchronously:

# this would take two seconds to complete
func1()
func1()

as well as asynchronously, for example like this:

# this would take 1 second to complete
future = asyncio.gather(func1.run_async(), func1.run_async())
loop = asyncio.get_event_loop()
loop.run_until_complete(future)

The problem is, of course, that func1 somehow has to propagate the "context" it's running in (synchronously vs. asynchronously) to func2.

I want to avoid writing an asynchronous variant of each of my functions because that would result in a lot of duplicate code:

def func1():
    func2()

def func2():
    time.sleep(1)  # simulate I/O operation
    print('done')

# duplicate code below...
async def func1_async():
    await func2_async()

async def func2_async():
    await asyncio.sleep(1)  # simulate I/O operation
    print('done')

Is there any way to do this without having to implement an asynchronous copy of all my functions?

  • That depends on what the functions are actually doing, rather than on the dummy functions used for the purpose of asking the question. But usually, you'd call upon `multiprocessing` or `threading` to execute calls in parallel with your other code. And you could use them interchangeably, meaning you won't have to thread a function every time. Pretty basic use of threads/processes. – Torxed Nov 02 '18 at 21:27
  • @Torxed If it matters, the functions would be doing asynchronous HTTP requests. I *could* parallelize the whole thing with multithreading, but I'd really rather not. async has a number of advantages compared to multiprocessing and multithreading. The goal is to end up with good code, not to parallelize at all costs. – Aran-Fey Nov 02 '18 at 21:45
  • I don't know your use case, but rather than requiring `func1()` to always use `func2()`, could you give it some sort of logic in the input args to run `func2()` if given it (and/or by default), or use another static input if not given it? – G. Anderson Nov 02 '18 at 21:49
  • @G.Anderson I'm open to any solution that's better than duplicating my entire code base :) – Aran-Fey Nov 02 '18 at 21:54
  • In that case, one option is to create `func1` as `def func1(use_f2=True): if use_f2: x=func2() else: x=staticvariable` then if you want to use it independently, call it with `func1(False)` – G. Anderson Nov 02 '18 at 21:56
  • @G.Anderson Sorry, I don't follow. `func1` shouldn't use a static value; it should always call `func2` - either synchronously or asynchronously, depending on how `func1` was called. – Aran-Fey Nov 02 '18 at 22:04
  • It sounds odd to me that multiprocessing/threading is equated with bad code; it sounds like just the thing you would need. Unless you're purely on Linux, in which case you could make use of [from select import epoll](https://github.com/Torxed/Scripts/blob/master/python/epoll.py) and get async behavior while still writing sync code. But again, I'm not sure why multiprocessing is a bad choice here, since you can call an individual function as a process or not as a process, up to you per individual call. Not all calls would automatically be multiprocessed just because you include it in the mix. – Torxed Nov 02 '18 at 22:07
  • If you want `func1` to always depend on `func2` but also to be able to run `func1` without running `func2`, then I'm afraid I'm out of my depth. I think @Torxed is on the right track. – G. Anderson Nov 02 '18 at 22:15
  • @Torxed Let's not turn this into a discussion about whether async/await is better than multithreading. It's easy to write a `run_async` function that starts `func1` in a new thread; but I want to know if a similar thing is possible with `async`. – Aran-Fey Nov 02 '18 at 22:55
  • People will be reading these posts, and it's important especially for new programmers not to think multiprocessing is a taboo or considered bad code. Hence my note on the subject. And I'm starting to question what you mean by `async`. When I say `async`, I'm referring to the principle of executing functions independent of the main program flow. Not a library. Doing what you just described, creating a `run_async` is by definition the solution you described as *"possible with async"*, unless you're talking about a library, if so, which one are you talking about? – Torxed Nov 02 '18 at 23:17
  • @Torxed When I say `async` I mean python's `async` keyword, and when I say async (without the code formatting) I mean asynchronous execution (: – Aran-Fey Nov 02 '18 at 23:20
  • Is it important that the `async` nature of the API be *hidden* if it's not used? If you wrote all your code using `async` it would be easy to add synchronous entry points that would just use the event loop (or something) to run the async versions. You might even be able to use a decorator to generate synchronous wrappers if desired. – Daniel Pryden Nov 03 '18 at 00:05
  • @DanielPryden I'm not entirely sure if I understand your question, but I would prefer if calling `func1()` would run it synchronously rather than asynchronously. If I have to define all my functions with `async def`, that's fine. If I have to wrap a decorator around each function, that's also fine. But I would like the interface to be as simple as possible, so it would be nice if the functions would act like normal synchronous functions when they're called "normally". – Aran-Fey Nov 03 '18 at 00:10
  • @Torxed: `multiprocessing` isn't intrinsically bad, but it is relatively fundamentally broken on some platforms (e.g. recent versions of macOS). `threading` isn't intrinsically bad, but as a programming paradigm, threads are difficult to get right, and CPython doesn't handle multiple threads optimally. Asynchrony isn't the same thing as concurrency, and for operations that are latency-sensitive but IO-bound, asynchronous operations can outperform naively concurrent ones, especially by reducing the CPU and memory overhead of each operation. The `async` keyword was added for a good reason! – Daniel Pryden Nov 03 '18 at 00:12
  • As a small footnote I'd like to apologize, I completely missed the `async` keyword in `async def func1_async()`. I must have been tired or too much in a hurry. And those are valid concerns, @DanielPryden, and a thorough explanation of the benefits and downsides of each of the options. – Torxed Nov 03 '18 at 07:42
  • @Torxed The tired one was actually me, I forgot to add the `async` keyword in [the initial revision](https://stackoverflow.com/revisions/53125982/1). Sorry about that :( – Aran-Fey Nov 03 '18 at 08:09

2 Answers


Here's my "not-an-answer-answer," which I know that Stack Overflow loves...

Is there any way to do this without having to implement an asynchronous copy of all my functions?

I don't think that there is. Making a "blanket translator" to convert functions to native coroutines seems next-to-impossible. That's because making a synchronous function asynchronous is about more than throwing an async keyword in front of it and a couple of await statements within it. Keep in mind that anything that you await must be awaitable.

Your def func2(): time.sleep(1) illustrates that point. Synchronous functions make blocking calls, such as time.sleep(); asynchronous functions (native coroutines) await non-blocking coroutines. Making this function asynchronous, as you point out, requires not just using async def, but also awaiting asyncio.sleep() instead. Now let's say that instead of time.sleep() you're calling a more complex blocking function, and you build some sort of fancy decorator that slaps a callable attribute named run_async onto the decorated function. How does that decorator know how to "translate" the blocking calls within func2() into their coroutine equivalents, if those equivalents are even defined? I can't think of any magic that would be smart enough to convert all of the calls in a synchronous function to their awaitable counterparts.
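
To make that concrete, here's a minimal sketch (the name fake_async_func2 is made up for illustration): merely declaring a blocking function with async def doesn't make it cooperative, because time.sleep() never yields control back to the event loop, so two "concurrent" calls still take about two seconds in total.

import asyncio
import time

async def fake_async_func2():
    time.sleep(1)   # blocking call; the event loop is stuck here
    print('done')

# gather() schedules both coroutines, but they still run back-to-back
loop = asyncio.get_event_loop()
loop.run_until_complete(asyncio.gather(fake_async_func2(), fake_async_func2()))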

In your comments, you mention that this is for HTTP requests. For a real-world example, consider the differences in call signatures and APIs between the requests and aiohttp packages. In aiohttp, .text() is an instance method that you await; in requests, .text is a property. How could you build something smart enough to know about differences like that?
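
A rough sketch of that difference, assuming typical usage of the two libraries (the fetch function and the URL are just placeholders):

# requests (synchronous): .text is a property on the response
import requests
body = requests.get('https://example.com').text

# aiohttp (asynchronous): .text() is a coroutine method that must be awaited
import aiohttp

async def fetch(url):
    async with aiohttp.ClientSession() as session:
        async with session.get(url) as response:
            return await response.text()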

I don't mean to be discouraging, but I think that using threading would be more realistic.
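
For what it's worth, here's a rough sketch of that thread-based route; the run_in_thread helper is made up for illustration and simply runs the untouched synchronous func1 in a background thread:

import threading

def run_in_thread(func, *args, **kwargs):
    # start the blocking function in a background thread and hand the thread back
    thread = threading.Thread(target=func, args=args, kwargs=kwargs)
    thread.start()
    return thread

# the two calls overlap, so this takes roughly 1 second instead of 2
t1 = run_in_thread(func1)
t2 = run_in_thread(func1)
t1.join()
t2.join()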

  • It's very true that automagically making a synchronous function asynchronous is next to impossible, but what if we do it the other way round? If I define `func1` and `func2` with `async def`, surely there must be a way to turn them into synchronous functions? (Or at least make them *appear* synchronous to the caller?) – Aran-Fey Nov 03 '18 at 00:15
  • I won't say it's impossible. A coroutine is just a [repurposed generator](https://docs.python.org/3/library/asyncio-task.html#generator-based-coroutines); perhaps that property would be workable for simple cases. But again, it's not just a syntactic difference; it's a behavioral one. I'd be interested to see the same thing if somehow it did exist, but nothing comes to mind immediately. – Brad Solomon Nov 03 '18 at 00:18
  • Good explanation. I would just like to leave the [select](https://docs.python.org/3/library/select.html) library here. Although it's not a generic solution to the question, which was posed as an overall question about blocking operations, it might solve the very specific problem of network calls being a blocking operation. Other than that, great descriptive answer. – Torxed Nov 03 '18 at 07:45

So I found a way to achieve this, but since this is literally the first time I've done anything with async, I can't guarantee that this doesn't have any bugs or that it's not a terrible idea.

The concept is actually pretty simple: define your functions as normal asynchronous functions using async def and await where necessary, then wrap them in an object that runs the coroutine to completion when no event loop is running (the synchronous case) and schedules it on the running loop otherwise. Proof of concept:

import asyncio
import functools
import time


class Hybrid:
    def __init__(self, func):
        self._func = func

        functools.update_wrapper(self, func)

    def __call__(self, *args, **kwargs):
        coro = self._func(*args, **kwargs)

        loop = asyncio.get_event_loop()

        if loop.is_running():
            # if the loop is running, we must've been called from a
            # coroutine - so we'll return a future
            return loop.create_task(coro)
        else:
            # if the loop isn't running, we must've been called synchronously,
            # so we'll start the loop and let it execute the coroutine
            return loop.run_until_complete(coro)

    def run_async(self, *args, **kwargs):
        return self._func(*args, **kwargs)


@Hybrid
async def func1():
    await func2()

@Hybrid
async def func2():
    await asyncio.sleep(0.1)


def twice_sync():
    func1()
    func1()

def twice_async():
    future = asyncio.gather(func1.run_async(), func1.run_async())
    loop = asyncio.get_event_loop()
    loop.run_until_complete(future)


for func in [twice_sync, twice_async]:
    start = time.time()
    func()
    end = time.time()
    print('{:>11}: {} sec'.format(func.__name__, end-start))

# output:
#  twice_sync: 0.20142340660095215 sec
# twice_async: 0.10088586807250977 sec

However, this approach does have its limitations. If a plain synchronous function calls a hybrid function, then calling that synchronous function from inside a coroutine changes its behavior:

@Hybrid
async def hybrid_function():
    return "Success!"

def sync_function():
    print('hybrid returned:', hybrid_function())

async def async_function():
    sync_function()

sync_function()  # this prints "Success!" as expected

loop = asyncio.get_event_loop()
loop.run_until_complete(async_function())  # but this prints a Task object

Take care to account for this!
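
One way to avoid the surprise (just a sketch, not a complete fix): when you know you're inside a coroutine, skip the synchronous facade and go through run_async() explicitly, so you always await the coroutine instead of accidentally holding a Task.

async def async_function():
    # explicit: we know we're in async context, so await the coroutine directly
    result = await hybrid_function.run_async()
    print('hybrid returned:', result)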

  • I'm not the downvoter here, and I think the wrapper + `__call__` is a cool idea, but this isn't going to do what you are hoping, I don't think – Brad Solomon Nov 03 '18 at 23:14
  • @BradSolomon Could you elaborate? Judging from the timings, it seems to be doing exactly what I want. – Aran-Fey Nov 03 '18 at 23:17
  • Again at the risk of sounding purely critical (I am actually intrigued by this question and attempt): I think you may be confusing asynchronicity with concurrency. `loop.run_until_complete(coro)` in the "synchronous" `__call__` is running a coroutine, not a native function. It's the same reason that `async for` does *not* schedule concurrent execution, it just makes `for` work with coroutines – Brad Solomon Nov 03 '18 at 23:50
  • In other words, calling `loop.run_until_complete(coro); loop.run_until_complete(coro)` may double the time to completion and therefore give you the "look and feel" of synchronous code, but it is still asynchronous by definition – Brad Solomon Nov 03 '18 at 23:51
  • I also think you may have trouble emulating this design with more complex coroutines that involve `aiohttp`, but I would encourage you to give it a shot and also to seek critique from the people on SO who know a ton more about asyncio and the async IO model than I do, because I'm still in the fairly beginning stages with it relatively speaking. – Brad Solomon Nov 03 '18 at 23:52
  • Here's a demonstration of my first point: https://gist.github.com/bsolomon1124/d9320c6b7cf9c8ab0dc3abb23a7541a9. Notice the time difference between `bar()` and `foobar()`. A doubling of time does not imply that the coroutines were somehow converted to "native" `def` functions. So, what's happening is that while remaining async in nature, the coroutines are no longer scheduled to run concurrently. Coroutines (generators) can suspend execution while maintaining state; functions can't. I don't know if you can overcome that fundamental difference. – Brad Solomon Nov 04 '18 at 00:01
  • @BradSolomon Thanks for the explanation. Being a total `asyncio` newbie, I'm not confident I understand everything you said, but I do think that the solution I came up with works. I understand that the `__call__` wrapper is only a facade and everything is still asynchronous under the hood, but I don't see a problem with that. All I wanted was to have an interface that behaves like any other synchronous function would, so that users who aren't familiar with async programming can still use my functions with no problems. – Aran-Fey Nov 04 '18 at 00:18
  • Okay, you make a fair point--what you've been able to do here is provide an "API-like" way, via a wrapper, to await a coroutine without `await`. I think that's cool, but the only thing I would point out is that `async`/`await` were introduced for a reason: to be explicit. You can write a coroutine as a "generator-based coroutine" without them, but that made it ambiguous what was a coroutine in the first place. I can't say what side effects you might see with your approach, but it's something to be wary of. – Brad Solomon Nov 04 '18 at 00:20