0

I am trying to create a decorator for an async function that restores cwd. I am using this as a reference: How do I write a decorator that restores the cwd?

This is what I came up with. It's not preserving the cwd, any pointers?

import os
import asyncio

def preserve_dir(function):
    async def decorator(*args, **kwargs):
        cwd = os.getcwd()
        result = function(*args, **kwargs)
        os.chdir(cwd)
        return await result
    return decorator


@preserve_dir
async def ordinary():
    os.chdir("/tmp")
    print(os.getcwd())


print(os.getcwd())
asyncio.run(ordinary())
print(os.getcwd())
tomarv2
  • 753
  • 3
  • 14
  • The working directory is global state. Trying to change that in an async program is dangerous, just like with threading. What would you expect to happen if two coroutines both used your `preserve_dir` decorator? – user2357112 Oct 16 '20 at 19:22
  • Suppose your program starts in directory `dir1`. Coroutine 1 starts and changes the directory to `dir2`, then coroutine 2 starts and changes the directory to `dir3`, then coroutine 1 finishes, then coroutine 2 finishes. What's the working directory after coroutine 1 finishes? What's the working directory at the end? – user2357112 Oct 16 '20 at 19:24
  • In all cases, I want to go to a fixed directory after the downstream code completes irrelevant of status. The preserve dir is a fixed path in the larger code base. – tomarv2 Oct 16 '20 at 19:27
  • @user2357112supportsMonica With the code from my answer, the working directory will switch on resumption and suspension of either coroutine. coroutine1 will consistently observe `dir2`, coroutine2 will observe `dir3`, and the rest of asyncio will only ever observe `dir1` as the current working directory. The working directory at the end will be `dir1`. – user4815162342 Oct 17 '20 at 16:53
  • @user4815162342: The problem with that approach is that it interacts confusingly with subcoroutines. If a coroutine decorated with `preserve_dir` does `await subcoro()`, then `subcoro` will see the new directory, but if it does `await asyncio.gather(subcoro1(), subcoro2())`, then `subcoro1` and `subcoro2` will see the original directory. The results depend on whether the coroutine iterates over a subcoroutine directly or schedules it on the event loop, which is hard to control, and it's something you usually don't need to think about. – user2357112 Oct 18 '20 at 07:05
  • @user2357112supportsMonica Good point, I've now mentioned that in the answer. – user4815162342 Oct 18 '20 at 08:05

2 Answers2

3

As others have pointed out, you need to await the decorated function before restoring the working directory, as invoking an async function doesn't execute it.

As others have also pointed out, doing this correctly is much harder than it seems because a coroutine can suspend to the event loop while running and, while it is suspended, a different coroutine can change the directory using the same decorator. With a simple-minded implementation of the decorator, resuming the original coroutine would break it because the working directory would no longer be the one it expects. Ideally you'd avoid this issue by structuring your code so it doesn't rely on the current working directory. But technically, it is possible to implement a correct directory-preserving decorator, it just takes additional effort. While I don't recommend that you do this in production, if you're curious how to do it, read on.

This answer shows how to apply a context manager every time a coroutine is resumed. The idea is to create a coroutine wrapper, an awaitable whose __await__ invokes the __await__ of the original coroutine. Normally this would use yield from, but our wrapper doesn't do so, and instead emulates it with a hand-written loop that uses send() to resume the inner awaitable's iterator. This provides control over every suspension and resumption of the inner awaitable, which is utilized to apply the context manager on every resumption. Note that this requires a reusable context manager, one that can be entered more than once.

To implement the decorator we will need a reusable directory-preserving context manager that not only restores the previous working directory in __exit__, but also re-applies it in the next __enter__. The former restores the old working directory whenever the coroutine is suspended (or when it returns), and the latter restores the new working directory whenever the coroutine is resumed. The decorator will just pass this context manager to the coroutine wrapper:

# copy CoroWrapper from https://stackoverflow.com/a/56079900/1600898

# context manager preserving the current directory
# can be re-entered multiple times
class PreserveDir:
    def __init__(self):
        self.inner_dir = None

    def __enter__(self):
        self.outer_dir = os.getcwd()
        if self.inner_dir is not None:
            os.chdir(self.inner_dir)

    def __exit__(self, *exc_info):
        self.inner_dir = os.getcwd()
        os.chdir(self.outer_dir)

def preserve_dir(fn):
    async def wrapped(*args, **kwds):
        return await CoroWrapper(fn(*args, **kwds), PreserveDir())
    return wrapped

This setup passes not only your original test, but a more involved test that spawns multiple concurrent coroutines that use the same decorator to different directories. For example:

@preserve_dir
async def ordinary1():
    os.chdir("/tmp")
    print('ordinary1', os.getcwd())
    await asyncio.sleep(1)
    print('ordinary1', os.getcwd())

@preserve_dir
async def ordinary2():
    os.chdir("/")
    print('ordinary2', os.getcwd())
    await asyncio.sleep(0.5)
    print('ordinary2', os.getcwd())
    await asyncio.sleep(0.5)
    print('ordinary2', os.getcwd())

async def main():
    await asyncio.gather(ordinary1(), ordinary2())

print(os.getcwd())
asyncio.run(main())
print(os.getcwd())

output:

/home/user4815162342
ordinary1 /tmp
ordinary2 /
ordinary2 /
ordinary1 /tmp
ordinary2 /
/home/user4815162342

A caveat with this approach is that the directory preservation is tied to the current task. So if you delegate execution to a sub-coroutine, it will observe the modified directory if it's just awaited, but not if it's awaited using await asyncio.gather(coro1(), coro2()).

user4815162342
  • 141,790
  • 18
  • 296
  • 355
1

When your decorator does result = function(*args, **kwargs), it's not getting a result. The wrapped function just returns a coroutine object - it doesn't actually run the function body until you await the coroutine. By restoring the working directory before await, you're restoring the working directory too early.

The whole concept of a "restore this global state at the end" decorator is a recipe for disaster in a concurrent program, though. If two coroutines both use your decorator and interleave their execution, the final directory will depend on how the coroutines are scheduled. The final directory may not be the one you wanted, and one coroutine may have the working directory changed out from under it.

Instead of using a decorator, consider writing your code to be independent of the working directory, or if code needs to set the working directory, run that code in a worker process, so it gets its own working directory.

user2357112
  • 260,549
  • 28
  • 431
  • 505