I have exactly the same situation as yours. In my case, I am running multiple git fetch
commands in several repo directories concurrently.
In the first trial, the code looked like this (cmds is ['git', 'fetch']):
import asyncio
from typing import List

async def run_async(path: str, cmds: List[str]):
    # The child inherits the parent's stdout, so its output
    # goes straight to the terminal.
    process = await asyncio.create_subprocess_exec(*cmds, cwd=path)
    await process.wait()
This function works on one repo, and the caller creates tasks for multiple repos and runs an event loop
to complete them.
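A minimal caller might look like the sketch below. The repo paths are placeholders, and a trivial Python one-liner stands in for `git fetch` so the example is self-contained:

```python
import asyncio
import sys
from typing import List

async def run_async(path: str, cmds: List[str]) -> int:
    # Spawn the subprocess in `path`; inherited stdout goes straight
    # to the terminal, which is why concurrent output can interleave.
    process = await asyncio.create_subprocess_exec(*cmds, cwd=path)
    await process.wait()
    return process.returncode

async def main() -> List[int]:
    # One task per directory; in the real tool `cmds` would be
    # ['git', 'fetch'] and the paths would be repo directories.
    cmds = [sys.executable, '-c', 'print("done")']
    return await asyncio.gather(run_async('.', cmds), run_async('.', cmds))

print(asyncio.run(main()))
```

`asyncio.gather` schedules all the coroutines on the event loop and waits for them to finish, which is what lets the fetches overlap in the first place.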
Although the program runs and the outcome on disk is correct, the fetch
outputs from different repos are interleaved. The reason is that each child process inherits the parent's stdout and writes to it directly, and the children run concurrently: await process.wait()
hands control back to the event loop whenever IO blocks (file, network, etc), so nothing serializes the output.
A simple change fixes it:
import asyncio
from typing import List

async def run_async(path: str, cmds: List[str]):
    """
    Run `cmds` asynchronously in the `path` directory.
    """
    # Capture stdout instead of letting the child write to the terminal.
    process = await asyncio.create_subprocess_exec(
        *cmds, stdout=asyncio.subprocess.PIPE, cwd=path)
    stdout, _ = await process.communicate()
    if stdout:
        print(stdout.decode())
The rationale here is to redirect stdout
so that each repo's output arrives in one piece. In my case, I simply print it out. If you need the output, you can return it at the end instead.
Also, the printing order may not match the start order, which is fine in my case.
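If you want to collect the output rather than print it, the same shape works with a return value. This is a sketch with a stand-in command so it runs anywhere; in the real tool `cmds` would be ['git', 'fetch'] and `path` a repo directory:

```python
import asyncio
import sys
from typing import List

async def run_async(path: str, cmds: List[str]) -> str:
    """Run `cmds` in the `path` directory and return its decoded stdout."""
    process = await asyncio.create_subprocess_exec(
        *cmds, stdout=asyncio.subprocess.PIPE, cwd=path)
    stdout, _ = await process.communicate()
    return stdout.decode()

async def main() -> List[str]:
    # Trivial stand-in command; gather preserves argument order,
    # so results[0] belongs to the first task regardless of which
    # subprocess finished first.
    cmds = [sys.executable, '-c', 'print("hello")']
    return await asyncio.gather(run_async('.', cmds), run_async('.', cmds))

print(asyncio.run(main()))
```

Because `asyncio.gather` returns results in the order the coroutines were passed in, collecting output this way also sidesteps the out-of-order printing mentioned above.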
The source code is here on github. For context, that project is a command-line tool to manage multiple git repos; it delegates git command execution from any working directory. It has fewer than 200 lines of code and should be an easy read.