Suppose I have two python scripts methods.py and driver.py. methods.py has all the methods defined in it, and driver.py does the required job when I run it.

Let's say I am in a main directory with the two files driver.py and methods.py, and I have n subdirectories in the main directory named subdir1,subdir2,...,subdirn. All of these subdirectories have files which act as inputs to driver.py.

What I want to do is run driver.py in all these subdirectories and get my output from them, without copying driver.py into each one.

How should I go about this? At the moment, I am using the subprocess module to

  1. Copy driver.py and methods.py to the subdirectories.
  2. Run them.

The copying part is simple:

import subprocess

# n is the number of subdirectories, which are named subdir1 ... subdirn
for i in range(1, n + 1):
  cmd = "cp methods.py driver.py subdir" + str(i)
  p = subprocess.Popen(cmd, shell=True)
  p.wait()

# once every subdirectory has driver.py and methods.py, start running these codes

for i in range(1, n + 1):
  cmd = "cd subdir" + str(i) + " && python driver.py"
  p = subprocess.Popen(cmd, shell=True)
  p.wait()

Is there a way to do the above without using up disk space?

megamence
  • Why not modify driver.py to accept a path to a directory? That way, you can loop over the subdirs and do `python driver.py /path/to/subdir{i}`. You also probably need to make sure driver.py code can work from any input directory. – Gino Mempin Jan 13 '21 at 03:54
  • @GinoMempin, oh make it accept a whole directory as input? i have not actually done something like this, but i can find something to learn how to do that. thank you! – megamence Jan 13 '21 at 03:59

2 Answers

You might use Python's os.chdir() function to change the current working directory:

import os
#import methods

root = os.getcwd()

for d in ['subdir1', 'subdir2']:
  os.chdir(os.path.join(root, d))
  print("dir:", os.getcwd())
  exec(open("../driver.py").read())

I am also not sure you need Popen, since Python can execute Python files using the exec function. In this case it depends on how you import methods.py. Do you simply import it, or do you use it some other way inside driver.py?
You could try importing it at the top level of your main script, or use a relative path like:

exec(open("../methods.py").read())

inside your driver script. Keep in mind that none of these solutions is very elegant. The best approach would be to process the path inside driver.py, as suggested by Gino Mempin. You could call os.chdir() from there.
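As a rough sketch of the same idea, here is a variant that swaps exec(open(...).read()) for the standard library's runpy.run_path and puts the main directory on sys.path so that a plain "import methods" inside driver.py still works after the directory change. The function name run_in_subdirs is made up for illustration, and the sketch assumes driver.py reads its inputs relative to the current working directory.

```python
import os
import runpy
import sys
from pathlib import Path


def run_in_subdirs(main_dir, subdir_names, script_name="driver.py"):
    """Run main_dir/script_name once per subdirectory, using that subdirectory as the cwd.

    run_in_subdirs is a hypothetical helper name, not part of any library.
    """
    main_dir = Path(main_dir).resolve()
    script = main_dir / script_name
    original_cwd = Path.cwd()
    # Let `import methods` inside the script resolve to main_dir/methods.py
    sys.path.insert(0, str(main_dir))
    try:
        for name in subdir_names:
            os.chdir(main_dir / name)
            # run_name="__main__" makes an `if __name__ == "__main__":` guard fire
            runpy.run_path(str(script), run_name="__main__")
    finally:
        # Restore the process state even if the script raises
        os.chdir(original_cwd)
        sys.path.remove(str(main_dir))
```

Calling run_in_subdirs(Path.cwd(), ["subdir1", "subdir2"]) from the main directory would run the single copy of driver.py in each subdirectory. Everything runs in one interpreter, so any global state driver.py changes carries over from one run to the next.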

areop-enap
  • thanks for your answer @areop-enap! i only import methods.py. Does os.chdir(os.path.join(root, d)) change my directory to the subdir1, and exec(open(...).read()) run driver.py INSIDE subdir1? – megamence Jan 13 '21 at 05:49
  • Yes, I tested it with a test.py containing only the function call open('test.txt','wt').write('test') to ensure it is working. The problem with this is: you need methods.py somehow included, and since you change the working directory to your subfolders you can't access it with a simple import. You have to do sys.path.append("[absolute path to the directory containing methods.py]") before you do 'import methods' inside of your driver.py. You could try importing it before you execute driver.py but I am not sure if this is working. Your methods.py needs to support that somehow. – areop-enap Jan 13 '21 at 09:06

Expanding on my initial comment, instead of copying driver.py everywhere, you can just make it accept a path to the subdirectory as a command-line argument. Then, you'll have to also make sure it can do whatever it's supposed to do from any directory. This means taking into account correct paths to files.

There are a number of ways to accept command-line args (see How to read/process command line arguments?). To make it simple, let's just use sys.argv to get the path.

Here's a modified driver.py

import sys
from pathlib import Path

# Receive the path to the target subdir as command line args
try:
    subdir_path = Path(sys.argv[1]).resolve()
except Exception as e:
    print('ERROR resolving path to subdir')
    print(e)
    sys.exit(1)

# Do stuff, taking into account full path to subdir
print(f'Running driver on {subdir_path}')
with open(subdir_path.joinpath('input.txt'), 'r') as f:
    data = f.read()
    print(f'Got data = {data}')

Let's say that, after getting the path, driver.py expects to read a file (input.txt) from each subdirectory. So here you need to get the absolute path to the subdirectory (.resolve()) and use that when accessing input.txt (.joinpath()). Basically, assume that driver.py always runs from the main dir.
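If you want something a bit stricter than sys.argv, here is a minimal sketch using argparse from the standard library. The function name parse_subdir is hypothetical. Unlike the try/except above, it also rejects a path that does not point to an existing directory, since .resolve() on its own does not check that the path exists.

```python
import argparse
from pathlib import Path


def parse_subdir(argv=None):
    """Return the absolute path of the subdirectory named on the command line.

    parse_subdir is a hypothetical helper name; argv=None means "use sys.argv[1:]".
    """
    parser = argparse.ArgumentParser(description="Run the driver on one subdirectory.")
    parser.add_argument("subdir", type=Path, help="path to the target subdirectory")
    args = parser.parse_args(argv)
    subdir_path = args.subdir.resolve()
    if not subdir_path.is_dir():
        # parser.error prints the message plus usage text and exits with status 2
        parser.error(f"not a directory: {subdir_path}")
    return subdir_path
```

In driver.py you would then write subdir_path = parse_subdir() and keep the rest of the script as it is. A missing argument also produces a usage message instead of an IndexError.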

Sample usage would be:

main$ tree
.
├── driver.py
├── method.py
├── subdir1
│   └── input.txt
├── subdir2
│   └── input.txt
└── subdir3
    └── input.txt

main$ python driver.py /temp/main/subdir3
Running driver on /temp/main/subdir3
Got data = 3333

Now, in method.py, you no longer need the "copy driver.py" code. Just loop through all the subdir{n} folders under main, and pass the full path of each one to driver.py. You can still use the same Popen(...) code to call driver.py.

Here's a modified method.py:

import subprocess
from pathlib import Path

# Assume all subdirs are under the current directory
parent_path = Path.cwd()

for child_path in parent_path.iterdir():
    # Skip files/folders not named 'subdir{n}'
    if not child_path.is_dir() or 'subdir' not in child_path.stem:
        continue

    # iterdir returns full paths (cwd + /subdir)
    print(f'Calling driver.py on {child_path}')

    cmd = f'python driver.py {child_path}'
    p = subprocess.Popen(cmd, shell=True)
    p.wait()

Sample run:

main$ python method.py 
Calling driver.py on /temp/main/subdir3
Running driver on /temp/main/subdir3
Got data = 3333

Calling driver.py on /temp/main/subdir2
Running driver on /temp/main/subdir2
Got data = 2222

Calling driver.py on /temp/main/subdir1
Running driver on /temp/main/subdir1
Got data = 1111

Notice also that all execution is done from main.

Some notes:

  • Instead of a hardcoded range(n), you can use iterdir (or os.listdir) to get all the files/folders under a base path. iterdir conveniently yields full paths (the base path joined with each entry name), whereas os.listdir returns only the entry names, so you would have to join them with the base path yourself. Neither guarantees any particular order, so sort the results if you need to go through the subdirs in a specific order.
  • I prefer using pathlib over os.path because it offers more high-level abstractions to file/folder paths. If you want or need to use os.path, there is this table of os.path - pathlib equivalents.
  • I used f-strings here (see What is print(f"...")), which are only available in Python 3.6+. If you are on a lower version, just replace them with any of the other string formatting methods.
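If the order does matter, here is one possible sketch that sorts the subdirs by their numeric suffix (so subdir10 comes after subdir2) and calls driver.py through subprocess.run with an argument list instead of a shell string. The names numbered_subdirs and run_driver are made up for this example, and it assumes the folders are named exactly subdir followed by digits.

```python
import re
import subprocess
import sys
from pathlib import Path


def numbered_subdirs(parent, prefix="subdir"):
    """Return parent's '<prefix><N>' directories, sorted by the number N."""
    pattern = re.compile(rf"{re.escape(prefix)}(\d+)")
    found = []
    for child in Path(parent).iterdir():
        match = pattern.fullmatch(child.name)
        if child.is_dir() and match:
            found.append((int(match.group(1)), child))
    # Sort on the integer only, so 'subdir10' sorts after 'subdir2'
    return [child for _, child in sorted(found, key=lambda item: item[0])]


def run_driver(parent, driver="driver.py"):
    """Call the driver once per subdirectory, passing the subdir path as an argument."""
    parent = Path(parent).resolve()
    for child in numbered_subdirs(parent):
        # An argument list avoids shell quoting problems with spaces in paths;
        # sys.executable reuses the interpreter that is running this script
        subprocess.run([sys.executable, str(parent / driver), str(child)], check=True)
```

With check=True, a non-zero exit status from driver.py raises subprocess.CalledProcessError, so a failing subdir stops the loop instead of passing silently.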
Gino Mempin