Once again programming dos batch commands with Python

There's a great answer to the question "Python threading multiple bash subprocesses?"

I was pretty sure this would solve my problem, but I've tried both the Popen and the multiprocessing methods and neither works.

Here's my problem: I want to set a unique value for a Windows environment variable (like TMP) in each process, so that process 1 writes to folder 1 and process 2 writes to folder 2 - and in terms of environment variables, process 1 won't see what process 2 sees.

Here's my code, based on the answer above. Variant 1 uses the Windows `set VAR=abc` method. Variant 2 uses Python's `os.environ['TMP'] = abc` method, and shouldn't work anyway because TMP is read by Python before cmd sets it. Neither variant 1 nor 2 works. Variant 3 is a sanity check; it works, but doesn't solve my problem.

from subprocess import Popen
import os

commands = [
   # variant 1: does not work
    'set TMP=C:\\Temp\\1 &&  echo abc1 > %TMP%\\1.log', 'set TMP=C:\\Temp\\2 &&  echo abc2 > %TMP%\\2.log' 

   # variant 2: does not work
   # 'set TMP=C:\\Temp\\1 &&  echo abc1 > '+os.environ['TMP']+'\\1.log', 'set TMP=C:\\Temp\\2 &&  echo abc2 > '+os.environ['TMP']+'\\2.log'

   # variant 3: works, but does not set TMP environmental variable
   # 'echo abc1 > C:\\Temp\\1\\1.log', 'echo abc2 > C:\\Temp\\2\\2.log'
]

# run in parallel
processes = [Popen(cmd, shell=True) for cmd in commands]
# do other things here..
# wait for completion
for p in processes: p.wait()
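(For what it's worth, variant 1 can never work: cmd.exe expands `%TMP%` when it parses the whole line, before `set` runs, so the redirect still sees the parent's old value. The isolation has to come from handing each child its own environment. A minimal sketch of that idea - `sys.executable` is a portable stand-in for the real command, and the paths are illustrative:)

```python
import os
import subprocess
import sys

# Give each child its own copy of the environment with a distinct TMP value.
procs = []
for name in ("1", "2"):
    env = os.environ.copy()  # snapshot of the parent environment
    env["TMP"] = os.path.join("C:\\Temp" if os.name == "nt" else "/tmp", name)
    # The child just prints its own TMP to show the values are independent.
    p = subprocess.Popen(
        [sys.executable, "-c", "import os; print(os.environ['TMP'])"],
        env=env,
        stdout=subprocess.PIPE,
    )
    procs.append(p)

outputs = [p.communicate()[0].decode().strip() for p in procs]
print(outputs)  # each child saw its own TMP value
```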

Here's my code for the multiprocessing method (the Python variable `commands` is defined in the script above):

from functools import partial
from multiprocessing.dummy import Pool
from subprocess import call
import os

pool = Pool(2) # two concurrent commands at a time
for i, returncode in enumerate(pool.imap(partial(call, shell=True), commands)):
    if returncode != 0:
       print("%d command failed: %d" % (i, returncode))

(I've also tried `set TMP="C:\Temp\1"`, with double quotes around the folder.)

(Python 2.7.13, 64-bit, on Windows 10)
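(A side note on why the thread-pool variant can't isolate TMP either: `multiprocessing.dummy.Pool` is a thread pool, and all threads share the single `os.environ` of the parent process. A small sketch, with a hypothetical variable name, that makes the sharing visible:)

```python
import os
from multiprocessing.dummy import Pool  # thread pool: workers share os.environ

os.environ["DEMO_TMP"] = "parent"  # hypothetical variable for illustration

def worker(value):
    os.environ["DEMO_TMP"] = value  # mutates the ONE shared environment
    return os.environ["DEMO_TMP"]   # may even see another thread's write

pool = Pool(2)
results = pool.map(worker, ["a", "b"])
# The parent's environment was changed by the threads - whichever wrote last wins.
print(os.environ["DEMO_TMP"])
```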


And, unrelated to the answer I referenced, I've also tried this function:

import os
from subprocess import check_output

def make_tmp(tmp_path):

    os.environ['TMP'] = tmp_path
    dos = 'echo '+tmp_path+' > '+os.environ['TMP']+'\\output.log'
    check_output(dos, shell=True)


from multiprocessing.dummy import Pool as ThreadPool
pool = ThreadPool(2)

print os.environ['TMP']
path_array = ['\"C:\\Temp\\1\"', '\"C:\\Temp\\2\"']
results = pool.map(make_tmp, path_array)

This fails with the following traceback:

Der Prozess kann nicht auf die Datei zugreifen, da sie von einem anderen Prozess verwendet wird.*
Traceback (most recent call last):
  File "C:\Users\project11\Dropbox\project11\code_current\dev_folder\google_sheets\test_windows_variables.py", line 32, in <module>
    results = pool.map(make_tmp, path_array)
  File "C:\ProgramData\Anaconda2\lib\multiprocessing\pool.py", line 251, in map
    return self.map_async(func, iterable, chunksize).get()
  File "C:\ProgramData\Anaconda2\lib\multiprocessing\pool.py", line 567, in get
    raise self._value
subprocess.CalledProcessError: Command 'echo "C:\Temp\1" > "C:\Temp\2"\output.log' returned non-zero exit status 1

* The process cannot access the file because it is being used by another process.


I also tried another answer to the same question, with no luck.

  • You want to update your environment before making your call, and then use the updated environment in `Popen()` as seen in https://stackoverflow.com/questions/2231227/python-subprocess-popen-with-a-modified-environment – JohanL Sep 11 '17 at 18:29
  • thanks. I want to modify the os.environ['TMP'] in the process - so I think the answer you reference won't work – philshem Sep 11 '17 at 18:31
  • OK, why do you want to do that? To use the updated value in your original program? Because that cannot be done by the use of a subprocess. – JohanL Sep 11 '17 at 18:32
  • The full python script wraps around a Windows executable that writes a temp file `data.bin` to the TMP folder. I want to run multiple processes of this Windows executable, but they all reference the same TMP folder and `data.bin` file. That's why I want to update TMP for each process. – philshem Sep 11 '17 at 18:34
  • I still don't see why you cannot update the environment in your main script and then pass it to the different subprocess calls. You can give a separate copy to each of them, with different values of `TMP`. – JohanL Sep 11 '17 at 18:36
  • I want the processes/sessions to run in parallel. It would be great if you could provide an example! Thanks! – philshem Sep 11 '17 at 18:42
  • `Popen` takes an `env` argument that lets you specify the environment to use; don't use `set` in the command itself. – chepner Sep 11 '17 at 19:03
  • @chepner this works, do you want to add it as an answer so I can accept? Had to add PIPE and shell=True from here https://stackoverflow.com/a/36249753/2327328 – philshem Sep 11 '17 at 20:12
  • @philshem Well, that is what I tried to make you do as well, if you had looked more carefully at the answer I linked to. ;-) – JohanL Sep 11 '17 at 20:19
  • yes, that would have saved time. Thanks – philshem Sep 11 '17 at 20:22

2 Answers


Here's the working code, thanks to the comments from chepner and JohanL:

import subprocess, os

def make_tmp(tmp_path):

    my_env = os.environ.copy()
    my_env['TMP'] = tmp_path
    dos = 'echo '+tmp_path[-1]+' > '+my_env['TMP']+'\\output.log'  # tmp_path[-1] is the folder's last character, e.g. '1'
    # or run a bat script
    # dos = 'C:\\Temp\\launch.bat'
    subprocess.Popen(dos, env=my_env, stdout=subprocess.PIPE, shell=True)

from multiprocessing.dummy import Pool as ThreadPool 
pool = ThreadPool(2) 

path_array = ['C:\\Temp\\1', 'C:\\Temp\\2', 'C:\\Temp\\3']

results = pool.map(make_tmp, path_array)

Note that the folders in `path_array` must already exist.
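One refinement worth noting: `Popen` returns immediately, so `pool.map` can finish before the children have actually written anything. A sketch of the same pattern that blocks on each child via `communicate()` - here `sys.executable` stands in for the real Windows executable, and the paths are only passed as environment values, so nothing is written to disk:

```python
import os
import subprocess
import sys
from multiprocessing.dummy import Pool as ThreadPool

def make_tmp(tmp_path):
    my_env = os.environ.copy()
    my_env['TMP'] = tmp_path
    # Portable stand-in for the real command: the child prints its own TMP.
    dos = [sys.executable, "-c", "import os; print(os.environ['TMP'])"]
    proc = subprocess.Popen(dos, env=my_env, stdout=subprocess.PIPE)
    out, _ = proc.communicate()  # block until this child is done
    return proc.returncode, out.decode().strip()

pool = ThreadPool(2)
path_array = ['C:\\Temp\\1', 'C:\\Temp\\2', 'C:\\Temp\\3']
results = pool.map(make_tmp, path_array)
print(results)  # list of (returncode, TMP-as-seen-by-child) pairs
```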

  • This should not require `subprocess.PIPE`. If your use case is only to redirect the output of your command to a certain file, you do not need to use a specific environment at all, though. That is better done in Python itself (which, then, would require `subprocess.PIPE` but not `shell=True`). – JohanL Sep 12 '17 at 03:31

If your use case is only to capture the output to a file, you can let Python do that for you, by redirecting the command output to a `subprocess.PIPE` and then storing the data from Python. This has the advantage that you do not need `shell=True`, which saves you a possible vulnerability and an extra process creation. That could be written as:

import subprocess, os

def make_tmp(tmp_path):
    dos = ['echo', tmp_path[-1]]
    outfile = os.path.join(tmp_path, 'output.log')
    cmd_proc = subprocess.Popen(dos, stdout=subprocess.PIPE)
    with open(outfile, 'w') as f:
        f.write(cmd_proc.communicate()[0])

from multiprocessing.dummy import Pool as ThreadPool 
pool = ThreadPool(2) 

path_array = ['C:\\Temp\\1', 'C:\\Temp\\2', 'C:\\Temp\\3']

results = pool.map(make_tmp, path_array)

Caveat: I have not had the time to test above code, so there could be some minor error.
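Alternatively, the intermediate `PIPE` can be skipped entirely by handing the open file object to the child as its stdout, so the child writes straight into the log file. A sketch under the same assumptions - `sys.executable` running a trivial command stands in for the real program, and `tempfile` stands in for an existing `C:\Temp\1`:

```python
import os
import subprocess
import sys
import tempfile

def make_tmp(tmp_path):
    # Redirect the child's stdout straight into the log file;
    # no PIPE and no shell=True needed.
    outfile = os.path.join(tmp_path, "output.log")
    cmd = [sys.executable, "-c", "print('abc')"]  # stand-in for the real command
    with open(outfile, "w") as f:
        subprocess.call(cmd, stdout=f)
    return outfile

tmp_dir = tempfile.mkdtemp()        # stand-in for C:\Temp\1
log = make_tmp(tmp_dir)
print(open(log).read().strip())     # -> abc
```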

  • Thanks for the answer. Actually the echo is a test case; the full Python script wraps around a Windows executable that writes a temp file data.bin to the TMP folder. I want to run multiple processes of this Windows executable, but they all reference the same TMP folder and data.bin file. That's why I want to update TMP for each process. – philshem Sep 12 '17 at 06:47