
Please correct me if I am wrong here. The goal: copy a file by spawning a separate process, so that the actual copying does not block the process that calls it.

import subprocess

cmd = ['cp', '/Users/username/Pictures/2Gb_ImageFile.tif', '/Volume/HugeNetworkDrive/VerySlow/Network/Connection/Destination.tif']

def copyWithSubprocess(cmd):
    # Popen returns immediately; the copy runs in the spawned child process
    proc = subprocess.Popen(cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    return proc

copyWithSubprocess(cmd)
alphanumeric
  • What do you mean by "locking" the process? This will still wait for the child process to terminate. (Edit: the question has been edited; this comment made more sense when the "communicate" call was still included) – Jeremy Roman Feb 27 '14 at 20:20
  • What's the question? ... subprocess should not block until you call `communicate` or some other method that must wait ... – Joran Beasley Feb 27 '14 at 20:21
  • @JeremyRoman why would that wait for the subprocess to finish? (I'm almost positive it does not, and continues executing immediately after spawning) – Joran Beasley Feb 27 '14 at 20:22
  • @JoranBeasley: Question was edited between when I commented and when you saw it. :) – Jeremy Roman Feb 27 '14 at 20:22
  • ahhh gotcha ... I was really confused for a minute :P – Joran Beasley Feb 27 '14 at 20:23
  • You didn't state a problem. Is there a bug? An exception? Give us some more specifics – mdscruggs Feb 27 '14 at 21:55

2 Answers


The easiest way to handle complicated asynchronous processing in Python is the multiprocessing library, which was designed specifically to support such tasks and has an interface that closely parallels that of the threading module. (I have written code that can switch between multi-threading and multi-processing mostly by importing one library or the other, though this required fairly rigorous limits on which parts of each module were used.)

[Edit: removed spurious advice about threading and made my opening assertion less bombastic]
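
For the multi-file case raised in the comments, here is a minimal sketch of the multiprocessing approach; the copy_one helper, the pool size, and the file paths are placeholders for illustration, not code from the question:

import shutil
from multiprocessing import Pool

def copy_one(pair):
    src, dst = pair
    shutil.copy(src, dst)  # each copy runs in its own worker process
    return dst

if __name__ == '__main__':
    # hypothetical (source, destination) pairs; replace with real paths
    jobs = [('a.tif', '/net/a.tif'), ('b.tif', '/net/b.tif')]
    pool = Pool(processes=2)
    result = pool.map_async(copy_one, jobs)  # returns immediately
    # ... the main process is free to do other work here ...
    result.wait()  # block only when the copies must have finished

map_async hands the whole job list to the pool and returns an AsyncResult immediately, so the caller is never blocked while the copies run.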

holdenweb
  • It would be interesting to see how multiprocessing could be used with subprocess to copy multiple files. Would you post a simple example of multiprocessing used in conjunction with subprocess? I am posting simple code below that could be modified to make it multi-processing/multi-threaded. A simple example would be more than sufficient! – alphanumeric Feb 27 '14 at 23:16
  • `Popen` is "asynchronous". It is the way to call external processes in Python. – jfs Feb 27 '14 at 23:23
  • It is _one_ way to call asynchronous processes in Python. Given your reminder, I will remove the second part of my answer and modify the first. If you want to be able to communicate picklable Python objects to your subprocesses then the subprocess module just won't do it - it was designed to mimic the same sort of process interactions that the shell does. [Edited to add my understanding of the difference between subprocess and multiprocess] – holdenweb Feb 28 '14 at 02:28
  • That should have been "if you want to communicate __non__-picklable objects" – holdenweb Mar 24 '14 at 18:54

Popen(cmd, stdout=PIPE, stderr=PIPE) won't "lock" your parent process.

cmd itself may stall if it generates enough output to fill the pipe buffers. If you want to discard the subprocess' output, use DEVNULL instead of PIPE:

import os
from subprocess import Popen, STDOUT

DEVNULL = open(os.devnull, 'wb')  # NOTE: subprocess.DEVNULL is already defined in Python 3.3+
p = Popen(cmd, stdout=DEVNULL, stderr=STDOUT)
# ...

If you want to process the output without blocking the main thread, you could use several approaches: fcntl, select, named pipes with IOCP, or threads. The last is the most portable:

p = Popen(cmd, stdout=PIPE, stderr=PIPE, bufsize=-1)
bind(p.stdout, stdout_callback)  # the callbacks are your own functions; see below
bind(p.stderr, stderr_callback)
# ...

where the bind() function is:

from contextlib import closing
from functools import partial
from threading import Thread

def bind(pipe, callback, chunksize=8192):
    def consume():
        # read fixed-size chunks until EOF, then close the pipe
        with closing(pipe):
            for chunk in iter(partial(pipe.read, chunksize), b''):
                callback(chunk)
    t = Thread(target=consume)
    t.daemon = True  # don't let the reader thread keep the program alive
    t.start()
    return t
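
stdout_callback and stderr_callback are functions you define yourself; each receives a chunk of the subprocess' output as it arrives. A minimal example (the bodies are placeholders, adapted from the comments below):

def stdout_callback(chunk):
    print("got %d bytes on stdout" % len(chunk))

def stderr_callback(chunk):
    print("got %d bytes on stderr" % len(chunk))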

You don't need an external process to copy a file in Python without blocking the main thread:

import shutil
from threading import Thread

Thread(target=shutil.copy, args=['source-file', 'destination']).start()

CPython releases the GIL during I/O, so the copying runs not just concurrently but in parallel with the main thread.

You could compare it with a script that uses multiple processes:

import shutil
from multiprocessing import Process

Process(target=shutil.copy, args=['source-file', 'destination']).start()

If you want the copying to be cancelled when your program dies, set the thread_or_process.daemon attribute to True before calling start(), as in the sketch below.
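
A minimal sketch of the daemon variant for the thread case (the file names are placeholders, as above):

import shutil
from threading import Thread

t = Thread(target=shutil.copy, args=['source-file', 'destination'])
t.daemon = True  # the copy is abandoned if the main program exits
t.start()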

jfs
  • There are 'stdout_callback' and 'stderr_callback' variables being passed as arguments to the bind() function: bind(p.stdout, stdout_callback) and bind(p.stderr, stderr_callback). If possible, would you please clarify where those variables come from? Thanks in advance – alphanumeric Feb 28 '14 at 00:35
  • @Sputnix: They are your own functions that process the subprocess' stdout/stderr. Do you see `callback(chunk)` in the code? For example: `def stdout_callback(chunk): print("Got %d bytes on stdout" % len(chunk))` – jfs Feb 28 '14 at 00:39
  • Please take a look at the code I posted at the bottom of this page... It is the code I am trying to run. It gives me an error: "NameError: global name 'stdout_callback' is not defined" – alphanumeric Feb 28 '14 at 01:28
  • @Sputnix: It is **your** function. You must define it "if you want to process the output". See the previous comment for how you could do it. If you don't know what you want to do with the subprocess' output, then use the variant with `DEVNULL`. – jfs Feb 28 '14 at 01:34
  • I've got it now! Thanks again! – alphanumeric Feb 28 '14 at 01:41