I've written a Python script that downloads and converts many images, using wget and then ImageMagick via chained subprocess calls:

for img in images: 
  convert_str = 'wget -O  ./img/merchant/download.jpg %s; ' % img['url'] 
  convert_str += 'convert ./img/merchant/download.jpg -resize 110x110 ' 
  convert_str += ' -background white -gravity center -extent 110x110' 
  convert_str += ' ./img/thumbnails/%s.jpg' % img['id']
  subprocess.call(convert_str, shell=True)

If I run the content of convert_str manually at the command line, it appears to work without any errors, but if I run the script so it executes repeatedly, it sometimes gives me the following output:

--2013-06-19 04:01:50--  
http://www.lkbennett.com/medias/sys_master/8815507341342.jpg
Resolving www.lkbennett.com... 157.125.69.163
Connecting to www.lkbennett.com|157.125.69.163|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 22306 (22K) [image/jpeg]
Saving to: `/home/me/webapps/images/m/img/merchant/download.jpg'

 0K .......... .......... .                               100% 1.03M=0.02s

2013-06-19 04:01:50 (1.03 MB/s) - 
`/home/annaps/webapps/images/m/img/merchant/download.jpg' saved [22306/22306]

/home/annaps/webapps/images/m/img/merchant/download.jpg 
[Errno 2] No such file or directory: 
' /home/annaps/webapps/images/m/img/merchant/download.jpg'

Oddly, despite the No such file or directory message, the images generally seem to have downloaded and converted OK. But occasionally they look corrupt, with black stripes on them (even though I'm using the latest version of ImageMagick), which I assume is because they aren't completely downloaded before the command executes.

Is there any way I can tell Python or subprocess: "don't run the second command until the first has definitely completed successfully"? I found this question but can't see a clear answer!

Richard
  • The funny thing to me is the extra space in front of `/home` in the error line ... (although I can't see where it comes from) – mgilson Jun 19 '13 at 11:34

2 Answers


Normally, subprocess.call is blocking: it waits for the command to finish and then returns its exit status.

If you want non-blocking behavior, you would use subprocess.Popen instead. In that case, you have to call Popen.wait explicitly to wait for the process to terminate.
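To make the difference concrete, here is a minimal sketch. The command is a hypothetical stand-in (a short-lived Python child process), not the wget/convert pipeline from the question:

```python
import subprocess
import sys

# Hypothetical stand-in command: a child process that sleeps briefly.
cmd = [sys.executable, "-c", "import time; time.sleep(0.2)"]

# subprocess.call blocks until the command exits and returns its exit status.
status = subprocess.call(cmd)

# subprocess.Popen returns immediately; the child runs in the background.
proc = subprocess.Popen(cmd)
still_running = proc.poll() is None  # poll() returns None while the child is alive
exit_status = proc.wait()            # block here until the child terminates
```

Either way, nothing after `call` or `wait` runs until the child has exited, so the blocking itself is probably not the cause of a half-written file.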

See https://stackoverflow.com/a/2837319/2363712


BTW, in the shell, if you wish to chain processes you should use && instead of ; so that the second command is not launched if the first one fails. In addition, you should check the subprocess exit status in your Python program to determine whether the command succeeded.
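One way to apply both suggestions from Python is to run the two commands as separate calls and stop at the first non-zero exit status. This is only a sketch: `run_steps` and `thumbnail_steps` are hypothetical helper names, and the wget and convert options are copied from the question rather than verified against your setup.

```python
import subprocess
import sys

def run_steps(steps):
    """Run each command in order; stop at the first non-zero exit status.

    Returns the exit status of the failing command, or 0 if all succeeded.
    """
    for args in steps:
        status = subprocess.call(args)  # blocks until this command exits
        if status != 0:
            return status
    return 0

def thumbnail_steps(url, img_id):
    # Same wget and convert options as in the question, passed as argument
    # lists so that no shell (and no string quoting) is involved.
    download = "./img/merchant/download.jpg"
    return [
        ["wget", "-O", download, url],
        ["convert", download, "-resize", "110x110",
         "-background", "white", "-gravity", "center", "-extent", "110x110",
         "./img/thumbnails/%s.jpg" % img_id],
    ]

# Usage, assuming `images` is the list from the question:
# for img in images:
#     if run_steps(thumbnail_steps(img['url'], img['id'])) != 0:
#         print("failed for image %s" % img['id'])
```

Passing argument lists instead of one shell string also avoids problems with URLs that contain characters the shell would interpret, such as `&` or `;`.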

Sylvain Leroux
  • @mgilson Isn't that what I said? Feel free to edit if my English is not good enough. – Sylvain Leroux Jun 19 '13 at 11:40
  • Oh ... sorry. It looks like that is what you said. Hmmm ... I must be going crazy this morning. (Some reason the first time I read it, I got the impression that you were implying that `;` wouldn't necessarily block until the process finished...But on after re-reading, I didn't get that impression the second time around). – mgilson Jun 19 '13 at 11:42
  • Thanks - I didn't know that about `&&`. From what you're saying, I think I'll separate the `wget` and `convert` commands into two separate calls, and then check the output of the `wget` command before running `convert`. – Richard Jun 19 '13 at 11:51
  • I don't mind about the commands being blocking, that's not a problem. – Richard Jun 19 '13 at 11:51
  • @mgilson No problem. My English being far from perfect, it is sometime only comprehensible ... to myself ;) – Sylvain Leroux Jun 19 '13 at 12:22

See Using module 'subprocess' with timeout

Not sure if this is the proper way of doing it, but this is how I accomplish this:

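import subprocess
from threading import Thread

def call_subprocess(cmd):
    # Run cmd through the shell and capture both output streams.
    proc = subprocess.Popen(cmd, shell=True, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    out, err = proc.communicate()  # blocks until the process exits
    if err:
        print(err)

cmd = 'echo done'  # placeholder command (an assumption); substitute your own
thread = Thread(target=call_subprocess, args=[cmd])
thread.start()
thread.join() # waits for completion.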
bnlucas