Multi-threading in Python and shell

Question

I have a shell script called "nn.sh" which takes an IP address on the local network, SSHes into that host, continuously reads some data from it, and appends the results to a file called "log.txt" on the server.

I need to write Python code to run on the server, probably using multi-threading, that runs this script in one thread and, in another thread, reads the values already available in "log.txt". How do I do this?

I have written the following code:

#!/usr/bin/python
import threading
import time
from subprocess import call, Popen, PIPE

exitFlag = 0

class loggerThread(threading.Thread):
    def __init__(self):
        threading.Thread.__init__(self)
        print "Logging thread started ..."
    def run(self):
        with open("log.txt","at") as log: 
                call(['/bin/sh', 'nn.sh', '172.20.125.44', '10'], stdout = log)

class readerThread(threading.Thread):
    def __init__(self):
        threading.Thread.__init__(self)
        print "Reading thread started ..."
    def run(self):
        while 1:
                with open("log.txt","r") as log:
                        lines = log.read().split("\n")
                        print "Reader thread  ..."
                        print lines[-1]


thread1 = loggerThread()
thread2 = readerThread()

thread1.start()
thread2.start()

Here is also the contents of "nn.sh":

ssh -o IdentityFile=~/Dropbox/ALI/id_rsa -l root $1 <<EOF
    while :; 
        do date; 
           echo; 
           iwlist wlan0 scan 
           echo; 
           echo "#################################################################"; 
           sleep $2; 
        done;
EOF

However, when I run this code, nothing is stored in "log.txt". Any ideas on how to fix this?

Honestly, you should do this kind of thing in Go, which was made for this sort of thing, rather than in Python. It would be a lot easier to write and maintain. — , Aug 26 '14 at 22:50
@mmstick: This is easy to do in Python too, he's just not doing it right. Learning a new language is not going to be the easiest way to solve it. — abarnert, Aug 26 '14 at 22:50
@abarnert If you already know how to code in Python, you can literally master all of Go's API and syntax in a day. It's not *that* hard. On the flip side, it's a good idea to learn new programming languages, especially languages that are much faster at doing the same job. — , Aug 26 '14 at 22:52
@mmstick: "Faster" in what sense? Performance of reading off a pipe is hardly likely to matter. And anyone who knows Python can write this code just as easily in Python as in Go. I agree it's worth learning a variety of different languages, but telling people "You can't do it in that language" when you very easily can is not helping. — abarnert, Aug 26 '14 at 22:57
@abarnert: I never stated that he 'can't do it in that language' but that Go is better suited for this kind of task. Go is faster in all senses over Python -- both in significantly less memory overhead and the benefits of a compiled language. — , Aug 26 '14 at 23:04
@mmstick: Running this in Python, even with output being spammed as fast as possible, takes 0.0% CPU according to `top`. So the performance doesn't make any difference. (Besides, there are plenty of things that are much faster in Python—try porting any NumPy app, or most gevent-based servers, to Go and see what you get.) — abarnert, Aug 26 '14 at 23:07
@abarnert I agree with you. I don't want to switch the whole project to Go because of this simple issue which is easily fixable in Python. Thanks to both of you for your comments. — Mohsen Karimzadeh Kiskani, Aug 26 '14 at 23:23

abarnert · Accepted Answer · 2014-08-27T00:24:43.617

Trying to stream information from one process (or thread) to another through a file like this is a bad idea. You have to get the writer to make sure to flush the file after every line (and to not flush the file in mid-line), you have to synchronize things so that you only read the file when there's new data to read instead of spinning as fast as possible, you have to figure out how to detect when there's new data to read (which is platform-specific), etc.

This is exactly what pipes are for. In fact, given this line:

from subprocess import call, Popen, PIPE

… I suspect you copied and pasted from some code that does things with a pipe, because otherwise, why would you be importing PIPE?

Also, I'm not sure why you think you need two threads here. If you had to send input to the shell script and also read output, using two threads might make that easier. But all you're trying to do is kick off a child process and read its output as it becomes available. So just read from its pipe in the main thread.

There are examples in the docs that show how to do exactly what you want.

from subprocess import call, Popen, PIPE
p = Popen(['/bin/sh', 'nn.sh', '172.20.125.44', '10'], stdout=PIPE)
for line in p.stdout:
    print line
p.wait()

Iterating over a pipe will block until another line is available, then read the entire line, as long as none of your lines are longer than select.PIPE_BUF (which is guaranteed to be at least 512).
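For readers on Python 3, here is a minimal self-contained sketch of the same pattern. Since nn.sh needs a reachable host, it assumes a hypothetical stand-in child (a short Python one-off that prints three lines); CHILD_CMD and stream_lines are illustrative names, not part of the original answer.

```python
import subprocess
import sys

# Hypothetical stand-in for nn.sh: prints three lines, pausing between them.
# "-u" makes the child's stdout unbuffered so each line arrives promptly.
CHILD_CMD = [
    sys.executable, "-u", "-c",
    "import time\n"
    "for i in range(3):\n"
    "    print('scan', i)\n"
    "    time.sleep(0.1)\n",
]


def stream_lines(cmd):
    """Yield each line of the child's stdout as soon as it is complete."""
    with subprocess.Popen(cmd, stdout=subprocess.PIPE, text=True) as proc:
        # Iterating blocks until the next full line (or end of output).
        for line in proc.stdout:
            yield line.rstrip("\n")
    # Leaving the with block closes the pipe and waits for the child to exit.


if __name__ == "__main__":
    for line in stream_lines(CHILD_CMD):
        print("got:", line)
```

To try it against the real script, you would presumably pass ['/bin/sh', 'nn.sh', '172.20.125.44', '10'] instead of CHILD_CMD.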

If you need to put this into a thread for some other reason (e.g., you need to do some other work in the main thread, while this thread is collecting output), you do that the exact same way as any other threading in Python: either create a threading.Thread subclass, or pass a target function. For example:

import threading
from subprocess import Popen, PIPE

def nnsh():
    # Launch the script and print each line as the child produces it.
    p = Popen(['/bin/sh', 'nn.sh', '172.20.125.44', '10'], stdout=PIPE)
    for line in p.stdout:
        print line
    p.wait()

t = threading.Thread(target=nnsh)
t.start()
# do a bunch of other stuff
t.join()

(Of course you can make it a daemon thread, or come up with a way to signal it instead of joining it with an infinite timeout, etc.; read a basic tutorial or the threading module docs if you know what you want but don't know how to do it, and post a separate new question if you get stuck somewhere.)
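As one possible shape for the "signal it instead of joining" idea, here is a Python 3 sketch in which a daemon reader thread pushes lines onto a queue and the calling thread consumes them. The names pump, collect, CHILD_CMD and the _DONE sentinel are all hypothetical, and the child command is a stand-in for nn.sh, so treat this as an illustration under those assumptions rather than the one right design.

```python
import queue
import subprocess
import sys
import threading

# Hypothetical stand-in for nn.sh: prints three lines and exits.
CHILD_CMD = [sys.executable, "-u", "-c", "for i in range(3): print('scan', i)"]

_DONE = object()  # sentinel the reader thread sends when output ends


def pump(cmd, out_queue):
    """Run cmd and put each stdout line on out_queue, then the sentinel."""
    try:
        with subprocess.Popen(cmd, stdout=subprocess.PIPE, text=True) as proc:
            for line in proc.stdout:
                out_queue.put(line.rstrip("\n"))
    finally:
        # Always signal completion, even if launching the child failed.
        out_queue.put(_DONE)


def collect(cmd):
    """Start the reader thread and consume its lines in the calling thread."""
    lines = queue.Queue()
    reader = threading.Thread(target=pump, args=(cmd, lines), daemon=True)
    reader.start()
    received = []
    while True:
        item = lines.get()  # blocks until the reader delivers something
        if item is _DONE:
            break
        received.append(item)
    reader.join()
    return received


if __name__ == "__main__":
    print(collect(CHILD_CMD))
```

With an endless producer like nn.sh, the consuming side would more likely process each item as it arrives instead of collecting a list, and could pass a timeout to lines.get() so it can do other work between lines.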

If you may have to deal with ridiculously long lines, on some *nix platforms (although it doesn't seem to be necessary on at least recent versions of OS X, FreeBSD, Linux, or Solaris…), you may have to loop manually instead:

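import select

buf = ''
while True:
    chunk = p.stdout.read(select.PIPE_BUF)
    if not chunk:  # empty read means the child closed its end of the pipe
        break
    buf += chunk
    lines = buf.split('\n')
    for line in lines[:-1]:
        print line
    buf = lines[-1]
if buf:  # print a final line that had no trailing newline
    print buf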

If you want it to also be written to a file, just put a | tee log.txt after it. — JohnB, Aug 26 '14 at 23:08
Actually when I tried this code, I realized that I really need multi-threading because "nn.sh" contains an infinite while loop so in your code, anything after second line cannot be executed because the program gets stuck in an infinite loop in the second line. Any thought on how to fix it? — Mohsen Karimzadeh Kiskani, Aug 26 '14 at 23:37
@MohsenKiskani: If you have some other work that you need to do, while this loop is going on, then yes, you have to move either this loop or the other work to a thread. But you still don't need to put the "running the program" and "getting the input" onto two separate threads. — abarnert, Aug 26 '14 at 23:41
Assuming that I want to do it in two different threads, how should I do this? — Mohsen Karimzadeh Kiskani, Aug 26 '14 at 23:48
@MohsenKiskani: The same way you do any threading in Python. If you don't understand even the basics of how to launch threads, you need to learn; using magic code that you don't understand is always a bad idea, but it's _especially_ a bad idea when it comes to multithreading. — abarnert, Aug 27 '14 at 00:25

score 0 · Answer 2 · answered Aug 26 '14 at 23:02

The process doesn't have to be multithreaded from Python; it can be done from the shell. Put your shell script inside a function and call it with an ampersand (&) appended so that it runs in another process. You can kill it later by finding its PID. Then iterate over the log file and print anything new as it is written to the file.
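The "iterate over the log file" step could look roughly like the Python 3 sketch below. It polls the file, since detecting new data without polling is platform-specific, and only yields lines that have been completely written. The function name follow and its poll_interval and idle_timeout parameters are hypothetical, chosen for illustration.

```python
import time


def follow(path, poll_interval=0.1, idle_timeout=None):
    """Yield complete lines from path, including ones appended later.

    Stops after idle_timeout seconds without new data (None means never).
    """
    buf = ""
    idle_since = time.monotonic()
    with open(path, "r") as f:
        while True:
            chunk = f.read()  # returns "" when nothing new has been written
            if chunk:
                idle_since = time.monotonic()
                buf += chunk
                # Keep the trailing partial line in buf until it is finished.
                *complete, buf = buf.split("\n")
                for line in complete:
                    yield line
            elif (idle_timeout is not None
                  and time.monotonic() - idle_since >= idle_timeout):
                return
            else:
                time.sleep(poll_interval)
```

Usage would be something like `for line in follow("log.txt"): print(line)`. Note that it still depends on the writer flushing after each line, and the polling loop and partial-line bookkeeping are the kind of extra work that reading from a pipe avoids.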

johntellsall · Answer 3 · 2014-08-27T00:01:18.857

This is a variation of @abarnert's concept. It runs the "nn.sh" command in a subprocess, then processes each line of data as it appears. Output is written to sys.stdout, then flushed, so we can see it as it comes, vs all at the end.


#!/usr/bin/env python

# adapted from http://stackoverflow.com/questions/2804543/read-subprocess-stdout-line-by-line

import subprocess, sys, time

def test_ping():
    proc = subprocess.Popen(
        ['bash', './nn.sh', 'localhost', '3'],
        stdout=subprocess.PIPE,
    )
    outf = sys.stdout
    for line in iter(proc.stdout.readline, ''):
        outf.write( 'Reader thread ...\n' )
        outf.write( line.rstrip() + '\n' )
        outf.flush()

if __name__=='__main__':
    test_ping()
