
I'm writing a program to run svn up in parallel and it is causing the machine to freeze. The server is not experiencing any load issues when this happens.

The commands are run by using ThreadPool.map() to dispatch them to subprocess.Popen():

import shlex
import subprocess
import sys

def cmd2args(cmd):
    # On Windows the command string is passed through as-is (CreateProcess
    # does its own parsing); elsewhere split it into an argv list.
    if isinstance(cmd, basestring):
        return cmd if sys.platform == 'win32' else shlex.split(cmd)
    return cmd

def logrun(cmd):
    popen = subprocess.Popen(cmd2args(cmd),
                             stdout=subprocess.PIPE,
                             stderr=subprocess.STDOUT,
                             cwd=curdir,
                             shell=sys.platform == 'win32')
    for line in iter(popen.stdout.readline, ""):
        sys.stdout.write(line)
        sys.stdout.flush()
    popen.wait()  # reap the child so finished svn processes don't linger

...
pool = multiprocessing.pool.ThreadPool(argv.jobcount)
pool.map(logrun, _commands)

argv.jobcount is the lesser of multiprocessing.cpu_count() and the number of jobs to run (here it is 4). _commands is a list of strings containing the commands listed below. shell is set to True on Windows so the shell can locate the executables, since Windows has no which command and finding an executable there is a bit more involved. (The commands used to be of the form cd directory && svn up .., which also required shell=True, but that is now handled by the cwd parameter instead.)
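If `svn` can be found without `shell=True` (as the comments below confirm), one way to drop it is to resolve the executable path up front. A minimal sketch, assuming Python 3.3+ for `shutil.which` (on the 2.7 interpreter used here, `distutils.spawn.find_executable` is the closest stdlib analogue); `resolve_cmd` is a hypothetical helper name:

```python
import shutil
import sys

def resolve_cmd(cmd):
    # On Windows, look up the full executable path ourselves so shell=True
    # is not needed; shutil.which honours both PATH and PATHEXT.
    if sys.platform == 'win32':
        return shutil.which(cmd) or cmd  # fall back to the bare name
    return cmd
```

The resolved path can then be passed as the first element of the argv list handed to subprocess.Popen().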

The commands being run are

  svn up w:/srv/lib/dktabular
  svn up w:/srv/lib/dkmath
  svn up w:/srv/lib/dkforms
  svn up w:/srv/lib/dkorm

where each folder is a separate project/repository, but existing on the same Subversion server. The svn executable is the one packaged with TortoiseSVN 1.8.8 (build 25755 - 64 Bit). The code is up-to-date (i.e. svn up is a no-op).

When the client freezes, the memory bar in Task Manager first goes blank:

(screenshot: blacked-out memory bar)

and sometimes everything goes dark:

(screenshot: frozen, fully dark screen)

If I wait for a while (several minutes) the machine eventually comes back.

Q1: Is it copacetic to invoke svn in parallel?

Q2: Are there any issues with how I'm using ThreadPool.map() and subprocess.Popen()?

Q3: Are there any tools/strategies for debugging these kinds of issues?

thebjorn
    I suggest don't pack multiple questions into a single one (or do this not so obvious), you will get only answers from people knowing the answer for all of them. – peterh Dec 29 '14 at 10:58
  • Thank you for the feedback. My questions are all aspects of the same issue, though (i.e. if you can't invoke `svn` in parallel then it doesn't matter how I call `TP.map()`), but of course, if someone knows just part of the answer then please answer (I give points to all helpful answers). – thebjorn Dec 29 '14 at 11:03
  • Use `logging`, to see what is happening inside your script. Use an analog of `iotop` on Windows, to see what is happening with your disk. You could install something like [`glances`](https://pypi.python.org/pypi/Glances/) and/or [Sysinternals tools](http://technet.microsoft.com/en-us/sysinternals). Then use the standard debugging technique: read code <-> test your understanding <-> narrow down the problem. Why do you use `shell=True` to run `svn`? (can `subprocess` find `svn` executable without `shell=True`?) – jfs Dec 29 '14 at 11:44
  • What is `_commands`' type? What is your Python version? How large is `argv.jobcount`? Catch exceptions and log them. – jfs Dec 29 '14 at 11:46
  • @J.F.Sebastian I've updated the question with info about the `Popen` parameters. I'm looking into your tool suggestions now. (thx) – thebjorn Dec 29 '14 at 12:37
  • @thebjorn: [I understand how the search for an executable works](http://stackoverflow.com/a/25167402/4279) that is why I've asked whether `svn` is found with `shell=True`. – jfs Dec 29 '14 at 12:50
  • @J.F.Sebastian ah, sorry, I misunderstood. Yes, svn is found with `shell=True`. (it also finds it without `shell=True`, ie. `subprocess.Popen('svn --version').communicate()[0]` gives the expected result). – thebjorn Dec 29 '14 at 15:19
  • then you can drop `shell=sys.platform == 'win32'`. Have you tried to run `check_call("\n".join(["start svn up " + p for p in paths]), shell=True,cwd=curdir, stdin=DEVNULL, stdout=logfile, stderr=STDOUT)`? – jfs Dec 30 '14 at 12:32
  • When I run it in cmd it pops up a new dos box and only runs the first command. When I run under ConEmu it also runs only the first command and then says "Root process was alive less than 10 sec, ExitCode=0". It is a bit important to grab the live command output too to give the user a sense of progress (some of these updates can take several minutes and I can't leave the user staring at a blank screen). You can get an idea of what it looks like from https://dl.dropboxusercontent.com/u/94882440/dksync-output.png – thebjorn Dec 31 '14 at 00:37
  • the point of the `start` commands is to find out whether you can run several `svn up` in the same directory *at all*. You should try `' & '` instead of `'\n'` -- I'm not sure about the syntax. It is a debugging tool, not a suggestion to replace your command. – jfs Dec 31 '14 at 06:30
  • Ah, I see. Yes, it is possible to run several `svn up` very close to each other in time. I've gotten it to work by doing `time.sleep(random.random() * 0.8)` before issuing the `Popen` command, but that seems rather 'hacky'. (ps: 0.8 is just a randomly picked constant and is not tuned at all). – thebjorn Dec 31 '14 at 10:56
  • Maybe it is `svn.exe up`? – valentin Apr 13 '15 at 11:16
  • Not really relevant, but since you mentioned `which`, recent Windows come with a `where` command offering some of the capabilities. – Ben Apr 13 '15 at 17:35
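The staggered-start workaround discussed in the comments (a random `time.sleep()` before each `Popen`) can be made deterministic by serializing just the launch step behind a lock. A minimal sketch under that assumption; `staggered_launch` and the 0.2 s delay are illustrative, not tuned:

```python
import threading
import time

_launch_lock = threading.Lock()
LAUNCH_DELAY = 0.2  # seconds between launches; arbitrary, untuned value

def staggered_launch(launch):
    # Serialize launches so no two svn processes start at the same instant,
    # replacing the random sleep with a fixed, deterministic gap.
    with _launch_lock:
        proc = launch()
        time.sleep(LAUNCH_DELAY)
    return proc
```

Each worker would call `staggered_launch(lambda: subprocess.Popen(...))`; only the launch is serialized, so the processes still run concurrently afterwards.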

1 Answer


I will do the best that I can to answer all three questions thoroughly, and I welcome corrections to my statements.

Q1: Is it copacetic to invoke svn in parallel?

Whether it is "copacetic" is open to debate; I would say it is neither recommended nor discouraged. That said, source control tools have specific functionality that requires process-level and (at a best guess) block-level locking. The checksumming, file transfers, and file reads/writes require locking to work correctly; otherwise you risk both duplicated effort and file contention, which will lead to process failures.

Q2: Are there any issues with how I'm using ThreadPool.map() and subprocess.Popen()?

While I don't know the absolute specifics of subprocess.Popen(), since I last used it on 2.6, I can speak a bit to the programmability. The code you wrote creates a pool around one specific subprocess invocation instead of calling the processes directly. Off the top of my head, and to my understanding of ThreadPool(), it does not perform locking by default. That may cause issues with subprocess.Popen(); I'm not sure. As per my answer above, locking is something that will need to be implemented. I would recommend looking at https://stackoverflow.com/a/3044626/2666240 for a better understanding of the differences between threading and pooling, as I would recommend using threading instead of multiprocessing.

Given that source control applications require locking, if you are going to parallelise operations while handling locking, you will also need to synchronise the threads so that work is not duplicated. In a test I ran a few months back on Linux with multiprocessing, I noticed that grep was repeating the global search. I'll see if I can find the code I wrote and paste it. With thread synchronisation, I would hope that Python could pass the svn thread status between threads in a way that svn understands, so that process duplication does not occur. That said, I don't know how svn works under the hood in that respect, so I am only speculating/making a best guess.

As svn is likely using a fairly complicated locking method (I would guess block-level rather than inode locking, but once again, a best guess), it would probably make sense to implement semaphore locking instead of lock() or RLock(). That said, you will have to go through and test the various locking and synchronisation methods to figure out which works best for svn. This is a good resource on thread synchronisation: http://effbot.org/zone/thread-synchronization.htm
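The semaphore idea above can be sketched as a gate that caps how many svn processes run at once, without serializing the whole pool. A minimal illustration, assuming a hypothetical `gated_run` wrapper and an arbitrary limit of 2:

```python
import subprocess
import threading

MAX_CONCURRENT = 2  # illustrative limit; not tuned for svn
svn_gate = threading.BoundedSemaphore(MAX_CONCURRENT)

def gated_run(cmd, runner=subprocess.call):
    # At most MAX_CONCURRENT workers execute `runner` at once; the other
    # pool threads block here until a slot frees up.
    with svn_gate:
        return runner(cmd)
```

The pool would then map `gated_run` over the commands; BoundedSemaphore additionally raises if the gate is released more times than it was acquired, which catches bookkeeping bugs.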

Q3: Are there any tools/strategies for debugging these kinds of issues?

Sure: threading and multiprocessing both have logging functionality that you can use in conjunction with the logging module. I would log to a file so that you have something to reference instead of just console output. In theory, you should be able to use logging.debug(pool.map(logrun, _commands)) to log the processes taken. That said, I'm not a logging expert with threading or multiprocessing, so someone else can likely answer that better than I.
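A more structured take on the logging suggestion is to wrap each worker so that the start, finish, and any failure of every command is recorded per thread. A minimal sketch; `logged_run` and the logger name are illustrative, not part of the original code:

```python
import logging

log = logging.getLogger("svn.parallel")

def logged_run(cmd, run_fn):
    # Record start/finish/failure of every command; with %(threadName)s in
    # the log format, the file shows exactly what each worker did and when.
    log.debug("starting: %r", cmd)
    try:
        result = run_fn(cmd)
        log.debug("finished: %r", cmd)
        return result
    except Exception:
        log.exception("failed: %r", cmd)  # includes the traceback
        raise
```

Wired up with something like `logging.basicConfig(filename='dksync.log', level=logging.DEBUG, format='%(asctime)s %(threadName)s %(message)s')` (the filename is hypothetical), the pool call becomes `pool.map(lambda c: logged_run(c, logrun), _commands)`.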

Are you using Python 2.x or 3.x?

Community
  • I'm using Python 2.7.3. The ThreadPool starts ~1 thread per `subprocess.Popen` invocation. I use ThreadPool since it is light-weight compared to the process version, and since all it does is start processes there aren't any issues with the GIL. Each Popen creates a new svn process on an available cpu core (releasing the GIL so the ThreadPool can continue). Note: the svn invocations run on disjoint repositories, so there isn't any contention for common source files. – thebjorn Apr 13 '15 at 21:30
  • My two hypotheses are (i) svn has some global state somewhere that gets clobbered, or (ii) I'm continually blowing away the disk cache..? – thebjorn Apr 13 '15 at 21:30
  • I would assert the global state is what is getting overwritten/interrupted, etc. That'd be my best guess at this point. –  Apr 16 '15 at 16:06