17

I have a file containing command lines that I want to run. This file contains around 2,000 lines.

I have 8 cores available. Is it possible to parse the file and start 8 processes, then execute another one from the file whenever one of the programs finishes? I want this to continue until the end of file is reached.

Todd A. Jacobs
monkeyking

3 Answers

41

Use GNU parallel. It's an incredibly powerful tool, and official packages exist for about 20 Linux distributions. What's that? You have an excuse as to why you can't use it? Here's a simple example showing how to run a list or file of commands in parallel:

Contents of jobs.txt:

sleep 1; echo "a"
sleep 3; echo "b"
sleep 2; echo "c"

Command:

time parallel :::: jobs.txt

Results:

a
c
b

real    0m3.332s
user    0m0.170s
sys     0m0.037s

Notes:

If you wish to keep the order the same as the input, pass the -k flag to GNU parallel.

If you have more than eight cores and only wish to process with eight cores, add -j 8 to the args list.

The man page is a good read, but if you haven't already read this tutorial I would highly recommend the time investment.
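To make the two flags from the notes concrete, here is a minimal sketch that wraps them in a function. The name `run_in_order` is hypothetical and only there for illustration; it assumes GNU parallel (not the moreutils `parallel`) is installed and that the file holds one complete command per line.

```shell
# run_in_order FILE: run each line of FILE as a command with GNU parallel.
# (run_in_order is a made-up helper name, not part of GNU parallel.)
run_in_order() {
    # -k     print output in the same order as the input lines
    # -j 8   run at most eight jobs at once
    # ::::   read the job list from the named file
    parallel -k -j 8 :::: "$1"
}
```

Calling `run_in_order jobs.txt` on the file above should still take about three seconds, since the jobs overlap, but because of `-k` the lines come out in input order rather than completion order.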

Steve
  • GNU parallel isn't available on Ubuntu 11.10, but it's available on Debian sid. It's also worth noting that this example will *not* work as-is with the parallel utility from the moreutils package, which has some different semantics. – Todd A. Jacobs Jul 15 '12 at 06:58
  • -j 8 is not needed - it is automatically detected. Ubuntu package: https://build.opensuse.org/package/binaries?package=parallel&project=home%3Atange&repository=xUbuntu_11.10 – Ole Tange Jul 16 '12 at 13:21
  • In order to run commands from a file in parallel you could do `cat /path/to/file.txt | parallel` – Mr Purple Apr 06 '16 at 06:48
26

You can use xargs to read in the file, while limiting the maximum number of processes to the number of available cores. For example:

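# Count logical cores (Linux only); nproc from GNU coreutils is an alternative.
cores=$(grep -c '^processor' /proc/cpuinfo)
# Run each line of /tmp/foo through sh, at most $cores at a time.
xargs --arg-file=/tmp/foo \
      --max-procs="$cores" \
      --replace \
      --verbose \
      /bin/sh -c "{}"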
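A possible variation, sketched under the assumption that GNU xargs and GNU coreutils are available: as far as I know, xargs still interprets quotes and backslashes in its input unless a delimiter is given, so a line containing an unmatched apostrophe can trip it up. Passing `-d '\n'` hands each line to `sh -c` untouched. The function name `run_with_xargs` is hypothetical.

```shell
# run_with_xargs FILE: run every line of FILE as its own `sh -c` command,
# keeping at most one job per available core running at a time.
# (run_with_xargs is a made-up helper name for this sketch.)
run_with_xargs() {
    cores=$(nproc)    # GNU coreutils: number of available processing units
    # -d '\n'  one input item per line, quotes left intact (GNU xargs only)
    # -n 1     pass exactly one line to each sh invocation
    # -P       cap on simultaneous processes; a new one starts when one exits
    xargs -d '\n' -n 1 -P "$cores" sh -c < "$1"
}
```

Usage would be `run_with_xargs /tmp/foo`. Output from different jobs arrives in completion order and can interleave, so redirect inside each command line if that matters.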
Todd A. Jacobs
  • Thanks! This method is better than all other answers. Setting the processes manually often leads to either under-performance or throttling. – thechargedneutron Oct 19 '21 at 05:09
0

You can start a new process in the background simply by appending & to the command. There is an example here describing a solution to your problem.
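The linked example isn't reproduced here, so the following is only a rough sketch of how the & approach could cover the "start another whenever one finishes" part of the question. It assumes bash 4.3 or later for `wait -n`; the function name `run_pool` and its default limit of 8 are hypothetical choices taken from the question, not from the answer.

```shell
#!/usr/bin/env bash
# run_pool FILE [MAX]: run each line of FILE as a command, keeping at most
# MAX (default 8) background jobs alive at once.
# (run_pool is a made-up helper name for this sketch.)
run_pool() {
    local file=$1 max=${2:-8} running=0 line
    while IFS= read -r line || [ -n "$line" ]; do
        [ -z "$line" ] && continue          # skip blank lines
        if [ "$running" -ge "$max" ]; then
            wait -n                         # bash >= 4.3: block until any one job exits
            running=$((running - 1))
        fi
        # Start the job in the background; /dev/null keeps it from
        # consuming the rest of the job file on stdin.
        bash -c "$line" < /dev/null &
        running=$((running + 1))
    done < "$file"
    wait                                    # let the remaining jobs finish
}
```

`run_pool jobs.txt 8` would then work through the file with eight slots. Unlike GNU parallel or xargs, this does nothing about interleaved output or failed jobs, so treat it as a starting point.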

aphex