134

What is the simplest/best way to ensure only one instance of a given script is running - assuming it's Bash on Linux?

At the moment I'm doing:

ps -C script.name.sh > /dev/null 2>&1 || ./script.name.sh

but it has several issues:

  1. it puts the check outside of the script
  2. it doesn't let me run the same script from separate accounts - which I would like sometimes.
  3. -C checks only the first 14 characters of the process name

Of course, I can write my own pidfile handling, but I sense that there should be a simple way to do it.

codeforester
  • 1
    seems [here](http://stackoverflow.com/a/7305448/815386) much better way than use lockfile – zb' Feb 23 '13 at 09:40
  • Related: http://stackoverflow.com/questions/185451/quick-and-dirty-way-to-ensure-only-one-instance-of-a-shell-script-is-running-at – codeforester Jan 18 '17 at 19:52

14 Answers

175

Advisory locking has been used for ages and it can be used in bash scripts. I prefer simple flock (from util-linux[-ng]) over lockfile (from procmail). And always remember to set a trap on exit (sigspec == EXIT or 0; trapping specific signals is superfluous) in those scripts.

In 2009 I released my lockable script boilerplate (originally available at my wiki page, nowadays available as a gist). Transforming that into one-instance-per-user is trivial (a possible tweak is sketched below, after the boilerplate). Using it you can also easily write scripts for other scenarios requiring some locking or synchronization.

Here is the mentioned boilerplate for your convenience.

#!/bin/bash
# SPDX-License-Identifier: MIT

## Copyright (C) 2009 Przemyslaw Pawelczyk <przemoc@gmail.com>
##
## This script is licensed under the terms of the MIT license.
## https://opensource.org/licenses/MIT
#
# Lockable script boilerplate

### HEADER ###

LOCKFILE="/var/lock/`basename $0`"
LOCKFD=99

# PRIVATE
_lock()             { flock -$1 $LOCKFD; }
_no_more_locking()  { _lock u; _lock xn && rm -f $LOCKFILE; }
_prepare_locking()  { eval "exec $LOCKFD>\"$LOCKFILE\""; trap _no_more_locking EXIT; }

# ON START
_prepare_locking

# PUBLIC
exlock_now()        { _lock xn; }  # obtain an exclusive lock immediately or fail
exlock()            { _lock x; }   # obtain an exclusive lock
shlock()            { _lock s; }   # obtain a shared lock
unlock()            { _lock u; }   # drop a lock

### BEGIN OF SCRIPT ###

# Simplest example is avoiding running multiple instances of script.
exlock_now || exit 1

# Remember! Lock file is removed when one of the scripts exits and it is
#           the only script holding the lock or lock is not acquired at all.
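
For the one-instance-per-user variant mentioned above, one possible tweak (my illustration, not part of the boilerplate) is to key the lock file on the user and keep it in a directory the user can write to, since /var/lock is often writable only by root:

# per-user lock (illustrative): include the UID in the name and use a user-writable directory
LOCKFILE="${TMPDIR:-/tmp}/$(basename "$0")-$(id -u).lock"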
przemoc
  • Excellent script; is there a way within this framework to simply check if the lock exists rather than always obtaining a lock when doing so? – Carlos P Dec 04 '13 at 16:24
  • 3
    @CarlosP: No. Under the hood `flock` simply uses the [flock(2)](http://man7.org/linux/man-pages/man2/flock.2.html) syscall, and it doesn't provide such information, nor should it. If you want to unreliably check whether there is a lock present (or lack thereof), i.e. without holding it, then you have to try to acquire it in a non-blocking way (`exlock_now`) and release it immediately (`unlock`) if you succeeded. If you think that you need to check the lock's presence without changing its state, then you're possibly using the wrong tools to solve your problem. – przemoc Dec 04 '13 at 17:56
  • In a more bash way you may replace LOCKFILE="/var/lock/\`basename $0\`" by LOCKFILE="/var/lock/${0##*/}" – Edouard Thiel Feb 11 '14 at 18:49
  • 3
    I feel like I'm missing something obvious, but why does the exec call have to be wrapped in eval? e.g. why `eval "exec $LOCKFD>\"$LOCKFILE\""` rather than `exec $LOCKFD>"$LOCKFILE"`? – overthink Nov 24 '14 at 14:34
  • 5
    This template is very cool. But I don't understand why you do { _lock u; _lock xn && rm -f $LOCKFILE; }. What is the purpose of the xn lock after you just unlocked it? – George Young Jan 08 '15 at 19:50
  • 1
    @EdouardThiel Yeah. I actually try to avoid any bashisms, but your `basename` equivalent is not a BASH thing, it's [POSIX thing](http://pubs.opengroup.org/onlinepubs/009695399/utilities/xcu_chap02.html#tag_02_06_02), so it's fine indeed. BTW I also don't use backticks anymore and prefer much more sane `$()` notation. [Why is $(...) preferred over backticks?](http://mywiki.wooledge.org/BashFAQ/082) – przemoc Jan 31 '15 at 21:56
  • 5
    @overthink only a literal number next to `>` is treated as a file descriptor number, so without `eval` there `exec` tries to execute a binary called `99` (or whatever else is put in `$LOCKFD`). It's worth adding that some shells (like `dash`) have a bug that requires the fd number to be a single digit. I chose a high fd number to avoid possible collisions (they depend on the use case, though). I went with BASH also because of the convenient `EXIT` condition in trap IIRC, but it looks like I was wrong, as [it is part of POSIX shell](http://pubs.opengroup.org/onlinepubs/009695399/utilities/trap.html#tag_04_146). – przemoc Jan 31 '15 at 22:20
  • 2
    @GeorgeYoung Removing lock file at the end of script only if immediate exclusive lock succeeds (after earlier unlocking) is for the case of other script instance patiently waiting to obtain exclusive or shared lock (i.e. using `exlock` or `shlock`), because when it finally starts, then previous instance shouldn't remove the file. – przemoc Jan 31 '15 at 22:54
  • Important update about `trap fn EXIT`. It must be supported by any POSIX-compliant shell, as I already wrote 6 months ago, but the thing is it's differently implemented. We simply want to have `fn` be executed when script ends (be it normal exit or invoked via some signal). EXIT works that way in bash or ksh, but not in zsh or dash for instance. That's why I originally went with bash despite having otherwise quite clean sh script. I relearned this trap issue a few months ago (I have to finally start blogging to ease relearning things), but forgot to write about it here as well, sorry for that! – przemoc Jul 31 '15 at 14:45
  • This solution doesn't automatically clean up stale lockfiles if the process dies. Simple test: add a long sleep to the bash script, run it in the background, kill it. Stale lockfile will exist. Try running the script again, it will immediately exit because of the lockfile. – Jay Paroline Oct 21 '15 at 17:30
  • 3
    @JayParoline You're misinterpreting what you observe. When you kill (`-9`) the script, i.e. the bash instance running the script file, it will surely die, but processes `fork()`+`exec()`-ed from it (like your sleep did) inherit copies of open file descriptors along with `flock()` locks. Killing the script while sleep is sleeping won't unlock, because the sleep process is still holding the lock. For a lockable script it's important, because you usually want to protect the "environment" (do not start another instance while _something_ is still running). – przemoc Oct 21 '15 at 20:18
  • 3
    @JayParoline But you may change the behavior explained above by adding `( eval "exec $LOCKFD>&-"` before your stuff and `)` after, so everything running within such block won't inherit LOCKFD (and obviously the lock put on it). – przemoc Oct 21 '15 at 20:21
  • @przemoc cool, that makes sense, I just assumed that killing the parent would kill the child, but I never checked.. I ended up going with http://stackoverflow.com/a/1441036/160709 instead, which works the way I need but I can see how this is a more correct solution – Jay Paroline Oct 28 '15 at 05:47
  • @przemoc, incidentally, that's no longer true with bash 4.1 or newer, where `eval` is no longer needed for redirection with FDs from expansion results. – Charles Duffy Jan 02 '16 at 15:58
  • @overthink, (see above -- with new enough bash, the `eval` is in fact no longer necessary). – Charles Duffy Jan 02 '16 at 15:59
  • 3
    @przemoc We've been using this code, and there is a subtle race condition here. It's possible for a 2nd process to acquire a file descriptor on the lockfile between "_lock xn" and " && rm -f $LOCKFILE" when the 1st process exits. The 2nd process is then running, but the lockfile will have been unlinked from the file system by the "rm". A 3rd process can come along, create a new lockfile at the same name, acquire a lock on it, and also start running. After acquiring the lock, you need to check that what you acquired is still on the filesystem - http://stackoverflow.com/questions/17708885/ – Ivan Hamilton Nov 02 '16 at 21:18
  • Note the Stack Overflow contribution terms require contributed code be under MIT (now) and previously under creative commons. So including the GPL is a violation of SO terms. See http://meta.stackexchange.com/q/12527/153541 and http://meta.stackexchange.com/q/272956/153541. – studgeek Jan 10 '17 at 21:24
  • @studgeek Thank you for the info. I planned to change the license long time ago, so I finally did it. Hopefully I'll introduce some other changes I mentioned here or on gist in the near future. – przemoc Jan 10 '17 at 23:09
  • I used this in a script A which launches a background process (script B) with nohup. I found that as long as B is running, the eval statement fails with "flock: 9: Bad file descriptor". Is it possible to use this in such a case? So I want to be able to run only one instance of A no matter how many background B processes are running. – Alexandros Nov 01 '18 at 20:38
  • To clarify: when running A the second time (so there's a B instance active) the _prepare_locking fails with "/var/lock/pserver.sh: Permission denied" and thus the file descriptor is then invalid and locking commands fail. So even though the file /var/lock/myscript.sh has been removed after the first instance of A terminated, it cannot be recreated for the second run... – Alexandros Nov 01 '18 at 20:52
  • The trap on EXIT doesn't work very well in some unusual exit circumstances (especially if you use dash); if you trap on "EXIT INT TERM" then you can cover your bases. – Compholio Aug 18 '21 at 23:11
  • 1
    Directory `/var/lock` may not be writeable, resulting in obscure `flock: 99: Bad file descriptor` errors, maybe good to point this out. – plijnzaad Sep 22 '21 at 17:09
  • Why is `_no_more_locking()` needed? Not removing the lockfile (i.e. not setting the trap at all) does not affect the locking logic. – laur Aug 06 '23 at 13:54
124

If the script is the same across all users, you can use a lockfile approach. If you acquire the lock, proceed; otherwise show a message and exit.

As an example:

[Terminal #1] $ lockfile -r 0 /tmp/the.lock
[Terminal #1] $ 

[Terminal #2] $ lockfile -r 0 /tmp/the.lock
[Terminal #2] lockfile: Sorry, giving up on "/tmp/the.lock"

[Terminal #1] $ rm -f /tmp/the.lock
[Terminal #1] $ 

[Terminal #2] $ lockfile -r 0 /tmp/the.lock
[Terminal #2] $ 

After /tmp/the.lock has been acquired your script will be the only one with access to execution. When you are done, just remove the lock. In script form this might look like:

#!/bin/bash

lockfile -r 0 /tmp/the.lock || exit 1

# Do stuff here

rm -f /tmp/the.lock
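
As noted in the comments below, a script that dies before reaching the final rm leaves the lock behind. A minimal variant (still using procmail's lockfile) that cleans up however the script ends might look like:

#!/bin/bash

lockfile -r 0 /tmp/the.lock || exit 1

trap 'rm -f /tmp/the.lock' EXIT        # remove the lock on any exit
trap 'exit 2' HUP INT QUIT PIPE TERM   # turn fatal signals into an exit so the EXIT trap runs

# Do stuff here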
ezpz
  • +1. Even if behavior differs across users, OP could use `lockfile`. Just have a separate lockfile for each user or group that's allowed to run their own instance. – outis Nov 11 '09 at 13:39
  • 3
    Can we have an example code snippet? – martin clayton Nov 11 '09 at 13:54
  • 3
    Added an example and skeleton script. – ezpz Nov 11 '09 at 14:26
  • 12
    I don't have the lockfile program on my Linux, but one thing bothers me - will it work if the first script dies without removing the lock? i.e. in such a case I want the next run of the script to run, and not die "because the previous copy is still working" –  Nov 11 '09 at 15:19
  • That would involve some notion of `try` / `catch`. This is not impossible to fake, but AFAIK is not directly implemented in bash. Any (bash) solution will be confronted with this same problem, however... – ezpz Nov 11 '09 at 17:08
  • @depesz - if you make your lockfile name include the process id somehow (maybe filename.$$) you can check whether a lockfile has gone stale. I'm not saying it's pretty though... – martin clayton Nov 11 '09 at 22:44
  • 3
    You should also use the trap builtin to catch any signals that might kill your script prematurely. Near the top of the script, add something like: trap " [ -f /var/run/my.lock ] && /bin/rm -f /var/run/my.lock" 0 1 2 3 13 15 You can search /usr/bin/* for more examples. – Shannon Nelson Nov 12 '09 at 04:07
  • 8
    @user80168 Current Ubuntu (14.04) has available a package called "lockfile-progs" (NFS-safe locking library) that provides lockfile-{check,create,remove,touch}. man page says: "Once a file is locked, the lock must be touched at least once every five minutes or the lock will be considered stale, and subsequent lock attempts will succeed...". Seems like a good package to use and mentions a "--use-pid" option. – Kalin Apr 17 '14 at 01:43
  • 1
    @user80168 On Debian (at least on Jessie (8) and Stretch (9)) lockfile is part of the procmail package. Moreover it has a "locktimeout" parameter. If locktimeout is given, the program checks if the modification time of the lockfile is older than locktimeout seconds and if this is true then it ignores the lockfile as it must belong to some old, already dead program. – Tylla Apr 11 '18 at 22:54
  • @Kalin Yes, the lockfile-progs suite seems usable as well, but the usage of the lockfile-touch utility as suggested in the manual page of lockfile-create (start lockfile-touch in background which will periodically touch the lock file until killed, do the business in parallel, kill lockfile-touch) is prone to the same error of the "control program dying in mid-air" as any other naive solution. – Tylla Apr 11 '18 at 23:02
  • How does it behave after power loss? – Paul Jan 11 '22 at 11:35
51

I think flock is probably the easiest (and most memorable) variant. I use it in a cron job to auto-encode DVDs and CDs.

# try to run a command, but fail immediately if it's already running
flock -n /var/lock/myjob.lock   my_bash_command

Use -w for timeouts or leave out options to wait until the lock is released. Finally, the man page shows a nice example for multiple commands:

   (
     flock -n 9 || exit 1
     # ... commands executed under lock ...
   ) 9>/var/lock/mylockfile
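
Going back to the -w option mentioned above, a bounded wait instead of failing immediately might look like this (the 60-second timeout is just an illustration):

# wait up to 60 seconds for the lock, then give up
flock -w 60 /var/lock/myjob.lock   my_bash_command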
Jake Biesinger
  • 1
    I agree, flock is nice, especially compared to lockfile since flock is usually pre-installed on most Linux distros and doesn't require a large unrelated utility like postfix the way lockfile does. – Cerin Nov 17 '15 at 16:57
  • @jake Biesinger am I locking the .sh file or the file that my application writes its output to via the .sh file? I am new to bash scripting, so where do I have to put this in my script, and how do I do the unlocking? – A Sahra Dec 18 '16 at 04:28
  • @Cerin I need to do the same thing with an ffmpeg conversion process, so I need the first process to finish regardless of crontab running every minute? Please, I need help with this – A Sahra Dec 18 '16 at 04:31
  • very nice ! thk – Paul Aug 29 '18 at 13:50
  • flock works well until you realise your application didn't terminate or hung. I have to use it together with timeout to limit the execution time and to prevent the lock file from not being released when the application hangs – James Tan Nov 23 '19 at 19:55
14

Use the bash `set -o noclobber` option and attempt to overwrite a common file.

This "bash friendly" technique will be useful when flock is not available or not applicable.

A short example

if ! (set -o noclobber ; echo > /tmp/global.lock) ; then
    exit 1  # the global.lock already exists
fi

# ... remainder of script ...
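
Note that the short form above never removes /tmp/global.lock. One way to release it, registered once the lock has been acquired (the longer example below uses the same pattern), is:

# after the noclobber echo succeeds, arrange for the lock to be removed on exit
trap 'rm -f /tmp/global.lock' EXIT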

A longer example

This example will wait for the global.lock file but time out after waiting too long.

 function lockfile_waithold()
 {
    declare -ir time_beg=$(date '+%s')
    declare -ir time_max=7140  # 7140 s = 1 hour 59 min.
 
    # poll for lock file up to ${time_max}s
    # put debugging info in lock file in case of issues ...
    while ! \
       (set -o noclobber ; \
        echo -e "DATE:$(date)\nUSER:$(whoami)\nPID:$$" > /tmp/global.lock \ 
       ) 2>/dev/null
    do
        if [ $(($(date '+%s') - ${time_beg})) -gt ${time_max} ] ; then
            echo "Error: waited too long for lock file /tmp/global.lock" 1>&2
            return 1
        fi
        sleep 1
    done
 
    return 0
 }
 
 function lockfile_release()
 {
    rm -f /tmp/global.lock
 }
 
 if ! lockfile_waithold ; then
      exit 1
 fi
 trap lockfile_release EXIT
 
 # ... remainder of script ...

This technique reliably worked for me on a long-running Ubuntu 16 host. The host regularly queued many instances of a bash script that coordinated work using the same singular system-wide "lock" file.

(This is similar to this post by @Barry Kelly which was noticed afterward.)

JamesThomasMoon
  • 2
    One disadvantage of this (as opposed to `flock`-style locking) is that your lock isn't automatically released on `kill -9`, reboot, power loss, etc. – Charles Duffy May 26 '17 at 14:50
  • @CharlesDuffy , you could add a `trap lockfile_release EXIT` which should cover most cases. If power loss is a concern, then using a temporary directory for the lock file would work, e.g. `/tmp`. – JamesThomasMoon Jan 11 '19 at 03:32
  • In addition to reboots &c, exit traps don't fire on SIGKILL (which is used by the OOM killer, and thus a very real-world concern in some environments). I still consider this approach generally less robust to anything where the kernel provides a guarantee of release. (`/tmp` being memory-backed and thus given a hard guarantee of being cleared on reboot is *mostly* the case in recent years, but I'm old-school enough not to trust such facilities to be available; I suppose some rant about kids and a yard is appropriate). – Charles Duffy Jan 11 '19 at 03:39
  • @CharlesDuffy good point, `set -o noclobber` doesn't claim guarantees like `flock`. In my uncommon situation, using `flock` wouldn't work because the script didn't know the target directory it was going process until it had processed some user-passed options (which might also affect the target directory for processing). And, the script might, in some cases, want to release the lock file long before it was done processing. This particular script was used heavily on a multi-host build system. Despite valid concerns, `set -o noclobber` never had a problem of errant multiple lock holders. – JamesThomasMoon Jan 11 '19 at 04:20
  • 2
    I'm not sure I follow why that's a concern; you can certainly grab a lock with a dynamic filename with `flock` after your program has started, and release it without exiting. Using some modern (bash 4.1) facilities to avoid needing to assign a FD manually: `exec {lock_fd}>"$filename" && flock -x "$lock_fd" || { echo "Lock failed" >&2; exit 1; }; ...stuff here...; exec {lock_fd}>&-` – Charles Duffy Jan 11 '19 at 04:47
  • One can also use a code block: `{ flock -x 3 || exit; ...stuff here...; } 3>"$lockfile"` will release the lock on reaching the closing `}`. – Charles Duffy Jan 11 '19 at 04:50
  • I see your point. Perhaps I didn't understand `flock` correctly. I do recall testing various locking strategies with `flock` but deciding that `set -o noclobber` worked better. Unfortunately, I have moved on from that script so I cannot review. – JamesThomasMoon Sep 04 '20 at 23:05
  • 1
    This solution is useful in my case where `flock` and `lockfile` are not available in the environment. – nielsen Jan 20 '21 at 07:43
  • 1
    Note that the trap to delete the file should be before the creation of the file, just in case. – Alexis Wilke Jun 25 '23 at 00:35
  • 1
    @AlexisWilke "_the trap to delete the file should be before the creation of the file_" Often that would make sense in various bash scripts but in this particular case, the script instance may or may not hold the lock file. So if it fails to hold the lock file then it should not attempt to delete the lock file. Your suggestion would cause this script instance to always delete the lock file which will then cause the next script instance to presume it can create the lock file yet there is still a different instance whose lock file was deleted, and there will be multiple script instances running. – JamesThomasMoon Jun 25 '23 at 22:40
4

I'm not sure there's any one-line robust solution, so you might end up rolling your own.

Lockfiles are imperfect, but less so than using 'ps | grep | grep -v' pipelines.

Having said that, you might consider keeping the process control separate from your script - have a start script. Or, at least factor it out to functions held in a separate file, so you might in the caller script have:

. my_script_control.ksh

# Function exits if cannot start due to lockfile or prior running instance.
my_start_me_up lockfile_name;
trap "rm -f $lockfile_name; exit" 0 2 3 15

in each script that needs the control logic. The trap ensures that the lockfile gets removed when the caller exits, so you don't have to code this on each exit point in the script.

Using a separate control script means that you can sanity check for edge cases: remove stale lock files, verify that the lockfile is associated correctly with a currently running instance of the script, give an option to kill the running process, and so on. It also means you've got a better chance of using grep on ps output successfully. A ps-grep can be used to verify that a lockfile has a running process associated with it. Perhaps you could name your lockfiles in some way to include information about the process: user, pid, etc., which can be used by a later script invocation to decide whether the process that created the lockfile is still around.
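
A hypothetical sketch of what my_script_control.ksh might contain, following the names used above (the implementation details are illustrative, not the author's actual code):

# my_script_control.ksh (sketch)
# Exits the caller if another instance recorded in the lockfile is still alive.
my_start_me_up() {
    lockfile_name="/var/tmp/${1:-$(basename "$0")}.lock"
    if [ -e "$lockfile_name" ]; then
        old_pid=$(cat "$lockfile_name" 2>/dev/null)
        if [ -n "$old_pid" ] && kill -0 "$old_pid" 2>/dev/null; then
            echo "Already running as PID $old_pid" >&2
            exit 1
        fi
        rm -f "$lockfile_name"     # stale lock left by a dead instance
    fi
    echo $$ > "$lockfile_name"     # record our PID for later invocations to check
}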

martin clayton
  • +1 for mentioning `trap` – mgalgs Dec 12 '11 at 21:53
  • What is the 0 signal? It can't be seen in `kill -l` – qed Dec 02 '13 at 21:17
  • 1
    @qed - it means run the trap on exit from the script. See http://www.gnu.org/software/bash/manual/bashref.html#index-trap – martin clayton Dec 02 '13 at 21:33
  • It looks much like the `try...catch...finally...` in python. – qed Dec 03 '13 at 17:52
  • @qed: @martin is right, the documentation states that `trap ... 0` is an alias for `trap ... EXIT`. However, when _sending_ signal `0` with `kill -0 ...`, you just check whether the process exists and you are allowed to send a signal to it. This is used for waiting (polling) for the end of one of your processes that is _not_ the son of the current process. Signal 0 does not have any effect. – hagello May 15 '20 at 08:42
4

I found this among the procmail package dependencies:

apt install liblockfile-bin

To run: dotlockfile -l file.lock

file.lock will be created.

To unlock: dotlockfile -u file.lock

To list this package's files / commands, use: dpkg-query -L liblockfile-bin
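
A minimal sketch of using it inside a script (the lock file name is illustrative; if I recall correctly, dotlockfile retries a few times by default before giving up, see its -r option):

#!/bin/sh

# take the lock, or bail out if another instance already holds it
dotlockfile -l /tmp/myscript.lock || exit 1

# make sure the lock is removed however the script ends
trap 'dotlockfile -u /tmp/myscript.lock' EXIT

# ... do the actual work here ...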

James Tan
3

First test example

[[ $(lsof -t "$0" | wc -l) -gt 1 ]] && echo "At least one other instance of $0 is running"

Second test example

currsh=$0
currpid=$$
runpid=$(lsof -t $currsh| paste -s -d " ")
if [[ $runpid == $currpid ]]
then
  sleep 11111111111111111
else
  echo -e "\nPID($runpid)($currpid) ::: At least one of \"$currsh\" is running !!!\n"
  false
  exit 1
fi

Explanation

"lsof -t" lists all PIDs of currently running scripts named "$0".

The "lsof" approach has two advantages:

  1. It ignores PIDs that merely have the script open in an editor such as vim, because vim actually edits its own swap file (e.g. ".file.swp") rather than keeping "$0" open.
  2. It ignores PIDs forked by the currently running script, which most "grep"-based approaches can't do. Use the "pstree -pH pidnum" command to see details about the current process forking status.
yanyingwang
2

I'd also recommend looking at chpst (part of runit):

chpst -L /tmp/your-lockfile.loc ./script.name.sh
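
As a cron entry this might look like the following (the schedule and script path are illustrative); chpst exits without running the script if the lock is already held by another process:

# run every 5 minutes, but skip the run while a previous one still holds the lock
*/5 * * * * chpst -L /tmp/your-lockfile.loc /path/to/script.name.sh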
rud
1

Ubuntu/Debian distros have the start-stop-daemon tool, which serves the same purpose you describe. See also /etc/init.d/skeleton to see how it is used in writing start/stop scripts.
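
A hedged sketch of how that might look for a plain script (names and paths are illustrative, not from the answer):

# start one instance in the background, recording its PID; start-stop-daemon
# refuses to start another one while the recorded process is still running
start-stop-daemon --start --background --make-pidfile \
    --pidfile /var/run/script.name.pid --exec /path/to/script.name.sh

# stop it again using the same pidfile
start-stop-daemon --stop --pidfile /var/run/script.name.pid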

-- Noah

Noah Spurrier
0

One line ultimate solution:

[ "$(pgrep -fn $0)" -ne "$(pgrep -fo $0)" ] && echo "At least 2 copies of $0 are running"
Dm1
  • 1
    `pgrep -fn ... -fo $0` also matches your text editor which has the script open for editing. Is there a workaround for that situation? – Pro Backup Sep 01 '18 at 04:29
  • This is a very specific solution for situations when traditional ways can't be used; if it doesn't match your needs you can still use a lockfile. If you need this one-line solution anyway, you can modify it using $* together with $0 and pass a unique parameter to your script, which will not be present in a text editor command line. – Dm1 Sep 13 '18 at 10:21
  • 2
    This solution suffers from race conditions: the test construct is not atomic. – Adrian Zaugg Jul 27 '19 at 02:54
0

I had the same problem, and came up with a template that uses lockfile, a pid file that holds the process id number, and a kill -0 $(cat $pid_file) check so that aborted scripts do not stop the next run. This creates a foobar-$USERID folder in /tmp where the lockfile and pid file live.

You can still call the script and do other things, as long as you keep those actions in alertRunningPS.

#!/bin/bash

user_id_num=$(id -u)
pid_file="/tmp/foobar-$user_id_num/foobar-$user_id_num.pid"
lock_file="/tmp/foobar-$user_id_num/running.lock"
ps_id=$$

function alertRunningPS () {
    local PID=$(cat "$pid_file" 2> /dev/null)
    echo "Lockfile present. ps id file: $PID"
    echo "Checking if process is actually running or something left over from crash..."
    if kill -0 $PID 2> /dev/null; then
        echo "Already running, exiting"
        exit 1
    else
        echo "Not running, removing lock and continuing"
        rm -f "$lock_file"
        lockfile -r 0 "$lock_file"
    fi
}

echo "Hello, checking some stuff before locking stuff"

# Lock further operations to one process
mkdir -p /tmp/foobar-$user_id_num
lockfile -r 0 "$lock_file" || alertRunningPS

# Do stuff here
echo -n $ps_id > "$pid_file"
echo "Running stuff in ONE ps"

sleep 30s

rm -f "$lock_file"
rm -f "$pid_file"
exit 0
-2

I found a pretty simple way to handle "one copy of script per system". It doesn't allow me to run multiple copies of the script from many accounts though (on standard Linux that is).

Solution:

At the beginning of script, I gave:

pidof -s -o '%PPID' -x $( basename $0 ) > /dev/null 2>&1 && exit

Apparently pidof works great in a way that:

  • it doesn't have a limit on the program name like ps -C ...
  • it doesn't require me to do grep -v grep ( or anything similar )

And it doesn't rely on lockfiles, which for me is a big win, because relying on them means you have to add handling of stale lockfiles - which is not really complicated, but if it can be avoided - why not?

As for checking with "one copy of script per running user", I wrote this, but I'm not overly happy with it:

(
    pidof -s -o '%PPID' -x $( basename $0 ) | tr ' ' '\n'
    ps xo pid= | tr -cd '[0-9\n]'
) | sort | uniq -d

and then I check its output - if it's empty - there are no copies of the script from the same user.

-3

Here's our standard bit. It can recover from the script somehow dying without cleaning up its lockfile.

It writes the process ID to the lock file if it runs normally. If it finds a lock file when it starts running, it will read the process ID from the lock file and check if that process exists. If the process does not exist it will remove the stale lock file and carry on. And only if the lock file exists AND the process is still running will it exit. And it writes a message when it exits.

# lock to ensure we don't get two copies of the same job
script_name="myscript.sh"
lock="/var/run/${script_name}.pid"
if [[ -e "${lock}" ]]; then
    pid=$(cat ${lock})
    if [[ -e /proc/${pid} ]]; then
        echo "${script_name}: Process ${pid} is still running, exiting."
        exit 1
    else
        # Clean up previous lock file
        rm -f ${lock}
    fi
fi
trap "rm -f ${lock}; exit" INT TERM EXIT
# write $$ (PID) to the lock file
echo "$$" > ${lock}
Hamish Downer
-4

from within your script:

ps -ef | grep $0 | grep $(whoami)
ennuikiller
  • 2
    This has the relatively well known bug with grep finding itself. Of course I can work around it, but it's not something I would call simple and robust. –  Nov 11 '09 at 13:28
  • I've seen many 'grep -v grep's. Your ps might support -u $LOGNAME too. – martin clayton Nov 11 '09 at 13:52
  • it's relatively robust in that it uses $0 and whoami to ensure you're getting only the script started by your userid – ennuikiller Nov 11 '09 at 13:57
  • ennuikiller: no - grep $0 will find processes like $0 (for example the one that is running this ps right now), but it will *also* find a grep itself! so basically - it will virtually always succeed. –  Nov 11 '09 at 14:02
  • @depesz, yes of course I'm assuming you're doing grep -v grep as well! – ennuikiller Nov 11 '09 at 14:57
  • That's not a bug, that's a feature! Also, `ps -ef | grep [\ ]$0` eliminates finding the grep. – Dennis Williamson Nov 11 '09 at 15:10
  • 1
    @ennuikiller: that assumption was not in your example. besides - it will find "call.sh" even in things like "call.sh". and it will also fail if i'll call it from ./call.sh itself (it will find the call.sh copy that is doing the check, not some previous) - so. in short - this is not solution. it can be changed to be solution by adding at least 2 more greps, or changing existing one, but it doesn't on its own solve the problem. –  Nov 11 '09 at 15:21
  • depesz is correct. Instead use pgrep with -u or -U options and exclude $$ for current process id – Angelo Jun 28 '13 at 21:29