How do I ensure that only one copy of a daemon is running?

Question

My code to daemonize a process is:

static int daemonize( const char *lockfile )
{
    pid_t pid, sid, parent;
    int lfp = -1;
    char buf[16];

    /* already a daemon */
    if ( getppid() == 1 ) return 1;

    /* Each copy of the daemon will try to 
     * create a file and write its process ID 
     * in it. This will allow administrators 
     * to identify the process easily
     */ 
    /* Create the lock file as the current user */
    if ( lockfile && lockfile[0] ) {
        lfp = open(lockfile,O_RDWR|O_CREAT,LOCKMODE); 
        if ( lfp < 0 ) {
            syslog( LOG_ERR, "unable to create lock file %s, code=%d (%s)",
                    lockfile, errno, strerror(errno) );
            exit(EXIT_FAILURE);
        }
    }

    /* If the file is already locked, another copy of the daemon is
     * running. In that case filelock() fails with errno set to
     * EACCES or EAGAIN.
     */
    if (filelock(lfp) < 0) {
        if (errno == EACCES || errno == EAGAIN) {
            close(lfp);
            //return(1);
            exit(EXIT_FAILURE);
        }
        syslog(LOG_ERR, "can't lock %s: %s", lockfile, strerror(errno));
        exit(EXIT_FAILURE);
    }
    ftruncate(lfp, 0);
    sprintf(buf, "%ld", (long)getpid());
    write(lfp, buf, strlen(buf)+1); 

    /* Drop user if there is one, and we were run as RUN_AS_USER */
    if ( getuid() == 0 || geteuid() == 0 ) {
        struct passwd *pw = getpwnam(RUN_AS_USER);
        if ( pw ) {
            syslog( LOG_NOTICE, "setting user to " RUN_AS_USER );
            setuid( pw->pw_uid );
        }
    }

    /* Trap signals that we expect to receive */
    signal(SIGCHLD,child_handler);
    signal(SIGUSR1,child_handler);
    signal(SIGALRM,child_handler);

    /* Fork off the parent process */
    pid = fork();
    if (pid < 0) {
        syslog( LOG_ERR, "unable to fork daemon, code=%d (%s)",
                errno, strerror(errno) );
        exit(EXIT_FAILURE);
    }
    /* If we got a good PID, then we can exit the parent process. */
    if (pid > 0) {
        /* Wait for confirmation from the child via SIGUSR1 or SIGCHLD, or
           for two seconds to elapse (SIGALRM).  pause() should not return. */
        alarm(2);
        pause();

        exit(EXIT_FAILURE);
    }

    /* At this point we are executing as the child process */
    parent = getppid();

    /* Cancel certain signals */
    signal(SIGCHLD,SIG_DFL); /* A child process dies */
    signal(SIGTSTP,SIG_IGN); /* Various TTY signals */
    signal(SIGTTOU,SIG_IGN);
    signal(SIGTTIN,SIG_IGN);
    signal(SIGHUP, SIG_IGN); /* Ignore hangup signal */
    signal(SIGTERM,SIG_DFL); /* Die on SIGTERM */

    /* Change the file mode mask */
    umask(0);

    /* Create a new SID for the child process */
    sid = setsid();
    if (sid < 0) {
        syslog( LOG_ERR, "unable to create a new session, code %d (%s)",
                errno, strerror(errno) );
        exit(EXIT_FAILURE);
    }

    /* Change the current working directory.  This prevents the current
       directory from being locked; hence not being able to remove it. */
    if ((chdir("/")) < 0) {
        syslog( LOG_ERR, "unable to change directory to %s, code %d (%s)",
                "/", errno, strerror(errno) );
        exit(EXIT_FAILURE);
    }

    /* Redirect standard files to /dev/null */
    freopen( "/dev/null", "r", stdin);
    freopen( "/dev/null", "w", stdout);
    freopen( "/dev/null", "w", stderr);

    /* Tell the parent process that we are A-okay */
    kill( parent, SIGUSR1 );
    return 0;
}

I want to run only one instance of my program at a time when I start it using:

service [script] start

But when this command is run two or more times, it starts that many daemon processes, all of which keep running. I want to prevent this. Any suggestions would be highly appreciated.

Read [this answer](http://stackoverflow.com/questions/688343/reference-for-proper-handling-of-pid-file-on-unix) and be very careful about race conditions and errors. — Emmet, Apr 03 '14 at 17:51

score 2 · Accepted Answer · answered May 06 '11 at 21:11

Don't use a file lock; instead, use the O_EXCL flag to open(), which will fail with EEXIST if the file already exists. This is normally done with the pid file, since it needs to be exclusive anyway.
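A minimal sketch of this approach, not a definitive implementation. The helper name `create_pidfile_excl`, the file mode `0644`, and the return-code convention are assumptions made for illustration; they do not come from the question's code.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: create the pid file exclusively and record our pid.
 * Returns 0 on success, 1 if the file already exists (another instance is
 * presumably running), and -1 on any other error. */
static int create_pidfile_excl(const char *path)
{
    char buf[32];
    int fd, len;

    /* O_CREAT|O_EXCL makes "check that it is absent, then create it"
     * a single atomic step, so two starters cannot both succeed. */
    fd = open(path, O_WRONLY | O_CREAT | O_EXCL, 0644);
    if (fd < 0)
        return (errno == EEXIST) ? 1 : -1;

    len = snprintf(buf, sizeof buf, "%ld\n", (long)getpid());
    if (write(fd, buf, (size_t)len) != (ssize_t)len) {
        close(fd);
        unlink(path);   /* do not leave a half-written pid file behind */
        return -1;
    }
    close(fd);
    return 0;
}
```

To record the daemon's pid rather than the parent's, this would be called in the child after `fork()`. The daemon also has to `unlink()` the file when it shuts down, otherwise the next start sees `EEXIST`.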

geekosaur


@geekosaur But in filelock() I lock the pid file with fcntl() using F_WRLCK, which is already an exclusive write lock. Shouldn't that be equivalent to O_EXCL in open()? – Sushant Jain May 06 '11 at 21:19
@Sushant: That may be, but it's doing things the harder (and hence more prone to errors and race conditions) way. In general I prefer the simpler ways with fewer steps that can go wrong, rather than having to track down odd problems such as you're running into. Locking should be used when interleaving access to a file among multiple processes. (You also didn't show the `filelock` function so I have no idea if it's correct; there are several gotchas in `fcntl()` locking.) – geekosaur May 06 '11 at 21:24
@Geekosaur Here is my filelock function: `int filelock(int fd){ struct flock fl; fl.l_type = F_WRLCK; /* F_RDLCK, F_WRLCK (an exclusive write lock), or F_UNLCK (unlocking a region) */ fl.l_start = 0; /* offset in bytes relative to l_whence */ fl.l_whence = SEEK_SET; /* SEEK_SET, SEEK_CUR, SEEK_END */ fl.l_len = 0; /* means lock to EOF */ /* fcntl() can change the properties of a file that is already open; here F_SETLK sets the record lock defined in the flock structure */ return(fcntl(fd, F_SETLK, &fl)); }` – Sushant Jain May 06 '11 at 21:26
@Sushant: `O_EXCL` isn't the same thing as a file lock, by the way; it means that the file must not already exist. This is a simpler and therefore more reliable condition. And you are in fact tripping over one of the `fcntl()` locking gotchas: the lock isn't propagated to the child process, so the pid file is unlocked when the parent exits. If you insist on file locking, you need to do it in the child, not the parent. (And you do anyway, as you're otherwise saving the parent's pid which isn't relevant.) – geekosaur May 06 '11 at 21:46
@Geekosaur It's working with `O_EXCL` in open function. Should I remove filelock function or not? – Sushant Jain May 06 '11 at 21:55
I would remove it; if you really want it, as I said it needs to be moved to the child instead of the parent. You definitely want to move writing the pid to the child. – geekosaur May 06 '11 at 22:05
Using O_EXCL is a *really unreliable* way to ensure that a single daemon is running. If the daemon is killed abruptly (SIGKILL, SIGSEGV, power outage, etc.), it leaves a stale lock file behind, and you won't be able to restart the daemon without manually removing that file. I have seen this happen in a production environment. – user389238 Mar 11 '14 at 20:12

score 0 · Answer 2 · answered Jul 05 '19 at 13:34

Another alternative to a pid file is to bind a TCP or UDP port from your daemon. A second instance of the daemon will fail when it tries to bind that same port.
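
A minimal sketch of the port-based idea, assuming a TCP socket bound to the loopback address. The helper name `claim_instance_port` is hypothetical, and the port number is whatever fixed value the daemon reserves for itself.

```c
#include <arpa/inet.h>
#include <errno.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: bind a loopback TCP socket to a fixed port.
 * Returns the bound socket (keep it open while the daemon runs), or -1
 * on failure; errno is EADDRINUSE when another socket already owns the
 * port. */
static int claim_instance_port(unsigned short port)
{
    struct sockaddr_in addr;
    int fd, saved;

    fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0)
        return -1;

    memset(&addr, 0, sizeof addr);
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);  /* 127.0.0.1 only */
    addr.sin_port = htons(port);

    /* SO_REUSEADDR is deliberately not set, so a second bind fails. */
    if (bind(fd, (struct sockaddr *)&addr, sizeof addr) < 0) {
        saved = errno;       /* close() may overwrite errno */
        close(fd);
        errno = saved;
        return -1;
    }
    return fd;
}
```

The socket is closed by the kernel when the process dies, so nothing stale is left behind. The trade-off is that an unrelated program using the same port would also prevent the daemon from starting.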

jav
