
Normally an executable file cannot be overwritten while a process is started from this file. At any time the process could try to reload a missing code section.

Is there a possibility to circumvent/break this lock?

My process does mlockall() so all code pages are already loaded.

The goal is for the process (a long-running task) to update itself with as little downtime as possible.

After the download, an execl(argv[0],NULL) call should activate the updated code.

pgsellmann
  • Using `mv` to substitute the runtime and reexec is not the way? – KamilCuk Oct 19 '18 at 18:07
  • A running process will not access the executable file on disk to load code since the code has been loaded into memory at startup. That means the executable file on disk isn't locked. – hek2mgl Oct 19 '18 at 18:11
  • In order to avoid downtime, you'll usually run multiple instances of a service behind a load balancer and upgrade them one by one having at any point of the upgrade at least enough instances running to handle incoming requests. – hek2mgl Oct 19 '18 at 18:13
  • Welcome to StackOverflow. @KamilCuk has the answer. Learn to search for answers before posting, because this question has been answered several times: https://stackoverflow.com/questions/3365347/how-can-we-overwrite-exe-files-while-users-are-running-them , https://stackoverflow.com/questions/9162969/how-can-a-c-binary-replace-itself – Jeff Learman Oct 19 '18 at 18:44
  • Possible duplicate of [How can a C++ binary replace itself?](https://stackoverflow.com/questions/9162969/how-can-a-c-binary-replace-itself) – Jeff Learman Oct 19 '18 at 18:45
  • Also be sure to read the advice here: https://stackoverflow.com/questions/6695496/rename-a-running-executable-exe-file – Jeff Learman Oct 19 '18 at 18:46

2 Answers


To me, you are wrong on several points:

  • First, you can overwrite a binary that is running on Linux (more specifically, a binary whose file is stored on an ext{2,3,4} file system). The reason is simple: as long as a file descriptor is still open on a file, the inode associated with this file is kept 'allocated' by the driver; only when the last file descriptor on it is closed are the blocks freed. So there is no risk of losing the file's data.
  • Code is mapped into memory at process startup, so there cannot be missing code (and since mmap uses a file descriptor, even with lazy mapping the whole file remains "mappable").
  • mlockall is used to lock pages in memory, thus preventing them from being swapped out; this has little to do with locking a file on the file system.
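
The first point can be checked with a small experiment. The following is only a sketch, not part of the original answer: the helper name read_after_unlink and the /tmp scratch path are assumptions made for illustration. It removes a file's name while a descriptor is still open and then reads the data back through that descriptor.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: creates a temporary file containing `text`, opens it
 * for reading, removes its name, and then reads it back through the
 * still-open descriptor. Returns the number of bytes read, or -1 on error. */
ssize_t read_after_unlink(const char *text, char *buf, size_t buflen)
{
    char path[] = "/tmp/unlink-demo-XXXXXX";   /* assumed scratch location */
    size_t len = strlen(text);

    int wfd = mkstemp(path);
    if (wfd == -1)
        return -1;
    if (write(wfd, text, len) != (ssize_t)len) {
        close(wfd);
        unlink(path);
        return -1;
    }
    close(wfd);

    int rfd = open(path, O_RDONLY);
    if (rfd == -1) {
        unlink(path);
        return -1;
    }

    /* The name disappears here, but the inode and its data blocks stay
     * allocated because rfd still refers to them. */
    if (unlink(path) == -1) {
        close(rfd);
        return -1;
    }

    ssize_t n = read(rfd, buf, buflen);
    close(rfd);   /* last reference gone: the kernel may now free the blocks */
    return n;
}
```

A running executable is in the same position as the reader here: the kernel holds a reference to the old inode for as long as the process runs.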

In the end, nothing prevents you from doing what you asked for.

OznOg
  • Does the filesystem matter in overwriting a "running" binary? Can't you do that on any filesystem? – KamilCuk Oct 19 '18 at 18:46
  • The file system (kernel) does not care about the running process. Nevertheless, this obviously depends on the filesystem you use, as this kind of "keep all inodes" is clearly an implementation choice. (On Windows, for example, this is not allowed, so I imagine you cannot do that on Linux using an NTFS partition.) – OznOg Oct 19 '18 at 18:50

Normally an executable file cannot be overwritten while a process is started from this file. Is there a possibility to circumvent/break this lock?

Yes. Do not overwrite the executable; replace it.

That is, you save the new executable under a temporary name in the same directory (or anywhere in the same file system -- must be on the same mount!), then rename() the temporary file over the executable. (link() by itself will not do this, because it fails with EEXIST when the target name already exists.)

In a shell script, you can use mv -f newbinary oldbinary, if both newbinary and oldbinary are in the same file system and mount. In a Bash script, you might use something like

    #!/bin/bash
    BINDIR=/usr/bin

    # Autoremoved work directory
    Work="$(mktemp -d)" || exit 1
    trap "cd / ; rm -rf '$Work'" EXIT

    # ... Check if new binaries available ...
    #     Otherwise: exit 0

    # ... Download new binaries under "$Work/" ...

    # Copy 'executable' to $BINDIR, under a temporary name
    # ($$ is the PID of this shell; Bash has no $PID variable)
    tempbin="executable.$$-$RANDOM$RANDOM$RANDOM"
    if ! mv -f "$Work/executable" "$BINDIR/$tempbin" ; then
        # Failed
        exit 1
    elif ! mv -f "$BINDIR/$tempbin" "$BINDIR/executable" ; then
        # Failed
        exit 1
    fi

    # Successfully replaced.
    exit 0

This works on all POSIXy systems, because a file name is completely separate from the inode that specifies its contents, access mode, ownership, timestamps, and so on.

In practice, the kernel will retain the old inode for as long as any process is executing it or has it open. However, the file name will immediately point to the new inode, with the new executable contents. So, essentially, the rename simply changes which inode the file name refers to. That is also why the temporary file must reside on the same filesystem (same mount).
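
For a program that does the replacement itself rather than calling mv, the same idea might look like the C sketch below. The function name replace_by_rename and its parameters are hypothetical, and the caller is assumed to pick a tmp_path on the same filesystem as target.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: installs `len` bytes from `data` as `target` by
 * writing them to `tmp_path` and renaming that file over the target.
 * `tmp_path` must be on the same filesystem as `target`.
 * Returns 0 on success, -1 on failure. */
int replace_by_rename(const char *target, const char *tmp_path,
                      const void *data, size_t len)
{
    int fd = open(tmp_path, O_WRONLY | O_CREAT | O_EXCL, 0755);
    if (fd == -1)
        return -1;

    const char *p = data;
    while (len > 0) {
        ssize_t n = write(fd, p, len);
        if (n <= 0) {
            close(fd);
            unlink(tmp_path);
            return -1;
        }
        p += n;
        len -= (size_t)n;
    }

    /* Push the contents to disk before the new name becomes visible. */
    if (fsync(fd) == -1) {
        close(fd);
        unlink(tmp_path);
        return -1;
    }
    if (close(fd) == -1) {
        unlink(tmp_path);
        return -1;
    }

    /* rename() atomically makes the target name refer to the new inode;
     * processes still using the old inode are not disturbed. */
    if (rename(tmp_path, target) == -1) {
        unlink(tmp_path);
        return -1;
    }
    return 0;
}
```

Because the old file is never opened for writing, a process that is still executing it keeps its own inode untouched.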

The goal is the process (a long running task) should update itself with as little downtime as possible.

It is a common security hole to allow a process to change itself. Typically, it is not even allowed at all in POSIXy systems, unless the process is run with superuser privileges (i.e., as root or in Linux, with CAP_FOWNER capability). You do NOT want to do this.

(Just because it is common to do so, for example with PHP web stuff, does not make it sane or safe. If it did, then we'd have to agree that excrement tastes good, because there are billions of flies and dung beetles that think so. If you take a look, you'll find that such web services ALL have had severe security problems, some directly related to this update mechanism. Some maintainers of said package claim that problems during updates, like man-in-the-middle attacks, are the users' fault, not theirs, though. They're wrong, of course.)

Instead, you should have a separate, privileged service that periodically checks for updates, and when found, retrieves the new version using the above replacement method. In the simplest case, this can simply be run from cron or similar.

If your users really want you to, you can create a minimal C daemon that periodically checks if a new version is available. You can have it receive on a specific Unix domain datagram address, so that your executable can send a single character to it (no matter which user it is run as) for the update daemon to do a check then and there (unless it has checked recently enough). Essentially, it'll just wait (say, using select()) for enough time to elapse, or a specific request to check. When it is time, it'll run a shell script to check if a new executable is available (say, using popen() etc.; the typical location to save such scripts is in /usr/lib/yourservice/). If the script responds that a new version is available, run another script to download and replace the binary. If the process receives a SIGHUP signal, do the check immediately; if it receives a SIGTERM signal, exit. That way it can be run as a service, and won't consume much resources when running.
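
The waiting step of such a daemon could be sketched roughly as follows. This is an illustration only: the function name wait_for_check_request is hypothetical, and creating and binding the Unix domain datagram socket, the signal handlers, and running the scripts are all left out.

```c
#define _POSIX_C_SOURCE 200809L
#include <sys/select.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper for the daemon's main loop: sleeps until either
 * `timeout_sec` seconds have passed or a datagram arrives on `sockfd`.
 * Returns 1 if a request was received, 0 on timeout, and -1 on error
 * (including EINTR, e.g. when SIGHUP or SIGTERM interrupts the wait). */
int wait_for_check_request(int sockfd, long timeout_sec)
{
    fd_set readable;
    struct timeval tv;

    FD_ZERO(&readable);
    FD_SET(sockfd, &readable);
    tv.tv_sec = timeout_sec;
    tv.tv_usec = 0;

    int ready = select(sockfd + 1, &readable, NULL, NULL, &tv);
    if (ready == -1)
        return -1;
    if (ready == 0)
        return 0;               /* enough time has elapsed */

    /* Consume the request; its single character carries no information. */
    char request;
    if (recv(sockfd, &request, 1, 0) == -1)
        return -1;
    return 1;
}
```

The caller would run the check script whenever this returns 0 or 1, and inspect errno and its signal flags when it returns -1.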

In your long-running executable, if it is at a point where it can replace itself with a newer version, use stat() on /proc/self/exe and argv[0], to verify if they have the same st_dev and st_ino. If they do not, then the update service has provided a newer version of the executable, and your service can run

    if (argv[0][0] == '/')
        execv(argv[0], argv);
    else
        execvp(argv[0], argv);

or, if you define the absolute path to your executable at compile time in say exepath, then

    execvp(exepath, argv);

to replace itself with the newer version.
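
The st_dev and st_ino comparison described above might be wrapped in a small helper like this one. The name same_file is an assumption made for illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <sys/stat.h>

/* Hypothetical helper: returns 1 if both paths refer to the same inode on
 * the same device, 0 if they refer to different files, and -1 if either
 * path cannot be examined with stat(). */
int same_file(const char *path_a, const char *path_b)
{
    struct stat a, b;

    if (stat(path_a, &a) == -1 || stat(path_b, &b) == -1)
        return -1;
    return a.st_dev == b.st_dev && a.st_ino == b.st_ino;
}
```

The service would then re-exec only when same_file("/proc/self/exe", argv[0]) returns 0, and carry on unchanged for 1 or -1.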

Do note that such a process should close all open file descriptors (except for standard streams; 0, 1, and 2, or STDIN_FILENO, STDOUT_FILENO, and STDERR_FILENO), when it starts up. (That is, close all open file descriptors between 3 and sysconf(_SC_OPEN_MAX), inclusive.) This is because exec*() functions do not close file descriptors (other than those marked O_CLOEXEC/FD_CLOEXEC), so any descriptors that might be open at time of exec will be left open. Doing it this way also means that if exec fails, your service can continue running normally.

Nominal Animal
  • This is the exact solution for my problem. I was not aware that open();write();close() is handled differently from rename() in this context. – pgsellmann Oct 22 '18 at 10:45