4

I am building a little script to update application files on a Raspberry Pi.

It will do the following:

  1. Download a zip file of the application files
  2. Unzip them
  3. Copy each one to the right place and make it executable, etc., as needed.
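
A minimal sketch of that flow (the URL and paths are placeholders, not my real ones):

#!/usr/bin/env bash
tmpdir=$(mktemp -d)
curl -fsSL https://example.com/app.zip -o "$tmpdir/app.zip"    # 1. download
unzip -q "$tmpdir/app.zip" -d "$tmpdir/unpacked"               # 2. unzip
install -m 755 "$tmpdir/unpacked/app" /usr/local/bin/app       # 3. copy + make executable
install -m 755 "$tmpdir/unpacked/updatescript.sh" /usr/local/bin/updatescript.sh   # <-- the problem file
rm -rf "$tmpdir"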

The problem I'm having is that one of the files is `updatescript.sh` itself.

I've read that it is dangerous to update/change a bash script while it is executing; see *Edit shell script while it's running*.

Is there a good way to achieve what I'm trying to do?

user230910
  • Copy your script to /tmp, re-run yourself from /tmp, then do the work. – KamilCuk Jan 09 '19 at 00:32
  • But ... if, as step 1, I copy myself to /tmp, then re-run from /tmp, I'll overwrite myself... *my brain hurts* – user230910 Jan 09 '19 at 00:35
  • Overwriting yourself is fine, as long as it's a different inode with the same filename. The only thing that's unsafe is *changing the file in-place* (a quick inode demo follows these comments). – Charles Duffy Jan 09 '19 at 01:02
  • You should consider packaging your files, and installing + updating your package instead of doing this (see the packaging sketch after these comments). – midor Jan 09 '19 at 01:08
  • Very much that (which is to say, what @midor said). Proper package management gives you uninstalls, signature validation, verification support, and lots of other goodies. And it also prevents the user from needing to trust that you know what you're doing with respect to the security aspects of building the update infrastructure, which are nontrivial and important. – Charles Duffy Jan 09 '19 at 01:28
  • Proper package management also means that the mechanisms a user uses to control the rest of their software installation (ensuring that test and production environments match, auditing which software is installed where, etc.) work consistently for everything installed. As soon as you roll your own, you're breaking compatibility with tools that depend on standardized mechanisms, breaking customers who have their regular update mechanisms customized to work with their network security mechanisms, and so on. – Charles Duffy Jan 09 '19 at 01:29
  • ...so, you might want to offer the option of an out-of-band auto-update mechanism for customers who don't have it together enough to be managing things themselves, but if you make that the *only* way to install your software, without an opt-out mechanism or a means of generating a standalone package, you'll see some of your more sophisticated users giving you the side-eye or avoiding you outright. – Charles Duffy Jan 09 '19 at 01:31
  • Thanks for the perspective @CharlesDuffy! It makes complete sense and is a more sustainable approach. – user230910 Jan 09 '19 at 01:57
  • Even if you need an out-of-band update, it is probably better to base it on a package, because most package formats make it easy to update the right locations (e.g. BSD or Arch packages are typically just a tar archive that is untarred to / with the proper internal structure). I think @CharlesDuffy should make an answer out of his comments, and that should be the accepted answer. If you are running this anywhere in prod, using the script is just irresponsible, but if it is for private use and you don't download the script, you should be good, and it may be easier for you to maintain. – midor Jan 09 '19 at 12:43
  • @midor, ...an argument against, and I *do* understand it, being that if you have users on N different architectures, supporting every single package manager that exists out there is unpleasant. Personally, I'm using Nix for new projects (a package manager used natively in its own NixOS, but which can be deployed on other Unix-likes as well, either by an unprivileged user for a home-directory installation or installed globally); it effectively models software builds as Merkle trees keyed by hashes of dependencies/build steps/etc., and a lot of very nice characteristics fall out of that. – Charles Duffy Jan 09 '19 at 12:58
  • (If you're using a hash of "do we have a version built with build steps having ${this hash}, with dependency X having ${this hash} and dependency Y having ${this hash}" to look up your binary downloads or store your builds, that makes it very easy to keep your binaries distinct from each other, to ensure that rebuilds against new chains happen if-and-only-if needed... and to avoid leaking anything about customer configurations when a binary lookup request is a cache miss, because all that was sent over the wire was a sha256.) – Charles Duffy Jan 09 '19 at 13:04
  • You are essentially resorting to a platform-independent package manager, because you know how painful not having a package manager is. If you need to support multiple platforms, not using a package manager is even worse, because random updates to your dependencies break you, you're left to figure out from user reports what went wrong, and you still can't prevent it from happening. The only legitimate use is personal use; everything else is effectively tech debt that will haunt you later on, because dependencies break. – midor Jan 09 '19 at 14:24
  • Agreed wholeheartedly. Nix is only useful because there's nixpkgs, a huge library of descriptions of how to build various potential dependencies (so one can ensure, on any platform, that one's dependencies are locally installed and built in a consistent way); it's absolutely a package manager, and I'd never think of doing serious work without one (without moving to a completely different paradigm, such as distributing complete, immutable, read-only system images -- look at the ChromeOS model, or casync/desync in combination with mkosi, or the like). – Charles Duffy Jan 09 '19 at 14:40
  • Wow, what a useful and informative discussion! I'm learning a LOT from you guys, thanks! :) – user230910 Jan 09 '19 at 21:47
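
To make the inode point from these comments concrete, here is a quick throwaway experiment (the script path is made up):

echo 'echo v1' > /tmp/demo.sh
ls -i /tmp/demo.sh          # note the inode number

echo 'echo v2' >> /tmp/demo.sh
ls -i /tmp/demo.sh          # in-place edit: same inode, which is what's unsafe for a running script

echo 'echo v3' > /tmp/demo.sh.new
mv /tmp/demo.sh.new /tmp/demo.sh
ls -i /tmp/demo.sh          # rename-over: a new inode; a running copy keeps reading the old file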

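And to make the packaging suggestion concrete: Raspberry Pi OS is Debian-based, so a minimal local package can be built with dpkg-deb. Every name, version, and path below is a made-up example:

mkdir -p pkg/DEBIAN pkg/usr/bin
cat > pkg/DEBIAN/control <<'EOF'
Package: myapp
Version: 1.0.0
Architecture: armhf
Maintainer: You <you@example.com>
Description: My application and its update script
EOF
cp app updatescript.sh pkg/usr/bin/
dpkg-deb --build pkg myapp_1.0.0_armhf.deb
# install or upgrade with: sudo dpkg -i myapp_1.0.0_armhf.deb
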
2 Answers

7

What you've read is badly overblown.

It's completely safe to overwrite a shell script in-place by `mv`ing a different file over it. When you do this, the old file handle is still valid and refers to the original, unmodified file contents. What you *can't* safely do is edit the existing file in-place.

So the below is fine (and is, in effect, what OS-vendor update tools like RPM do):

#!/usr/bin/env bash
# download next to the current script so the final rename stays on one filesystem;
# curl -f makes HTTP errors fail the && chain instead of saving an error page
tempfile=$(mktemp "$BASH_SOURCE".XXXXXX)
if curl -fsS https://example.com/whatever >"$tempfile" &&
   curl -fsS https://example.com/whatever.sig >"$tempfile.sig" &&
   gpgv "$tempfile.sig" "$tempfile"; then
  chown --reference="$BASH_SOURCE" -- "$tempfile"
  chmod --reference="$BASH_SOURCE" -- "$tempfile"
  sync # force your filesystem to fully flush file contents to disk
  mv -- "$tempfile" "$BASH_SOURCE" && rm -f -- "$tempfile.sig"
else
  rm -f -- "$tempfile" "$tempfile.sig"
  exit 1
fi

...whereas this is risky:

curl https://example.com/whatever >/usr/local/bin/whatever

So do the first thing, not the second: when downloading a new version of your script, write it to a different file, and only rename it over the original once the download has succeeded. That's what you want to do anyhow to ensure atomicity.

(There are also some demonstrations of code-signing validation practices above because, well, you need them when building an updater. You wouldn't be trying to distribute code via an automated download without verifying a signature, right? Because that's how one simple break-in to your web server results in every single one of your customers being 0wned. The above expects the public side of your code-signing keys to be in ~/.gnupg/trustedkeys.gpg, but you can put trustedkeys.gpg in any directory and point to it with the environment variable GNUPGHOME.)
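
For completeness, here is one way to produce the pieces the snippet above expects; the key identity and keyring directory are made-up examples:

# on the build machine: create a signing key once, then sign each release
gpg --quick-generate-key updates@example.com
gpg --detach-sign whatever                  # writes whatever.sig

# on each device: install the public key into a dedicated keyring directory
mkdir -p /etc/myapp-gnupg
gpg --export updates@example.com >/etc/myapp-gnupg/trustedkeys.gpg

# verification, as the updater above does it
GNUPGHOME=/etc/myapp-gnupg gpgv whatever.sig whatever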


Even if you don't write your update code safely, the risk is still trivial to mitigate: if you move the body of your script into a function, so that the file must be read in full before any part of it can be executed, then there's no part of the file left unread at the moment execution begins.

#!/usr/bin/env bash
main() {
  echo "Logic all goes here"
}; { main "$@"; exit; }

Because { main "$@"; exit; } is part of a compound command, the parser reads the exit before it starts executing the main, so it's guaranteed that no further source-file content will be read after main exits, even if some future bash release didn't handle input line-by-line in the first place.
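
If you want to see both the hazard and the fix for yourself, here is a throwaway experiment; the filenames are invented, and the exact read-ahead behavior may vary between bash versions:

# plain script: bash picks up lines appended while it is still running
printf '%s\n' 'echo start' 'sleep 2' >/tmp/victim.sh
bash /tmp/victim.sh &
sleep 1; echo 'echo HIJACKED' >>/tmp/victim.sh
wait    # typically prints "start", then "HIJACKED"

# main-wrapped script: the trailing exit guarantees nothing appended is ever read
printf '%s\n' 'main() { echo start; sleep 2; }; { main "$@"; exit; }' >/tmp/safe.sh
bash /tmp/safe.sh &
sleep 1; echo 'echo HIJACKED' >>/tmp/safe.sh
wait    # prints only "start"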

Charles Duffy
  • Thanks for the additional signature checks; the security considerations are always important! I'm planning on going down the "make a package" route recommended above in the comments, though. – user230910 Jan 09 '19 at 02:02
1

Basically, do something along these lines:

shouldbe="/tmp/$(basename "$0")"
if [ "$0" != "$shouldbe" ]; then
    cp "$0" "$shouldbe"
    exec env REALPATH="$0" "$shouldbe" "$@"
fi

  1. Check whether you are running from the temporary directory
  2. If you are not, copy yourself there and re-run from the temporary directory

You can even pass some variables/state along by using environment variables or arguments. Then you can update yourself with a simple cp, as the old path isn't sourced (or even opened) anymore:

 cp "new_script_version.sh" "$REALPATH"

The full script looks like this:

#!/bin/bash

# we need to be running from the /tmp directory
shouldbe="/tmp/$(basename "$0")"
if [ "$0" != "$shouldbe" ]; then
    cp "$0" "$shouldbe"
    exec env REALPATH="$0" "$shouldbe" "$@"
fi

echo "Updating...."
echo "downloading zip files"
echo "unzipping zip files..."
echo "Copying each zip file etc."
cp "new_updatescript.sh" "$REALPATH"
echo "Update succeeded"

Live/test version available at tutorialspoint.

One would also want to add some `flock`-based locking to the script, just in case.
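
A minimal sketch of such locking (the lock-file path is just an example):

#!/bin/bash
exec 9>/tmp/updatescript.lock       # open fd 9 on the lock file
if ! flock -n 9; then               # try to take a non-blocking exclusive lock
    echo "another updater instance is already running" >&2
    exit 1
fi
# ... the rest of the update logic runs while fd 9 holds the lock ...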

KamilCuk
  • *grumbles about trusting `$0`* -- especially when not handling the case when its value starts with a dash. See [BashFAQ #28](https://mywiki.wooledge.org/BashFAQ/028). And use `mv`, not `cp`: `cp` isn't atomic, whereas `mv` is (if both source and destination are on the same filesystem). – Charles Duffy Jan 09 '19 at 01:11
  • ...and because of that "on the same filesystem" limitation, using `/tmp` for this is generally a bad idea. – Charles Duffy Jan 09 '19 at 01:13
  • I like this approach; it seems clear and understandable. Thanks for the additional info on the `mv` command! – user230910 Jan 09 '19 at 02:01
  • Also, because we don't check if the `cp` succeeded, we could be executing _someone else's_ program in `/tmp`. – Charles Duffy Sep 29 '20 at 20:16