1

I have a simple problem I want to solve with a bash script: copy a file, and also copy all the files that are imported in that file, and imported in that file, and so on. This screams recursion.

The files look like this:

import "/path/to/otherfile.txt"
import "/path/to/anotherfile.txt"

information
otherinformation
...

Shouldn't be so hard, here's what I wrote:

#!/bin/bash

destination=/tmp

copy_imports () {
  insfile=$1
  contained_imports=$(grep "import" $insfile | awk -F' ' '{ print $2 }' | sed 's/"//g')
  for imported_insfile in $contained_imports
  do
    copy_imports $imported_insfile
  done

  cp $insfile $destination
}


copy_imports $1

But for some reason, not all files are copied. I see that it is calling the recursion for all the files and nested imports, but not for all of them the cp statement is executed.

I'm totally puzzled, what's going on here?

Thanks a lot!

konse
  • 885
  • 1
  • 10
  • 21
  • 1
    The lack of quoting seems like an obvious explanation. If that's not it, please [edit] to provide a proper [mre]. See also https://shellcheck.net/ and [When to wrap quotes around a shell variable](https://stackoverflow.com/questions/10067266/when-to-wrap-quotes-around-a-shell-variable) – tripleee Jan 20 '23 at 08:59
  • 2
    You need to make the function's variables (`insfile`, `contained_imports`, and `imported_insfile`) local (e.g. `local insfile="$1"`). Also, I'd recommend checking whether an imported file has already been copied to avoid duplicating work. – Gordon Davisson Jan 20 '23 at 09:07
  • @GordonDavisson can you post this as an answer, making them local is the key. And, yes, quoting and checking should also be included, thanks. – konse Jan 20 '23 at 09:18

1 Answers1

2

The primary problem is that all of the function's variables need to be made local; without that, all invocations of the function are sharing the same global variables, leading to confusion.

Also, I'd recommend checking whether each imported file has already been copied, to avoid duplicating work. I moved the cp command before the recursive loop, which effectively flags the file as "done" as soon as processing begins on it (rather than waiting until it's been fully processed).

As tripleee pointed out, double-quoting variable references is almost always a good idea ($contained_imports is an exception here, since you're counting on word-splitting to break it into separate filenames; if there was a possibility of filenames with spaces, you'd need to use a more robust method). Finally, I couldn't resist replacing the grep | awk | sed pattern with pure awk.

I haven't tested this, but I think it'll work:

#!/bin/bash

destination=/tmp

copy_imports () {
  local insfile="$1"
  if [[ ! -e "$destination/$(basename "$insfile")" ]]; then
    cp "$insfile" "$destination"
    local contained_imports="$(awk -F' ' '/import/ { gsub("\"", "", $2); print $2 }' "$insfile")"
    local imported_insfile
    for imported_insfile in $contained_imports
    do
      copy_imports "$imported_insfile"
    done
  fi
}


copy_imports "$1"
Gordon Davisson
  • 118,432
  • 16
  • 123
  • 151