
I have a space-delimited string of files that may or may not be prefixed with "/" (i.e. their paths are relative to a given root):

mydirs='a /b/c /d/e/f g /h/i'

I need to prefix each file with a root provided by the user, and I need a portable way of doing it. I'd also prefer speed, so builtins are welcome. So we're all on the same page, by "portable" I mean "portable across different Linux shells (sh, bash, ksh, zsh, tcsh, csh, etc.)".

Currently, my solution is:

root="my_root"  # User-supplied root directory

prefixed_dirs=`echo $mydirs | tr ' ' '\n' | sed "s;^/*;$root/;" | tr '\n' ' '`

This also leaves a trailing space at the end of the line, which isn't what I want.

My desired output would be the single line echoed back to me with the files prefixed, like so:

my_root/a my_root/b/c my_root/d/e/f my_root/g my_root/h/i

I've tried searching SO for answers, but I haven't really found what I'm looking for (please point me to a solution if there is one). I know sed operates line by line by default, so I'm essentially trying to make sed operate on the beginning of each substring (I can't just look for spaces, because that would skip the first entry, a, in the example above).

This works, but the syntax is ugly. Is there a cleaner way to do this (again, while maintaining portability)? Perhaps a clean awk or perl one-liner (again, must be portable)? Double points for a clean sed one-liner!

Thanks a lot!

adam.hendry
  • sh and csh are very different languages. I haven't used csh in many years, but I recall that even setting a variable has different syntax. Do you really _really_ need such extreme portability? – glenn jackman May 20 '21 at 16:50
  • @glennjackman TBH, I haven't used csh or even tcsh for that matter. By sh I mean dash. Honestly, if you have something that works well for sh, bash, ksh, and zsh (the usual suspects), I'd be a happy camper. – adam.hendry May 20 '21 at 16:52
  • @glennjackman So I guess, no, to answer your question: I don't need that extreme of portability. My current code base tests against sh (i.e. dash), ash, bash, ksh, mksh, and zsh, if that helps. – adam.hendry May 20 '21 at 16:54
  • This works in bash/ksh/zsh, but not dash: `prefixed_dirs="${root}/${mydirs// / $root/}"; prefixed_dirs=${prefixed_dirs//\/\///}` – glenn jackman May 20 '21 at 16:59

2 Answers


You could use the set command to turn the string's constituents into positional parameters, and then use a POSIX-supported parameter expansion to remove the leading / if present.

Note that using set with an unquoted variable expansion also subjects the value to glob expansion. Make sure the string contains no glob characters, or that nothing in the current directory can match them.

root="my_root"
mydirs='a /b/c /d/e/f g /h/i'

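set -- $mydirs   # unquoted on purpose: word-split into positional parameters

for arg; do
  # ${arg#/} removes one leading "/" if present; note the trailing space
  printf '%s ' "${root}/${arg#/}"
done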

Or, if glob expansion could occur, wrap the set command in set -f and set +f to disable globbing and then re-enable it.
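As a rough sketch of how the pieces above could fit together, the loop can collect its result in a variable instead of printing each entry, which also avoids the trailing space the question mentions. The variable name prefixed_dirs is borrowed from the question; the `${var:+word}` join is my own addition, not part of the original answer.

```shell
# Sketch only: sample values taken from the question.
root="my_root"
mydirs='a /b/c /d/e/f g /h/i'

set -f            # disable globbing so the unquoted expansion only word-splits
set -- $mydirs    # each space-separated entry becomes a positional parameter
set +f            # re-enable globbing

prefixed_dirs=
for arg; do
  # ${arg#/} strips one leading slash, if there is one.
  # ${prefixed_dirs:+$prefixed_dirs } adds the separating space only once
  # the variable is non-empty, so no leading or trailing space appears.
  prefixed_dirs="${prefixed_dirs:+$prefixed_dirs }${root}/${arg#/}"
done

printf '%s\n' "$prefixed_dirs"
```

Two things to keep in mind: set -- replaces the script's own positional parameters, and set +f turns globbing back on even if the caller had disabled it earlier.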

Inian

The single-line sed is

prefixed_dirs=$(echo "$mydirs" | sed -E "s, , $root/,g; s,^,$root/,; s,//,/,g")

Use $(...) instead of `...` -- see https://github.com/koalaman/shellcheck/wiki/SC2006 for more details.
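For reference, here is a minimal sketch of the same three substitutions run against the sample data from the question. It drops -E, since none of the expressions needs extended regex syntax, and it assumes that $root contains no comma (the s-command delimiter used here), no & and no backslash, all of which sed would treat specially in the replacement.

```shell
# Sketch only: sample values taken from the question.
root="my_root"
mydirs='a /b/c /d/e/f g /h/i'

# 1. put "$root/" after every space, 2. put "$root/" at the start of the line,
# 3. collapse the "//" left behind by entries that already began with "/".
# Assumes $root contains no ',' '&' or '\' characters.
prefixed_dirs=$(printf '%s\n' "$mydirs" |
  sed -e "s, , $root/,g" -e "s,^,$root/," -e "s,//,/,g")

printf '%s\n' "$prefixed_dirs"
```

The command substitution strips the trailing newline, so the result has no trailing space. The last substitution collapses every "//" on the line, including any that appear inside $root itself.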

glenn jackman
  • Oh very nice! A "two-liner" (i.e. split by ";"). Well done! And you're correct regarding backticks. The only reason I used them is that shellcheck.net seemed to indicate `$()` is not fully portable in some cases (POSIX != portable), but you're right: `$()` is the preferred syntax. Have you ever encountered that? – adam.hendry May 20 '21 at 17:09
  • Correction, "3-liner" (add prefix to all entries separated with a space, then the entry at the beginning of the line, and lastly remove double "//"s). – adam.hendry May 21 '21 at 16:00
  • In my experience, almost entirely on backend development, portability is not a priority: if I'm writing a shell script, it will run on a particular server in a particular OS with a particular shell. If you're writing scripts that you will distribute to users who will try to run them with who-knows-what shell, that's a different story. – glenn jackman May 21 '21 at 16:35
  • If it fits in 80 characters, it's a one-liner no matter how many semicolons it has ;) -- I could have written `sed -E -e "s, , $root/,g" -e "s,^,$root/," -e "s,//,/,g"` with no semicolons. – glenn jackman May 21 '21 at 16:37
  • True! It's for sure a 1-liner. – adam.hendry May 21 '21 at 16:40
  • I was curious why my code base explicitly used backticks instead of `$()`, so I did some digging and found https://stackoverflow.com/questions/4708549/what-is-the-difference-between-command-and-command-in-shell-programming. A user commented that for portability "it is recommended to use backticks for non-nested calls. $(...) needs a recursive parser but this was not used with ksh86 that introduced the feature." Any thoughts on this? – adam.hendry May 21 '21 at 16:42
  • Name a shell you're using and/or coding for that is not POSIX-compliant – glenn jackman May 21 '21 at 16:49
  • Probably just some older ash, mksh, and pdksh shells. Not really a big concern; just handling corner cases. – adam.hendry May 21 '21 at 16:57
  • I just read your earlier comment regarding portability priority. That makes sense, and I think that's why backticks are in my code base: my code is actually intended to be shared with others, and it is uncertain which shell they might run. Thanks for your insight! – adam.hendry May 25 '21 at 16:56