4

Trying to figure out how to iterate through a .txt file (filemappings.txt) line by line, then split each line using tab(\t) as a delimiter so that we can create the directory specified on the right of the tab (mkdir -p).

Reading filemappings.txt and then splitting each line by tab

server/ /client/app/
server/a/   /client/app/a/
server/b/   /client/app/b/

Would turn into

mkdir -p /client/app/
mkdir -p /client/app/a/
mkdir -p /client/app/b/

Would xargs be a good option? Why or why not?

phil o.O
  • 3
    BTW -- filenames are allowed to contain tabs (or newlines!), so this isn't a good file format to use for completely arbitrary names; in general, lists of untrusted names should always be NUL-delimited. – Charles Duffy Jan 17 '18 at 16:38

5 Answers

5
cut -f 2 filemappings.txt | tr '\n' '\0' | xargs -0 mkdir -p 

xargs -0 is great for vector operations.
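To see what xargs actually runs here, the following is a minimal sketch in a scratch directory. The `demo_root` temporary directory and the relative `client/app/...` paths are assumptions made for illustration so that nothing is created under the real `/client`; swapping `mkdir` for `echo` first shows the single command line that xargs assembles.

```shell
set -eu

# Hypothetical scratch area so the demo never touches the real /client tree.
demo_root=$(mktemp -d)
cd "$demo_root"

# Build a tab-separated mapping file (printf expands \t to a real tab).
printf 'server/\tclient/app/\n'     >  filemappings.txt
printf 'server/a/\tclient/app/a/\n' >> filemappings.txt
printf 'server/b/\tclient/app/b/\n' >> filemappings.txt

# Dry run: echo the command instead of executing it, to see the batching.
batched=$(cut -f 2 filemappings.txt | tr '\n' '\0' | xargs -0 echo mkdir -p)
echo "$batched"

# Real run: one mkdir -p invocation receives all three directories.
cut -f 2 filemappings.txt | tr '\n' '\0' | xargs -0 mkdir -p
```

With only three short paths, everything fits on one command line, so mkdir is started once. For a very long list xargs would split the work across several invocations.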

Joshua
  • 2
    You probably meant `-f2`? And I think you can use just `xargs -d'\n' ...`. I don't see any point in turning `\n` into `\0` and then using `xargs -0` (maybe there is a difference?). – PesaThe Jan 17 '18 at 16:35
  • @PesaThe: yeah -f1 was a mistake. I turned '\n' into '\0' out of habit rather than knowing there's a reason to do that. – Joshua Jan 17 '18 at 16:46
  • 2
    Because [it wasn't obvious to the OP](https://stackoverflow.com/questions/48305724/bash-script-to-mkdir-on-each-line-of-a-file-that-has-been-split-by-a-delimiter) perhaps point out that `xargs` will run `mkdir -p` as few times as possible with multiple directories as its arguments. This works fine, and is a good optimization for commands which accept an arbitrary list of file or directory names as arguments; but obviously is less ideal in some other scenarios. – tripleee Jan 17 '18 at 20:45
  • Sorry, wrong link; the other question is https://stackoverflow.com/q/48308990/874188 – tripleee Jan 17 '18 at 20:51
3

You already have an answer telling you how to use xargs. In my experience xargs is useful when you want to run a simple command on a list of arguments that are easy to retrieve. In your example, xargs will do nicely. However, if you want to do something more complicated than run a simple command, you may want to use a while loop:

while IFS=$'\t' read -r a b
do
  mkdir -p "$b"
done <filemappings.txt

In this special case, read a b will read two arguments separated by the defined IFS and put each in a different variable. If you are a one-liner lover, you may also do:

while IFS=$'\t' read -r a b; do mkdir -p "$b"; done <filemappings.txt

In this way you may read multiple arguments to apply to any series of commands; something that xargs is not well suited to do.

The -r option makes read take each line literally, so any backslashes in a path are preserved instead of being treated as escape characters.

Also note that some operating systems may allow tabs as part of a file or directory name. That would break the use of the tab as the separator of arguments.
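As a rough illustration of the "more complicated than a simple command" case, here is a sketch of a loop that uses both columns and skips lines without a tab. The scratch directory `demo_root`, the `mapping.log` file, the `created`/`skipped` counters, and the relative paths are all assumptions invented for this example; `tab=$(printf '\t')` holds the same value as `$'\t'` and is used only so the snippet also runs in a plain POSIX shell.

```shell
set -eu

# Hypothetical scratch area so nothing is created under the real /client.
demo_root=$(mktemp -d)
cd "$demo_root"

# Sample input: three valid mappings and one malformed line with no tab.
printf 'server/\tclient/app/\n'       >  filemappings.txt
printf 'server/a/\tclient/app/a/\n'   >> filemappings.txt
printf 'no-tab-on-this-line\n'        >> filemappings.txt
printf 'server/b/\tclient/app/b c/\n' >> filemappings.txt

tab=$(printf '\t')   # same value as $'\t' in bash

created=0
skipped=0
# IFS is set only for the read command, not for the rest of the script.
while IFS=$tab read -r src dest; do
  # A line without a tab leaves dest empty; skip it.
  if [ -z "$dest" ]; then
    skipped=$((skipped + 1))
    continue
  fi
  mkdir -p "$dest"
  echo "$src -> $dest" >> mapping.log   # second command using both fields
  created=$((created + 1))
done < filemappings.txt

echo "created=$created skipped=$skipped"
```

Because the loop reads from a redirect rather than a pipe, the counters keep their values after `done`. The directory name containing a space survives intact because only the tab is in IFS and `"$dest"` is quoted.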

tripleee
Javier Elices
  • 1
    So, there are a couple of things worth mentioning. 1) Quote: `"$b"` 2) `IFS=\t` won't work. You have to use `IFS=$'\t'` 3) You are changing `IFS` for the rest of the script (that can mess up things). Declare it just for the `read` cmd: `IFS=$'\t' read ...` 4) `read` without `-r` will mangle backslashes. – PesaThe Jan 17 '18 at 17:32
  • @PesaThe, I have tested my answer and `IFS=\t` works for me. I have also tried `IFS=$'\t'` and it does not work for me. You are right that defining the variable for a single statement is the way to go, but I have tried it with the `while` statement and it does not work for me. About the quotes around `$b`, you are totally right. Fixed, thanks. – Javier Elices Jan 17 '18 at 17:41
  • @PesaThe, tested `IFS=$'\t'` in the right place... :-) Now it works. I have fixed that too; as you say, changing `IFS` for the rest of the script is not ideal. Thanks again. – Javier Elices Jan 17 '18 at 17:49
  • `IFS=\t` shouldn't work as it splits on literal `t` character :) See this [snippet](https://ideone.com/n8uFe2). Final note: consider adding the `-r` option so that you allow dirs that contain backslashes. – PesaThe Jan 17 '18 at 17:53
  • 1
    @PesaThe, consideration accepted, thanks! I have changed the explanation to include the `-r` option. – Javier Elices Jan 17 '18 at 17:59
1
sed -n '/\t/{s:^.*\t\t*:mkdir -p ":;s:$:":;p}' filemappings.txt | bash

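  1. sed -n '/\t/{...;p}': suppress the default output and print only lines that contain a tab (the delimiter)
  2. s:^.*\t\t*:mkdir -p ":: replace everything from the beginning of the line through the last tab with mkdir -p "; the following s:$:": appends the closing quote
  3. | bash: pipe the generated commands to bash, which creates the folders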
Bach Lien
  • `xargs` runs its command as few times as possible without creating a command line that is too long, so in the case of Joshua's answer, it would run `mkdir` just once, whereas your solution runs it once per line. – Benjamin W. Jan 17 '18 at 17:29
  • Plus it won't work if there is `"` in the path and it can be used to execute arbitrary code! – PesaThe Jan 17 '18 at 17:39
1

As others have pointed out, a \t character could also be part of a file or directory name, in which case the following command may fail. Assuming the question represents the true form of the input file, one can use:

  $ grep -o -P '(?<=\t).*' filemappings.txt | xargs -d'\n' mkdir -p

It uses -P (Perl-style regex) with a lookbehind to extract the text that follows the \t (TAB) character, then xargs -d'\n' splits that output on newlines only and passes all the resulting lines as arguments to mkdir -p.
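A small sketch of what each stage produces, assuming GNU grep built with PCRE support and GNU xargs (both `-P` and `-d` are GNU extensions). The `demo_root` scratch directory and the relative sample paths, including one with a space, are made up for the example.

```shell
set -eu

# Hypothetical scratch area so nothing is created under the real /client.
demo_root=$(mktemp -d)
cd "$demo_root"

printf 'server/\tclient/app/\n'              >  filemappings.txt
printf 'server/docs/\tclient/app/my docs/\n' >> filemappings.txt

# Stage 1: the lookbehind (?<=\t) anchors the match right after the tab,
# and -o prints only the matched part, i.e. the right-hand column.
extracted=$(grep -o -P '(?<=\t).*' filemappings.txt)
printf '%s\n' "$extracted"

# Stage 2: -d '\n' makes xargs split on newlines only, so the space in
# "my docs" stays inside a single argument.
printf '%s\n' "$extracted" | xargs -d '\n' mkdir -p
```

Without `-d '\n'`, xargs would also split on the space and create `client/app/my` and `docs/` as separate directories.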

iamauser
0

With GNU Parallel it looks like this:

parallel --colsep '\t' mkdir -p {2} < filemappings.txt
Ole Tange