
How can I combine the following two commands:

find . -print0 | grep -z pattern | tr '\0' '\n'
find . -print0 | grep -z pattern | xargs -0 my_command

into a single pipeline? If I don't need NUL separators then I can do:

find . | grep pattern | tee /dev/tty | xargs my_command

I want to avoid using a temporary file like this:

find . -print0 | grep -z pattern > tempfile
cat tempfile | tr '\0' '\n'
cat tempfile | xargs -0 my_command
rm tempfile
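
A named pipe would avoid the on-disk file, but it still needs setup and cleanup; a minimal sketch (assuming mkfifo is available, with my_command as the stand-in):

    fifo=$(mktemp -u) && mkfifo "$fifo"   # create a FIFO at a fresh path
    tr '\0' '\n' < "$fifo" &              # start the reader first so tee can open the FIFO
    find . -print0 | grep -z pattern | tee "$fifo" | xargs -0 my_command
    rm "$fifo"                            # clean up the FIFO node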

This question is a follow-up to these answers:

1) Using /dev/tty to display intermediate pipeline results:

https://unix.stackexchange.com/a/178754/8207082

2) Using a NUL-separated list of files:

https://stackoverflow.com/a/143172/8207082

Edited to use my_command instead of command.

Follow-up question:

Makefile rule that writes to /dev/tty inside a subshell?

cjfp
  • btw, `command` is actually a standard UNIX command built into all POSIX-compliant shells, so it's best to pick a different name (e.g. `mycommand`, `yourcommand`) for something intended to be an example or stand-in. – Charles Duffy Jun 23 '17 at 22:32
  • @CharlesDuffy fixed :) – cjfp Jun 23 '17 at 23:23

3 Answers


You can just point the tee at a process substitution and do the exact same thing in there:

   find . -print0 | grep -z pattern | tee >(tr '\0' '\n' > /dev/tty) | xargs -0 my_command

The only issue with using tee this way is that if the xargs command also prints to the screen, the output can get jumbled, since both the pipe and the process substitution are asynchronous.
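
One way to reduce the jumbling (a sketch, not part of the original answer) is to capture the xargs output and print it only after the pipeline finishes; the process substitution is still asynchronous, so this narrows the window rather than closing it:

    # the filename listing goes to /dev/tty while the pipeline runs;
    # my_command's output is held in $out and printed afterwards
    out=$(find . -print0 | grep -z pattern | tee >(tr '\0' '\n' > /dev/tty) | xargs -0 my_command)
    printf '%s\n' "$out"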

123
  • Okay, so this works at a prompt, but it doesn't work in a Makefile. Even if I escape the parentheses like so: `\(...\)`, I get `/bin/sh: /dev/tty): Permission denied`. Do I start a new question or edit the original or leave it? – cjfp Jun 23 '17 at 23:09
  • OK, follow-up question about Makefiles here: https://stackoverflow.com/questions/44731163 – cjfp Jun 23 '17 at 23:21

One of the possibilities:

find . -print0 | grep -z pattern | { exec {fd}> >(tr '\0' '\n' >/dev/tty); tee "/dev/fd/$fd"; } | xargs -0 my_command

Here we create a temporary file descriptor fd on the fly with exec, connected to tr's stdin via standard process substitution. tee passes everything to stdout (ending up at xargs), and a duplicate to the tr subprocess that writes to /dev/tty.
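
For reference, the {fd} allocation can be exercised on its own; a minimal sketch (requires bash 4.1+ and an OS that exposes /dev/fd/NN):

    exec {fd}> >(tr '\0' '\n')        # bash picks a free descriptor and stores its number in $fd
    printf 'a\0b\0' > "/dev/fd/$fd"   # writing the "file" /dev/fd/NN feeds the pipe read by tr
    exec {fd}>&-                      # close the descriptor when done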

randomir
  • Why not just `tee >(tr '\0' '\n' >/dev/tty)`, instead of using a bunch of functionality that's only available in bash releases that aren't available everywhere (and *specifically* not available out-of-the-box on MacOS)? – Charles Duffy Jun 23 '17 at 22:07
  • @CharlesDuffy, you're right, that's much simpler. I got carried away. :) – randomir Jun 23 '17 at 22:11
  • This doesn't work on OS X 10.6.8, the exec doesn't like it: `bash: exec: {fd}: not found`. It works with `tee >(...)` though! Just to check my understanding, this creates a new stream which tee treats as its output file. This in turn forms stdin for the subshell. – cjfp Jun 23 '17 at 22:45
  • @cjfp, ...so, in the `tee >(tr ...)` case, what you're doing is generating a filename which, when written to, will write to a pipe that's read by `tr`, and passing that filename as an argument to `tee` – Charles Duffy Jun 23 '17 at 22:57
  • @cjfp, ...the original version in the answer here is dynamically allocating a file descriptor and opening it to such a pipeline, and then passing the name `/dev/fd/NN` where `NN` is the number of that file descriptor as an argument to `tee`. That only works on operating systems that support `/dev/fd/` in filenames, and I'm not sure if that's something available on MacOS -- above and beyond the need for bash 4.2 or newer for the automatic-allocation syntax. – Charles Duffy Jun 23 '17 at 22:58
  • @cjfp, `exec {fd}>/path/to/file` will redirect a dynamically allocated file descriptor (e.g., `3`) in the caller's process (e.g. your shell) to `/path/to/file`. On Linux, `/dev/fd` points to `/proc/self/fd`, and so by running `echo test >/dev/fd/$fd`, you are really writing to `/path/to/file`. – randomir Jun 23 '17 at 23:24
  • @cjfp, btw, `{varname}`-style automatic file descriptor allocation is [available since bash 4.1-alpha](http://wiki.bash-hackers.org/scripting/bashchanges#redirection_and_related), that's from around 2010. – randomir Jun 23 '17 at 23:30
  • @randomir @CharlesDuffy I updated from Apple's 3.2 bash to Fink's 4.3.25 bash but it still doesn't work. :( `/dev/fd` has `0 1 2`, which are the usual suspects, and directories `4 5`, which I'm not sure what they are; they say `bad file descriptor` when I try to look at them. I appreciate the explanations though – cjfp Jun 23 '17 at 23:39

It's possible to execute multiple commands with xargs like so:

find . -print0 | grep -z pattern | xargs -0 -I% sh -c 'echo "%"; my_command "%"'

Source:

https://stackoverflow.com/a/6958957/8207082

Per the discussion below, the above is unsafe; this is much better:

find . -print0 | grep -z pattern | xargs -0 -n 1 sh -c 'echo "$1"; my_command "$1"' _
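
To see the difference, a small standalone demonstration (not from the original answer; printf feeds a synthetic NUL-terminated name, and echo stands in for my_command):

    # unsafe: -I splices the name into the code string, which sh then re-parses
    printf '%s\0' 'x$(echo pwned)' | xargs -0 -I% sh -c 'echo "%"'
    # prints "xpwned": the embedded command substitution ran

    # safe: the name arrives as a positional parameter and is never re-parsed
    printf '%s\0' 'x$(echo pwned)' | xargs -0 -n 1 sh -c 'echo "$1"' _
    # prints the name literally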
cjfp
  • It's possible, but this is a dangerous way to do it. If one of the filenames contains the string `$(rm -rf ~)`, you're going to have a very, **very** bad day. – Charles Duffy Jun 23 '17 at 22:08
  • As @CharlesDuffy says, this is open to code injection; you can use `bash -c 'echo "$1"; command "$1" _ {}` to prevent this. – 123 Jun 23 '17 at 22:14
  • Even better, `xargs -0 sh -c 'for arg; do printf "%s\n" "$arg"; your_command "$arg"; done' _` will reuse each `sh` instance to run more than one command. (re: the replacement of `echo` with `printf`, see the APPLICATION USAGE section of [the POSIX spec for `echo`](http://pubs.opengroup.org/onlinepubs/9699919799/utilities/echo.html)). – Charles Duffy Jun 23 '17 at 22:30
  • (btw, note that I'm **not** using `xargs -I` there at all; that's very intentional, as avoiding it prevents any kind of mangling from taking place). – Charles Duffy Jun 23 '17 at 22:39
  • Sorry, I don't understand what the `_ {}` or the trailing `_` do. Do I need to worry about quoting if it's just `xargs -0 my_command`? – cjfp Jun 23 '17 at 22:49
  • @cjfp, the `_` is there as a placeholder for `$0`, so that later arguments become `$1`, `$2`, etc. (which will be part of `"$@"`, which is what `for` iterates over by default). – Charles Duffy Jun 23 '17 at 22:50
  • @cjfp, and no, you don't need to worry about quoting for `xargs -0 my_command`, since that doesn't start any shell as a subprocess of xargs; it's the use of `sh -c '...'` that makes this a situation where a bunch of care is needed. – Charles Duffy Jun 23 '17 at 22:51
  • @123, because its documented and intended behavior is to perform string replacements inside subsequent arguments? And I'm calling that "mangling" because it has no way of understanding the syntax of those arguments (particularly when they're parsed as code) and knowing that the result of those replacements is syntactically correct. – Charles Duffy Jun 24 '17 at 14:07
  • @CharlesDuffy It replaces the string that is decided by inserting a quoted argument, exactly the same as running a command, i.e. `command "$var"`, `command {}`; not sure how that is prone to any "mangling". If you can give an example though, then I'd be interested to see. – 123 Jun 24 '17 at 16:20
  • @123, when `{}` is a separate argument, you're right, it's safe. When it's a **substring** -- `sh -c 'echo "%"; command "%"'` as used by the OP right here -- it's not safe at all, for reasons that should be completely and utterly obvious. The very fact that `-I` *can* substitute into substrings, then, is why I'm objecting -- you need to know what your values are to know they don't have a collision (does `xargs -I foo "$bar" %` actually have a `%` somewhere in the value of `bar`?) -- and it's unnecessary, since the default use *without* `-I` does the safe thing, appending an argument per item. – Charles Duffy Jun 24 '17 at 16:28
  • @CharlesDuffy It's as safe or dangerous as using a normal bash variable in the exact same situation. In your example that's because they are being interpreted twice since the interpolated values are then run in the `sh` instance. Using `-I` is no more dangerous as shown in my example where it is passed as an argument to `bash` in the same fashion as yours. You don't need to know what is in your args, you just need to not nest processes without taking proper precautions. – 123 Jun 24 '17 at 16:37
  • @CharlesDuffy In this case, your method not using `-I` would be more performant, due to not running a `sh` instance for every arg, but other than that there will be no difference. – 123 Jun 24 '17 at 16:37
  • @123 however in this case I am printing a list of filenames, and this answer doesn't work if one of the files is named `"`, for example. `touch \"; find . -print0 | grep -z \" | xargs -0 -I% sh -c 'echo "%"; echo "%"'` outputs `./; echo ./` – cjfp Jun 24 '17 at 17:03
  • @cjfp I said in a previous comment that this answer doesn't work and provided an alternative that would, which still uses `-I` – 123 Jun 24 '17 at 17:45
  • @123 can you write the full command starting with xargs? I tried using `bash -c 'echo "$1"; command "$1" _ {}` but no luck, even with a `'` after the last `"` – cjfp Jun 24 '17 at 17:56
  • @123, which case is "this" case? In the case of `xargs -I % sh -c 'echo "%"; command "%"'`, it's absolutely untrue that this is syntactically identical to `xargs -n 1 sh -c 'echo "$1"; command "$1"' _` -- see the previously-trotted-out `$(/tmp/run-an-evil-command)` example. – Charles Duffy Jun 24 '17 at 18:43
  • @cjfp, ...while I was intending to address @123, the `xargs -n 1` example I gave above should work for you. – Charles Duffy Jun 24 '17 at 18:45
  • @CharlesDuffy I was obviously not comparing it to that command, but to my own `bash -c 'echo "$1"; command "$1" _ {}` from the previous comment (see the second comment in the thread), which I had already said I was referring to; you can use `-I` with it, i.e. `xargs -I{} bash -c 'echo "$1"; command "$1"' _ {}` – 123 Jun 25 '17 at 00:40
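
A standalone sketch of the `$0`/`_` placeholder behavior discussed in the comments above (the `hello` argument is hypothetical):

    sh -c 'printf "0=%s 1=%s\n" "$0" "$1"' _ hello
    # prints "0=_ 1=hello": the underscore fills $0, so real arguments start at $1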