
I want to pipe the output of a command into two commands and paste the results together. I found this answer and similar ones suggesting tee, but I'm not sure how to make it work the way I'd like.

My problem (simplified):

Say that I have a myfile.txt with keys and values, e.g.

key1   /path/to/file1
key2   /path/to/file2

What I am doing right now is

paste \
  <( cat myfile.txt | cut -f1 ) \
  <( cat myfile.txt | cut -f2 | xargs wc -l )

and it produces

key1 23
key2 42

The problem is that cat myfile.txt is repeated here (in the real problem it's a heavier operation). Instead, I'd like to do something like

cat myfile.txt | tee \
  <( cut -f1 ) \
  <( cut -f2 | xargs wc -l ) \
  | paste

But it doesn't produce the expected output. Is it possible to do something similar to the above with pipes and standard command-line tools?

Tim
  • How about caching the result of the *"heavier operation"* in order to avoid the problem of doing it twice? – Mark Setchell Dec 09 '22 at 09:10
  • @MarkSetchell definitely it's a practical solution, and this is what I'm considering, but I was curious if split-and-merge is possible via pipes? – Tim Dec 09 '22 at 09:34
  • What about a `fifo`? `mkfifo /tmp/fifo && cat myfile.txt | tee >(cut -f1 >/tmp/fifo) | cut -f2 | xargs wc -l | paste /tmp/fifo - && rm /tmp/fifo` – Phu Ngo Dec 09 '22 at 09:54
  • I think it produces `key1 23 /path/to/file1` and `key2 42 /path/to/file2`, due to how `wc -l` works. ;-) But that’s just a detail. (One can do `wc -l < /path/to/file` to _not_ let `wc` know about the file name, but it’s harder to set up with `xargs`.) On a similar note, you do not need any of the `cat` commands; that’s a clear [UUOC](https://en.wikipedia.org/wiki/Cat_(Unix)#Useless_use_of_cat) case. (`cat` is for con`cat`enation; not needed for a single file.) – Andrej Podzimek Dec 09 '22 at 11:46
  • @AndrejPodzimek thanks for nitpicking on the details of a toy code example produced for illustrating the problem, which was stripped of all the details that are irrelevant to the problem. – Tim Dec 09 '22 at 11:53
  • @Tim Obviously, it was *not* _“stripped of all the details that are irrelevant to the problem”_, as my nitpick suggests. – Andrej Podzimek Dec 09 '22 at 11:58

1 Answer


This doesn't answer your question about pipes, but you can use AWK to solve your problem:

$ printf %s\\n 1 2 3 > file1.txt
$ printf %s\\n 1 2 3 4 5 > file2.txt
$ cat > myfile.txt <<EOF
key1    file1.txt
key2    file2.txt
EOF
$ cat myfile.txt | awk '{ cmd = "wc -l " $2; cmd | getline size; close(cmd); sub(/ .+$/,"",size); print $1, size }'
key1 3
key2 5

For each line, we first run wc -l $2 and save the result in a variable. I'm not sure about yours, but on my system wc -l includes the filename in its output, so we strip it with sub() to match your example output. Finally, we print the $1 field (the key) and the size we got from the wc -l command.
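A possible variant of the AWK approach: feeding the file to wc on standard input makes it print only the count, so there is no filename to strip. This is just a sketch. The function name count_with_awk is made up for illustration, and it assumes the paths contain no double quotes or other characters the shell would interpret.

```shell
# Hypothetical helper: reads "key path" lines on stdin (whitespace-separated)
# and prints "key count" for each of them.
count_with_awk() {
    awk '{
        # Redirecting the file into wc keeps the file name out of its output.
        cmd = "wc -l < \"" $2 "\""
        if ((cmd | getline size) > 0)
            print $1, size + 0    # "+ 0" drops padding that some wc versions add
        else
            print $1, "NA"        # wc produced no output, e.g. unreadable file
        close(cmd)                # release the pipe before the next line
    }'
}
```

It would be used as the last stage of the pipeline, e.g. cat myfile.txt | count_with_awk, with cat standing in for the real command.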

This can also be done in plain shell, now that I think about it:

cat myfile.txt | while read -r key value; do
  printf '%s %s\n' "$key" "$(wc -l "$value" | cut -d' ' -f1)"
done

Or more generally, by piping to two commands and using paste, therefore answering the question:

cat myfile.txt | while read -r line; do
  printf '%s\n' "$line" | cut -f1
  printf '%s\n' "$line" | cut -f2 | xargs wc -l | cut -d' ' -f1
done | paste - -

P.S. The use of cat here is useless, I know. But it's just a placeholder for the real command.
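For a literal split-and-merge, the named-pipe idea from the comments could be wrapped in a small function, roughly as sketched here. The name split_and_merge and the FIFO file names are invented for this example. It assumes tab-separated input, since cut -f and paste both default to tabs. The input is read only once, and each branch runs a single time rather than once per line.

```shell
# Hypothetical helper: reads "key<TAB>path" lines on stdin once, sends a copy
# to each of two branches, and pastes the branch outputs back together.
split_and_merge() {
    dir=$(mktemp -d) || return 1
    mkfifo "$dir/in" "$dir/keys" || return 1

    # Branch 1 (background): take the copy written by tee, keep column 1.
    cut -f1 < "$dir/in" > "$dir/keys" &

    # Branch 2 (foreground): tee duplicates stdin into the FIFO and the pipe;
    # column 2 is turned into line counts, and paste merges both branches.
    tee "$dir/in" |
        cut -f2 |
        while IFS= read -r path; do
            wc -l < "$path" | tr -d ' '   # tr removes padding some wc versions add
        done |
        paste "$dir/keys" -

    wait              # let the background branch finish
    rm -rf "$dir"     # remove the FIFOs
}
```

Usage would be cat myfile.txt | split_and_merge, again with cat as the placeholder. One caveat: the two branches run concurrently and paste reads them in lockstep, so with large inputs where one branch produces far more data than the other, the pipeline may stall once a pipe buffer fills. Caching the heavy command's output in a file avoids that.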

Discussian