
I'd like to send the contents of filename to some_cmd on the command line. What's the difference between running this:

$ cat filename | some_cmd

and

$ some_cmd < filename

Are there cases where I can or should use one and not the other?

Ashton Wiersdorf
  • In the first, you are creating two processes with a pipe between them. In the second, you are running only one process and there is no pipe. In the latter, you can expect `fseek(stdin,...)` to succeed, but in the former you should expect seeking on stdin to fail. – William Pursell Oct 17 '18 at 19:40
  • Possible duplicate of [Useless use of cat?](https://stackoverflow.com/questions/11710552/useless-use-of-cat) – xhienne Oct 17 '18 at 19:54
  • @xhienne, ...if the answers there weren't so heavily influenced by popularity rather than correctness... – Charles Duffy Oct 17 '18 at 19:57
  • @CharlesDuffy I do not agree with every opinion there but at least it may best answer Ashton's question "Are there cases where I can use one and not the other?" – xhienne Oct 17 '18 at 20:02
  • @xhienne Thanks for pointing that question out. I think it has some useful information. However, I'm not asking about when `cat` is "uselessly used"—I'm asking about the difference between the two methods and when it would be appropriate to use one over the other. (See my updated question.) @CharlesDuffy's response nails this on the head. The other question doesn't quite answer my question. – Ashton Wiersdorf Oct 17 '18 at 20:14
  • @Ashton The page I pointed at (and the answers there) is not about "when cat is uselessly used", it's mainly about the difference between the two syntaxes, the inner working of the shell in each case, and when it's appropriate to use cat. That's your question, and Charles' answer would be very well suited there. – xhienne Oct 17 '18 at 20:17
  • @xhienne On closer reading, you're right: some of the answers there do to some degree answer my question. However, the *question* itself is significantly different from mine—people just happened to answer both questions. Furthermore, the question itself assumes prior knowledge of what I was asking about. Someone with my same question and no knowledge of the "UUOC award" I think would be able to find this page easier. It's a similar question, but not a duplicate. – Ashton Wiersdorf Oct 17 '18 at 20:29
  • Ashton, don't take it bad if this question is closed. Don't take it personally either. What matters is that, for a given question, the reader is offered a set of good answers. Charles' answer is good but only gives one single technical point of view (and I personally do agree with him). OTOH, by not linking to the other question, we miss some equally interesting answers that are worth reading in my opinion. Closing would just redirect the reader. – xhienne Oct 17 '18 at 20:38
  • @xhienne I won't be offended. :) I agree it's good to link the question. Thanks for doing that. I feel it's also important to make the answers easily discoverable with questions that are close to what readers might have in mind—within reason of course. I think that my question text itself is sufficiently different from the other question to stand on its own, so I don't think it should be closed, but whatever—either way, I got my question answered! – Ashton Wiersdorf Oct 17 '18 at 20:44
  • Let us [continue this discussion in chat](https://chat.stackoverflow.com/rooms/182045/discussion-between-ashton-wiersdorf-and-xhienne). – Ashton Wiersdorf Oct 17 '18 at 20:50
  • Possible duplicate of [What is the difference between "$(cat file)", "$( – jww Oct 18 '18 at 02:43

1 Answer

  • `cat foo | somecmd` runs two programs, `/bin/cat` and `somecmd`, and connects the stdout of `cat` to the stdin of `somecmd` with a FIFO (an anonymous pipe), which can be read only once, from start to finish. That FIFO also doesn't expose metadata about the original file: neither its name nor its size can be discovered by `somecmd` (without, for size, reading all the way to the end; this makes `cat foo | tail` ridiculously slow for a multi-GB file).

  • `somecmd <foo` runs only one program, `somecmd`, and connects its stdin to a direct handle on the file `foo`. It can thus copy that handle, rewind and reread it, hand out subsets of the file to different threads to process in parallel, map the file into memory for random access, etc.
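
The difference between the two cases can be observed from inside the command itself. What follows is a minimal sketch rather than anything from the answer: `describe_stdin` is a hypothetical helper, and it assumes `/dev/stdin` is usable in `test` (true on Linux and macOS, and bash handles that path itself).

```shell
# Hypothetical helper: report what kind of object stdin is connected to.
describe_stdin() {
    if [ -p /dev/stdin ]; then
        echo "pipe"            # FIFO: read once, front to back, no seeking
    elif [ -f /dev/stdin ]; then
        echo "regular file"    # real file handle: seekable, size is known
    else
        echo "other"           # terminal, socket, device, ...
    fi
}

# Scratch file standing in for "filename" from the question.
tmpfile=$(mktemp)
printf 'hello\n' > "$tmpfile"

# Same data, two ways of delivering it.
via_pipe=$(cat "$tmpfile" | describe_stdin)
via_redirect=$(describe_stdin < "$tmpfile")

echo "cat file | cmd : $via_pipe"
echo "cmd < file     : $via_redirect"

rm -f "$tmpfile"
```

Any program can make the same check with `fstat` on file descriptor 0, which is roughly how a tool can decide whether seeking is an option.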

Common programs like GNU `sort`, `wc -c`, `tail` and `shuf` can run much more efficiently when given a real, seekable file handle rather than a FIFO.
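
As a rough illustration (a sketch under assumptions, not a benchmark), the snippet below asks `wc -c` for the size of the same file both ways. The 64 MiB size and the variable names are arbitrary choices for the demo. Both forms report the same number; the difference is in how much work is needed to get it, which you can check on your own system by prefixing each pipeline with `time` and using a larger file.

```shell
# Build a 64 MiB sparse scratch file by writing one byte at the last offset.
bigfile=$(mktemp)
dd if=/dev/zero of="$bigfile" bs=1 count=1 seek=67108863 2>/dev/null

# Pipe: every byte is copied by cat through the pipe and read by wc.
size_via_pipe=$(cat "$bigfile" | wc -c | tr -d ' ')

# Redirect: wc holds the file itself, so it is free to ask the kernel
# for the size instead of reading the data.
size_via_redirect=$(wc -c < "$bigfile" | tr -d ' ')

echo "pipe: $size_via_pipe bytes, redirect: $size_via_redirect bytes"

rm -f "$bigfile"
```

The `tr -d ' '` is only there because some `wc` implementations pad their output with spaces.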

Always use redirection directly from a file rather than cat'ing that file unless you have a specific and compelling reason to do otherwise.


As an example of such a compelling reason (where you might want to use cat), consider the case where you need to stream a file only readable by a more-privileged user account.

sudo -u someuser /bin/cat -- /path/to/somefile | somecmd

...lets `somecmd` run with your original, non-escalated privileges, so `/etc/sudoers` only needs to be configured to allow that single, specific `cat` invocation.

Ashton Wiersdorf
Charles Duffy