I have the following command:

cat input.txt | awk '{print $1, $6}' | sort -n | uniq -c | sort -nr | sed 's/"//g'| head -10

I get the desired output, but I also get this error:

sed: couldn't write 26 items to stdout: Broken pipe

where input.txt is something like:

192.168.2.20 - - [28/Jul/2006:10:27:10 -0300] "GET /cgi-bin/try/ HTTP/1.0" 200 3395
127.0.0.1 - - [28/Jul/2006:10:22:04 -0300] "GET / HTTP/1.0" 200 2216

What am I missing?

Cyrus
khinester
  • `head` closes its input before `sed` has finished writing to it. Run `head -n10` first and then `sed`, or maybe try `tr -d \"`. – KamilCuk Dec 09 '18 at 22:32
  • With a 1500-line test file built from your example, your command runs without error on my Debian system with `bash 4.4.23`, `gnu sed 4.5` and `gnu coreutils (head - sort) 8.30` – George Vasiliou Dec 09 '18 at 22:49
  • Also, tangentially, avoid the [useless use of `cat`](/questions/11710552/useless-use-of-cat) – tripleee Dec 09 '18 at 23:21
  • and if you tell us what it is you're trying to do (in a new question with its own sample input/output) then I expect someone can show you how to do it without 27 commands, 15 pipes and the batman symbol... – Ed Morton Dec 10 '18 at 00:01

2 Answers

As @KamilCuk said in a comment, this happens because `head -10` reads only the first 10 lines from the pipe (plus possibly a little more because of input buffering) and then closes it. If the input is large enough, that happens before `sed` has written everything into the pipe, and the pipe's buffer is too small to absorb the rest. Whether you see the error therefore depends on the input size, the OS and its parameters (which determine the pipe's characteristics), how `sed` reacts to having its output dropped, and so on. Small changes may be enough to avoid the problem, for example:

...sort -nr | tr -d '"' | head -10       # use `tr` instead of `sed` -- it may behave differently
...sort -nr | head -10 | sed 's/"//g'    # swap `head` and `sed` -- now `sort`'s output is dropped

And here's one that will avoid the error:

...sort -nr | sed '11,$ d; s/"//g'

This tells `sed` to delete lines 11 through the end of input (`$`). Because it deletes those lines after reading them (rather than never reading them at all, as `head -10` does), `sort`'s entire output gets consumed and no error occurs.
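To see this behaviour on something larger than a pipe buffer, here is a minimal sketch. The `gen` helper and its 100000 generated lines are made up for illustration and stand in for `sort`'s output; the exact size needed to overflow a pipe varies by system.

```shell
# Hypothetical generator: 100000 quoted lines, far more than a typical pipe buffer holds.
gen() { awk 'BEGIN { for (i = 1; i <= 100000; i++) printf "\"line %d\"\n", i }'; }

# sed reads ALL of its input, deletes lines 11..end, and strips quotes from the rest.
out=$(gen | sed '11,$ d; s/"//g')

# Run the same pipeline again, capturing only what it writes to stderr.
err=$( { gen | sed '11,$ d; s/"//g' > /dev/null; } 2>&1 )

printf '%s\n' "$out" | wc -l        # number of lines that survived
printf '%s\n' "$out" | head -n 1    # first surviving line, quotes removed
printf 'stderr was: [%s]\n' "$err"  # nothing between the brackets if no error occurred
```

Since nothing downstream of `sed` closes early, the writer is never cut off, so there is nothing to report as a broken pipe.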

BTW, as @tripleee pointed out, using cat at the beginning of the pipeline is useless; you should have awk read the file directly, like this:

awk '{print $1, $6}' input.txt | ...
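
Putting both suggestions together, a rough end-to-end sketch could look like the following. The temporary file and the two extra log lines are invented for illustration (only the first two lines come from the question), so treat the data as an assumption.

```shell
# Hypothetical sample data in the same access-log format as the question.
log=$(mktemp)
cat > "$log" <<'EOF'
192.168.2.20 - - [28/Jul/2006:10:27:10 -0300] "GET /cgi-bin/try/ HTTP/1.0" 200 3395
127.0.0.1 - - [28/Jul/2006:10:22:04 -0300] "GET / HTTP/1.0" 200 2216
192.168.2.20 - - [28/Jul/2006:10:27:11 -0300] "POST /cgi-bin/try/ HTTP/1.0" 200 120
192.168.2.20 - - [28/Jul/2006:10:27:12 -0300] "GET /cgi-bin/try/ HTTP/1.0" 200 3395
EOF

# awk opens the file itself (no cat), and a single sed call both
# truncates to 10 lines and removes the quotes.
top=$(awk '{print $1, $6}' "$log" | sort -n | uniq -c | sort -nr | sed '11,$ d; s/"//g')
printf '%s\n' "$top"

rm -f "$log"
```

With this sample the first line printed is the count 2 for `192.168.2.20 GET`; the width of the leading padding depends on the `uniq` implementation.
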
Gordon Davisson

Approach #1: Move sed to the end

cat input.txt | awk '{print $1, $6}' | sort -n | uniq -c | sort -nr | head -10 | sed 's/"//g'

This produces the same output. With `sed` at the end you still get the formatting you want, and `sed` no longer writes into a pipe that `head` has already closed, so its error message goes away.
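
A quick way to check that the two orderings agree is to run both on the same input and compare. This is only a sketch: `gen` is a hypothetical stand-in for the sorted `uniq -c` output, kept to 50 short lines so that neither variant runs into a full pipe.

```shell
# Hypothetical stand-in for the output of `uniq -c | sort -nr`: 50 lines with a stray quote.
gen() { awk 'BEGIN { for (i = 50; i >= 1; i--) printf "%7d 10.0.0.%d \"GET\n", i, i }'; }

a=$(gen | sed 's/"//g' | head -n 10)   # original order: sed, then head
b=$(gen | head -n 10 | sed 's/"//g')   # Approach #1: head, then sed

if [ "$a" = "$b" ]; then
    echo "same output"
else
    echo "different output"
fi
```

The result is the same because this `sed` expression works on each line independently, so it does not matter whether the truncation happens before or after it.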

Approach #2: Ignore the error messages.

cat input.txt | awk '{print $1, $6}' | sort -n | uniq -c | sort -nr | sed 's/"//g' 2>/dev/null | head -10

This is a rather brute-force fix: it also hides any other error `sed` might report, so you may miss a different problem in the future.

Mark