Use sed to take all lines containing regex and append to end of file

Question

I'm trying to come up with a sed script to take all lines containing a pattern and move them to the end of the output. This is an exercise in learning hold vs pattern space and I'm struggling to come up with it (though I feel close).

I'm here:

$ echo -e "hi\nfoo1\nbar\nsomething\nfoo2\nyo" | sed -E '/foo/H; //d; $G'
hi
bar
something
yo

foo1
foo2

But I want the output to be:

hi
bar
something
yo
foo1
foo2

I understand why this is happening. The first time foo matches, the hold space is empty, so H appends \n followed by the line, which leaves a leading newline in the hold space. I suppose that is fine on its own, but then $G appends another \n plus the contents of the hold space to the pattern space, and the two newlines together produce the blank line.

I tried a final delete command, /^$/d, but that didn't remove the blank line. I think this is because the pattern is matched not against the last line alone but against the now multiline pattern space, which has \n\n in it.
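One way to check that guess is to dump the pattern space with sed's l command right after the G. This is a minimal sketch assuming GNU sed; the function name show_pattern_space is made up for illustration.

```shell
# Hypothetical helper: run the script from the question, but on the last
# line print the pattern space unambiguously with l instead of normally.
show_pattern_space() {
  printf 'hi\nfoo1\nbar\nsomething\nfoo2\nyo\n' |
    sed -n '/foo/H; //d; ${G;l;}'
}

show_pattern_space
# prints: yo\n\nfoo1\nfoo2$
```

The pattern space starts with yo and is not empty, so /^$/ has nothing to match; the blank line exists only as the \n\n in the middle.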

I'm sure the sed gurus have a fix for me.

Why don't you post your solution as an answer? That's actually encouraged here. — Benjamin W., Feb 10 '16 at 06:56

potong · Answer 1 · 2016-02-10T08:53:33.007

1

This might work for you (GNU sed):

sed '/foo/H;//!p;$!d;x;//s/.//p;d' file

If the line contains the required string, append it to the hold space (HS); otherwise print it as normal. If it is not the last line, delete it; otherwise swap the HS into the pattern space (PS). If the required string is now in the PS (what was the HS), then, since every such line was appended with H, the first character will be a newline: delete that first character and print. Finally, delete whatever is left.

An alternative, using the -n flag:

sed -n '/foo/H;//!p;$!b;x;//s/.//p' file

N.B. When the d or b (without a label) command is executed, no further sed commands are run for that line: a new line is read into the PS and the script starts again from the first command, i.e. the commands do not resume after the previous d or b. (With -n, b behaves like d here because nothing is printed automatically at the end of the cycle.)
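For anyone following along, the -n variant can be spelled out one command per line with comments. This is only a sketch of the same script, assuming GNU sed; the function name move_matches_to_end is illustrative and not part of the answer.

```shell
# Hypothetical wrapper around the -n variant, one sed command per line.
move_matches_to_end() {
  sed -n '
    # Matching line: append it to the hold space.
    /foo/H
    # Non-matching line: print it now (the empty regex reuses /foo/).
    //!p
    # Not the last line: branch to the end of the script.
    $!b
    # Last line: swap the collected lines into the pattern space.
    x
    # If something was collected, drop the leading newline and print.
    //s/.//p
  '
}

printf 'hi\nfoo1\nbar\nsomething\nfoo2\nyo\n' | move_matches_to_end
```

Because the swap happens on the last line regardless of whether that line matched, the held lines are still printed when the input ends with a foo line.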

edited Feb 10 '16 at 08:53

answered Feb 10 '16 at 08:34

potong


Thanks! What does the //s/.//p do? The way I see it is s/.//p which says match a single character (newline in this case) and replace it with nothing and print. But what are the leading // for? – jshort Feb 10 '16 at 17:09
Hmmm, can you use the 'last regex matched' trick with s? As in // is effectively /foo/s/.//p if the PS matches foo, then do this replacement? – jshort Feb 10 '16 at 17:11
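The guess in the second comment can be checked directly: an empty regex // reuses the last regex that was applied, so it can act as the address in front of s. A small sketch assuming GNU sed, with made-up function names, comparing the explicit and the empty-regex form:

```shell
# Hypothetical helpers: both should strip the first character of foo lines.
explicit_address() {
  sed -n '/foo/s/.//p'
}

empty_regex_address() {
  # /foo/!d establishes /foo/ as the last regex; // then reuses it.
  sed -n '/foo/!d; //s/.//p'
}

printf 'foo1\nbar\nfoo2\n' | explicit_address
printf 'foo1\nbar\nfoo2\n' | empty_regex_address
```

Both calls print oo1 and oo2, which suggests //s/.//p is indeed read as /foo/s/.//p at that point in the script.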

Ed Morton · Answer 2 · 2016-02-11T00:27:26.127

Why? Stuff like this is absolutely trivial in awk, awk is available everywhere that sed is, and the resulting awk script will be simpler, more portable, faster and better in almost every other way than a sed script to do the same task. All that hold space stuff was necessary in sed before the late 1970s, when awk was invented, but there's absolutely no use for it now other than as a mental exercise.

$ echo -e "hi\nfoo1\nbar\nsomething\nfoo2\nyo" |
    awk '/foo/{buf = buf $0 RS;next} {print} END{printf "%s",buf}'
hi
bar
something
yo
foo1
foo2

The above will work as-is in every awk on every UNIX installation and I bet you can figure out how it works very easily.
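In case it helps, here is the same awk script laid out with comments. It is a sketch of the one-liner above, not a different method; the function name move_with_awk is illustrative.

```shell
# Hypothetical wrapper around the awk one-liner, with one rule per line.
move_with_awk() {
  awk '
    # Matching line: save it, with its record separator, and skip printing.
    /foo/ { buf = buf $0 RS; next }
    # Every other line: print it immediately.
    { print }
    # After the last line: emit the saved lines exactly as collected.
    END { printf "%s", buf }
  '
}

printf 'hi\nfoo1\nbar\nsomething\nfoo2\nyo\n' | move_with_awk
```

Each saved line already carries its own trailing RS, so there is no leading newline to strip, which is the step the sed versions have to work around.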

score 0 · Answer 3 · edited May 23 '17 at 12:23

This feels like a hack and I think it should be possible to handle this situation more gracefully. The following works on GNU sed:

echo -e "hi\nfoo1\nbar\nsomething\nfoo2\nyo" | sed -r '/foo/{H;d;}; $G; s/\n\n/\n/g'

However, OSX/BSD sed produces this odd output:

hi
bar
something
yonfoo1
foo2

Note that the two consecutive newlines were replaced with the literal character n.

The OSX/BSD vs GNU sed difference is explained in this article. The following works on both (GNU sed as well):

echo -e "hi\nfoo1\nbar\nsomething\nfoo2\nyo" | sed '/foo/{H;d;}; $G; s/\n\n/\'$'\n''/'

TL;DR: BSD sed does not accept escape sequences such as \n in the RHS of the substitution, so you either have to put a true LF/newline there on the command line, or do the above: split the sed script string where the newline is needed on the RHS and put a dollar sign in front of '\n' so that the shell interprets it as a line feed.
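The first option in the TL;DR, a true newline in the replacement, might look like the sketch below. A backslash followed by a literal newline is the form POSIX describes for putting a newline in the replacement, so it should behave the same in GNU and BSD sed; the function name move_matches_portable is made up for illustration.

```shell
# Hypothetical wrapper: same script, with a real newline in the RHS.
# The closing /'"'"' must start at column 0, or the indentation would
# become part of the replacement text.
move_matches_portable() {
  sed '/foo/{H;d;}; $G; s/\n\n/\
/'
}

printf 'hi\nfoo1\nbar\nsomething\nfoo2\nyo\n' | move_matches_portable
```

One caveat with this approach: if the last input line itself matches foo, d ends the cycle before $G runs, so the held lines are never printed.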
