print text between two markers with sed if second marker exists

Question

I have a file which contains a list of car manufacturers:

$ cat cars
subaru
mercedes
porche
ferrari
audi
mercedes
BMW
ferrari
toyota
lexus
mercedes
VW
$

I would like to print all the lines between mercedes and ferrari so the desired output is:

mercedes
porche
ferrari
mercedes
BMW
ferrari

My first thought was to use gsed -n '/mercedes/,/ferrari/p' cars, but this obviously does not work: sed processes the file line by line and has no way of knowing that the last mercedes in the file is not followed by a ferrari. I was able to accomplish this with gsed -n '/mercedes/h;/^[^mercedes].*$/H;/ferrari/{g;p}' cars, but I see a few problems with this solution:

1) The end marker is present but the start marker is not. For example, if the last mercedes in my file is replaced with ferrari, the output is wrong.

2) One cannot use regular expressions in the [^mercedes] part. For example, if I would like to use both mercedes and mg-motors as start markers, I can't use [^m.*s], because inside a bracket expression m, ., * and s are treated as literal characters.

Is there a smarter way to print text between two markers with sed only if the second marker exists? Should one use awk in order to solve this problem?
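
To make the failure of the range expression concrete, here is a minimal sketch that recreates the cars file from the question and runs the first attempt. It calls plain sed where the question uses gsed (the name GNU sed commonly has on macOS); that substitution and the variable name out are assumptions of this sketch.

```shell
# Recreate the sample file from the question.
printf '%s\n' subaru mercedes porche ferrari audi mercedes BMW \
    ferrari toyota lexus mercedes VW > cars

# Range expression: a start marker with no later end marker
# keeps the range open until the end of the input.
out=$(sed -n '/mercedes/,/ferrari/p' cars)
printf '%s\n' "$out"
```

The last two lines printed are mercedes and VW, which are not part of the desired output.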

sed is for simple substitutions on individual lines. If you need to use more than s, g, and p (with -n) then you are using the wrong tool as all other sed language constructs became obsolete in the mid-1970s when awk was invented. Also, never use range expressions as they make the solutions to trivial problems very slightly briefer but then need a complete rewrite or duplicate conditions when the problem becomes even the slightest bit more interesting. It sounds like your problem is with rainy day cases but your sample input/output is only sunny day - reconsider that! — Ed Morton, Nov 25 '15 at 14:04

score 1 · Answer 1 · edited May 23 '17 at 11:59

You can go through the file twice:

1) the first time, to count how many ferrari lines the file contains;
2) the second time, to print the lines from a mercedes up to a ferrari, but only while at least one ferrari is still to come.

That is:

awk 'FNR==NR{if ($0~/ferrari/) {ferr++}; next}
     /mercedes/{flag=1}
     flag && count<ferr
     /ferrari/{flag=0; count++}' file file

Further explanation in How to select lines between two marker patterns which may occur multiple times with awk/sed.

Test

$ awk 'FNR==NR{if ($0~/ferrari/) {ferr++}; next} /mercedes/{flag=1} flag && count<ferr; /ferrari/{flag=0; count++}' a a
mercedes
porche
ferrari
mercedes
BMW
ferrari
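
The same two-pass idea can probably be extended to the second problem raised in the question (several start markers). The sketch below is an adaptation, not part of the original answer: the markers are passed in as awk variables, and the names start_re, end_re, inside, seen, total, the file name cars_mg and the mg-motors input line are all hypothetical.

```shell
# Hypothetical input: the second block starts with mg-motors.
printf '%s\n' subaru mercedes porche ferrari audi mg-motors BMW \
    ferrari toyota lexus mercedes VW > cars_mg

out=$(awk -v start_re='mercedes|mg-motors' -v end_re='ferrari' '
    # Pass 1 (first copy of the file): count the end markers.
    FNR == NR { if ($0 ~ end_re) total++; next }
    # Pass 2: a start marker opens a block.
    $0 ~ start_re { inside = 1 }
    # Print only while an end marker is still ahead.
    inside && seen < total
    # An end marker closes the block and is counted as seen.
    $0 ~ end_re { inside = 0; seen++ }
' cars_mg cars_mg)
printf '%s\n' "$out"
```

Because the markers are matched with `$0 ~ variable`, each one can be any extended regular expression, so alternation such as mercedes|mg-motors works without touching the program text.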

score 1 · Accepted Answer · answered Nov 25 '15 at 12:54


awk 'a{a=a"\n"$0}/mercedes/{a=$0}/ferrari/{print a;a=""}' file
mercedes
porche
ferrari
mercedes
BMW
ferrari
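
One caveat with the awk one-liner above: when a ferrari line appears without a preceding mercedes (the first problem in the question), the buffer a is empty and print a emits a blank line. The following is a sketch of a guarded variant, laid out over several lines with comments. The variable names buf and out and the modified input are assumptions for illustration, not part of the original answer.

```shell
# Hypothetical input: a stray ferrari on the first line, and the
# last mercedes replaced with ferrari as described in the question.
out=$(printf '%s\n' ferrari mercedes porche ferrari audi mercedes BMW \
        ferrari toyota lexus ferrari VW |
    awk '
        # Inside a block: append the current line to the buffer.
        buf != ""  { buf = buf "\n" $0 }
        # Start marker: begin (or restart) the buffer with this line.
        /mercedes/ { buf = $0 }
        # End marker: print only if a block is actually open.
        /ferrari/ && buf != "" { print buf; buf = "" }
    ')
printf '%s\n' "$out"
```

With this input the guarded version prints only the two complete mercedes to ferrari blocks and no blank lines.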



sed -n '/mercedes/{:a;N;/ferrari/{p;b};ba}' file
mercedes
porche
ferrari
mercedes
BMW
ferrari

answered Nov 25 '15 at 12:54

bian


score 1 · Answer 3 · answered Nov 25 '15 at 20:40

This might work for you (GNU sed):

sed -n '/mercedes/!d;:a;/ferrari/p;//d;N;ba' file

Use sed's grep-like option -n to print only when requested. Delete every line that does not contain mercedes. Then, if the pattern space contains ferrari, print it and delete it. Otherwise, append the next line and test again.
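
For readers who find the one-liner dense, here is the same script written one command per line with comments. This is only a re-layout under the same GNU sed assumption as the answer; the sample file is recreated so the sketch is self-contained, and the variable name out is an addition.

```shell
# Recreate the sample file from the question.
printf '%s\n' subaru mercedes porche ferrari audi mercedes BMW \
    ferrari toyota lexus mercedes VW > cars

out=$(sed -n '
    # Not a start marker: discard the line and begin the next cycle.
    /mercedes/!d
    :a
    # The collected lines contain the end marker: print them.
    /ferrari/p
    # The empty regex // reuses /ferrari/: delete and begin the next cycle.
    //d
    # Otherwise append the next input line and test again.
    N
    ba
' cars)
printf '%s\n' "$out"
```

When the input ends while a block is still open (the trailing mercedes and VW), N finds no further line and sed exits; because of -n the unfinished block is not printed.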
