Delete text between 2 delimiters with AWK

Question

I have a file like this:

start of my file
Some lines
#start
other lines
blabla other lines
#end
end of my file

I would like to delete everything between #start and #end (#start and #end included) and export the result to a file.

Expected result:

start of my file
Some lines
end of my file

I know how to select the lines between the delimiters:

awk '/#start/,/#end/'

but I can't do the deletion.

edit : I don't agree with the closure of the question. The link given contains an answer with sed, which is not the purpose of my question since I am asking to do it with awk. Even if the result is the same, it is not an answer to my question.

What is the expected output if there is no `#end` line? Or is every `#start` line guaranteed to have a matching `#end` line? Can a `#start/#end` pair be embedded within another `#start/#end` pair, and if so, should the deletions be based on the inner or the outer pair? Can 2 sets of `#start/#end` pairs overlap each other, and if so, how should the lines be processed? Are you only interested in lines that start with `#start/#end`, or should we also process lines where there may be characters before the `#start/#end` strings (and if so, do we delete the entire lines or just part of them)? — markp-fuso, Sep 05 '22 at 17:55
I use this command to generate vhost and delete parts according to their usage. It's a personal thing so there will always be an `#end`. Assuming the number of lines is fixed, I could specify that n lines should be deleted after `#start`. I didn't think I could do this with `sed`, thanks for the solution. — Antoine, Sep 05 '22 at 18:01
`sed '/^#start/,/^#end/d' file` would do the job as shown in dupe link as well. — anubhava, Sep 05 '22 at 18:05
New dupe link provided that covers awk, sed and perl tools for this task. — anubhava, Sep 05 '22 at 18:42

tripleee · Accepted Answer · 2022-09-05T17:42:24.827

2

The expression

awk '/#start/,/#end/'

is shorthand for

awk '/#start/,/#end/ { print $0 }'

If the implied default action is not the one you want, spell out what you do want.

awk '/#start/,/#end/ { next } 1'

says to print all lines except those in the region. The 1 is another shorthand: a pattern that is true for every line and has no action, so the default action (print) applies. Lines inside the range reach next first, which discards the current input line and moves on to the next one, so they never get as far as the 1.
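Since the question also asks to export the result to a file, here is a minimal sketch of the whole round trip. It recreates the sample input from the question, applies the command above, and redirects the result. The file names input.txt and output.txt are assumptions chosen for illustration.

```shell
# Recreate the sample input from the question (input.txt is an assumed name).
printf '%s\n' \
  'start of my file' \
  'Some lines' \
  '#start' \
  'other lines' \
  'blabla other lines' \
  '#end' \
  'end of my file' > input.txt

# Skip every line in the #start..#end range; the pattern 1 prints the rest.
# Shell redirection writes the result to output.txt (also an assumed name).
awk '/#start/,/#end/ { next } 1' input.txt > output.txt

cat output.txt
```

Redirecting to a separate file and checking it before replacing the original is the cautious route; writing directly back to input.txt with > would truncate the file before awk reads it.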

edited Sep 05 '22 at 17:42

answered Sep 05 '22 at 17:34

tripleee


I get `awk: cmd. line:1: error: `continue' is not allowed outside a loop` with GNU Awk 5.1.0 – Arkadiusz Drabczyk Sep 05 '22 at 17:40

Maybe you meant `next`? – Arkadiusz Drabczyk Sep 05 '22 at 17:41
Duh, thanks, too much Python (or rather, probably, too little Awk) lately. – tripleee Sep 05 '22 at 17:42
Using `{next} 1` and `if` works. I select the `next` version as the answer as it is shorter and I think more efficient in terms of performance. – Antoine Sep 05 '22 at 17:53

score 1 · Answer 2 · answered Sep 05 '22 at 17:34


I tested this and it works:

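awk 'BEGIN { p = 1 }                  # p == 1 means printing is switched on
     {
      if (/^#start/) { p = 0 }         # entering the region: stop printing
      if (p == 1) { print }            # print only while outside the region
      if (/^#end/) { p = 1 }           # region is over: resume from the next line
     }' myfile.txt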
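The same flag idea can also be written with awk pattern-action rules instead of if statements. The following is a sketch of an equivalent form, not part of the original answer; the variable name skip and the output file name result.txt are assumptions.

```shell
# Recreate the sample input from the question as myfile.txt.
printf '%s\n' \
  'start of my file' \
  'Some lines' \
  '#start' \
  'other lines' \
  'blabla other lines' \
  '#end' \
  'end of my file' > myfile.txt

awk '/^#start/ { skip = 1 }    # entering the region: raise the flag
     !skip                      # no action given, so print while the flag is unset
     /^#end/   { skip = 0 }    # lower the flag after the #end line was skipped
    ' myfile.txt > result.txt

cat result.txt
```

The order of the three rules matters: the flag is raised before the print rule so that #start is dropped, and lowered after it so that #end is dropped as well.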

answered Sep 05 '22 at 17:34

Bill Karwin


Using `{next} 1` and `if` works. I select the `next` version as the answer as it is shorter and I think more efficient in terms of performance. – Antoine Sep 05 '22 at 17:54
Sure, whatever solution makes most sense to you, go for it. – Bill Karwin Sep 05 '22 at 17:56

markp-fuso · Answer 3 · 2022-09-05T18:16:57.773

A pair of sed ideas:

sed -n '/#start/,/#end/!p' file
sed    '/#start/,/#end/d'  file

Both generate:

start of my file
Some lines
end of my file

NOTE:

Keep in mind that if there is a #start line but no #end line, all of the answers (so far) will delete everything from #start to the end of the file.
If the requirement is to remove lines only when both #start and #end exist, this can be done, but it will require a bit more work.
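One possible way to do that extra work in awk is to buffer the lines of an open region and only throw them away once #end is actually seen. The following is a rough sketch under some assumptions: the markers sit on separate lines, regions are not nested, and a region is small enough to hold in memory. The names inside, buf, unbalanced.txt and cleaned.txt are made up for illustration.

```shell
# Test input: one complete region, then a #start with no matching #end.
printf '%s\n' \
  'keep 1' \
  '#start' \
  'drop me' \
  '#end' \
  'keep 2' \
  '#start' \
  'no closing marker' > unbalanced.txt

awk '
  /#start/ && !inside { inside = 1; buf = $0; next }   # open a region and start buffering
  inside {
    if (/#end/) { inside = 0; buf = "" }                # region closed: discard the buffer
    else        { buf = buf "\n" $0 }                   # still open: keep buffering
    next
  }
  { print }                                             # outside any region
  END { if (inside) print buf }                         # unclosed region: restore its lines
' unbalanced.txt > cleaned.txt

cat cleaned.txt
```

With this input, the complete region is removed while the trailing unmatched #start and the line after it are kept.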
