3

I have the following text in my file

---BEGIN TEXT---
any text1
anytext2
anytext3
---END TEXT---
---BEGIN TEXT---
any text4
any text5
---END TEXT---

I want to remove the 2nd text block, from "---BEGIN TEXT---" to "---END TEXT---".

How can I do that with a Linux command?

So that my file will contain only:

---BEGIN TEXT---
any text1
anytext2
anytext3
---END TEXT---

I know how to remove the 1st block with the following command:

sed -n '/BEGIN TEXT/,/END TEXT/{p;/PAT2/q}' file.txt

How can I modify my sed command to remove the 2nd block and not the first? Or should I use another command, like awk?

myradio
MOHAMED

  • If there are no further blocks, you can reverse the input file, remove the first block and then reverse again... awk would be better suited here – Sundeep Jul 02 '18 at 14:22
  • Can there be text before, after and in the middle of these 2 blocks? – anubhava Jul 02 '18 at 14:25

6 Answers

4

Here's a modified sample to demonstrate a generic solution:

$ cat ip.txt 
foobaz
---BEGIN TEXT---
block 1
any text
---END TEXT---
1234567
---BEGIN TEXT---
block 2
any text
---END TEXT---
helloworld
---BEGIN TEXT---
block 3
any text
---END TEXT---
42424242

To remove only the second block:

$ awk -v b=2 '/BEGIN TEXT/{f=1; c++} !(f && c==b); /END TEXT/{f=0}' ip.txt 
foobaz
---BEGIN TEXT---
block 1
any text
---END TEXT---
1234567
helloworld
---BEGIN TEXT---
block 3
any text
---END TEXT---
42424242
  • -v b=2 sets the number of the block to be removed
  • /BEGIN TEXT/{f=1; c++} sets the flag and increments the counter when the starting regex matches
  • /END TEXT/{f=0} clears the flag when the ending regex matches
  • !(f && c==b) prints the input record unless the flag is set and the current block is the one specified by the variable b
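
The same flag-and-counter logic can be wrapped in a small shell function for reuse. This is only a sketch: the function name remove_block and its argument order are hypothetical, and it assumes the markers are matched by the regexes BEGIN TEXT and END TEXT, as in the command above.

```shell
# remove_block N FILE: print FILE without its Nth BEGIN/END block.
# (Hypothetical helper name; the awk program is the same as above.)
remove_block() {
    awk -v b="$1" '
        /BEGIN TEXT/ { f = 1; c++ }   # entering a block: set flag, count it
        !(f && c == b)                # print unless inside block number b
        /END TEXT/   { f = 0 }        # leaving a block: clear flag
    ' "$2"
}
```

Since the function writes to standard output, one way to update the file is to redirect to a temporary file and move it over the original, for example: remove_block 2 ip.txt > ip.tmp && mv ip.tmp ip.txt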


Sundeep
4

With GNU awk for multi-char RS and RT:

$ awk -v RS='---END TEXT---\n' '{ORS=RT} NR==1' file
---BEGIN TEXT---
any text1
anytext2
anytext3
---END TEXT---

$ awk -v RS='---END TEXT---\n' '{ORS=RT} NR!=1' file
---BEGIN TEXT---
any text4
any text5
---END TEXT---

$ awk -v RS='---END TEXT---\n' '{ORS=RT} NR==2' file
---BEGIN TEXT---
any text4
any text5
---END TEXT---
Ed Morton
3

Instead of sed you can use awk:

awk '/BEGIN TEXT/{found++} found==1{print $0}' yourfile

awk processes the file line by line. Here we test whether the current line contains BEGIN TEXT; if it does, we increment the found variable. In the next block we print the line (print $0) if found is equal to 1.

If the file is large and we want to stop processing once found is greater than 1, we can add a block that exits:

awk '/BEGIN TEXT/{found++} found==1{print $0} found>1{exit 0}' yourfile
JNevill
2

Using GNU awk and multi-line record:

awk -v RS='---END TEXT---' 'NR==1{print $0 RT}' file

RS is the record separator, set to the end of the block.

NR is the number of the current record. In this case we only want the first one.

RT is the record terminator: it stores the text that matched RS for the current record, and is printed together with the wanted block.
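
The same RS/RT idea can be turned around to drop a given record instead of keeping one. The sketch below is an illustration, not part of the original answer: the function name drop_record is hypothetical, it calls gawk explicitly because RT is a GNU awk extension, and it includes the trailing newline in RS so that no empty line is left behind. Note that any text between the previous END marker and the Nth BEGIN marker belongs to record N and is dropped with it.

```shell
# drop_record N FILE: print every END-terminated record except the Nth.
# Requires GNU awk: RT and a multi-character RS are gawk extensions.
# (drop_record is a hypothetical helper name.)
drop_record() {
    gawk -v n="$1" -v RS='---END TEXT---\n' '
        NR != n { printf "%s%s", $0, RT }   # re-attach the matched terminator
    ' "$2"
}
```

For the file in the question, drop_record 2 file prints only the first block.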

oliv
0

Simplistically, sed -i '/---END TEXT---/q;' txtfile, though I don't think that's actually the answer you wanted.

You could write a more complex sed script that does a lot of hold and pattern space manipulation, but meh.

If what you specifically wanted was to exclude the second set, here's a way using just bash. Not better than the awk answers, but I like to contribute variety.

c=0; while IFS= read -r line; do [[ "$line" = "---BEGIN TEXT---" ]] && (( c++ )); (( c != 2 )) && printf '%s\n' "$line"; done < txtfile

Or, formatted:

c=0
# IFS= and -r keep leading whitespace and backslashes in each line intact
while IFS= read -r line
do [[ "$line" = "---BEGIN TEXT---" ]] && (( c++ ))
   (( c != 2 )) && printf '%s\n' "$line"
done < txtfile
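
The loop above skips everything from the second BEGIN marker up to the next BEGIN marker (or the end of the file), so any text that follows the second block's END marker is dropped as well. If that text should be kept, a skip flag that is cleared at the END marker is one option. The following is a sketch under that assumption, written in POSIX sh so it also runs in bash; the function name remove_nth_block and its variables are illustrative only.

```shell
# remove_nth_block N FILE: print FILE without its Nth marked block.
# (Hypothetical helper; marker lines must match exactly.)
remove_nth_block() {
    target=$1 file=$2 n=0 skip=0
    # "|| [ -n "$line" ]" also handles a last line without a trailing newline
    while IFS= read -r line || [ -n "$line" ]; do
        if [ "$line" = "---BEGIN TEXT---" ]; then
            n=$((n + 1))
            if [ "$n" -eq "$target" ]; then skip=1; fi
        fi
        if [ "$skip" -eq 0 ]; then printf '%s\n' "$line"; fi
        # Stop skipping once the END marker of the target block has passed.
        if [ "$line" = "---END TEXT---" ]; then skip=0; fi
    done < "$file"
}
```

Called as remove_nth_block 2 txtfile, it prints the file without the second block and leaves the surrounding lines in place.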
Paul Hodges
0

This might work for you (GNU sed):

sed -r '/---BEGIN/{:a;N;/^---END/M!ba;x;s/^/x/;/^x{2}$/{x;d};x}' file

Gather up the lines between ---BEGIN and ---END and then increment a counter in the hold space (HS). If the counter is 2, delete the collection; otherwise print as normal.

potong