I would like to extract the multi-line text between horizontal delimiter lines and ignore anything before the first delimiter and after the last one.

For example, given this input:

Some text here before any delimiter
----------
Line 1
Line 2
Line 3
Line 4
----------
Line 1
Line 2
Line 3
Line 4
----------
Some text here after last delimiter

I would like to get:

Line 1
Line 2
Line 3
Line 4


Line 1
Line 2
Line 3
Line 4

How can I do this with awk or sed, using a regex? Thanks.

John Doe
  • Surely you don't mean `sed '/^-*$/d'` – Dennis Williamson Jul 07 '12 at 05:06
  • Can you clarify "before and after the delimiter"? – Ray Toal Jul 07 '12 at 05:07
  • Delete up to the first delimiter, and from the last delimiter to the end, and turn any delimiters in between into a random amount of newlines? – tripleee Jul 07 '12 at 06:51
  • @RayToal I have edited my example. Basically I have text at the start and end of the output file which I don't really need. I just need the output in between the delimiters. Thanks. – John Doe Jul 07 '12 at 07:15
  • I see, but this can't be done in a pure line-oriented awk/sed style script because you don't know when you have reached the _last_ delimiter until you have read the whole file. Are you able to slurp the whole file into memory? Because this is trivial to write in Python or Ruby if so. – Ray Toal Jul 07 '12 at 07:41
  • Does this answer your question? [Sed to extract text between two strings](https://stackoverflow.com/questions/16643288/sed-to-extract-text-between-two-strings) – tripleee Feb 09 '21 at 08:20

3 Answers

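You can try this with GNU awk (a regex RS and the RT variable are gawk extensions).

File a.awk:

# Treat every run of hyphens as the record separator (regex RS: gawk only).
# Note that this also splits on hyphens that occur inside ordinary lines.
BEGIN { RS = "-+" }

{
    # Skip record 1 (the text before the first delimiter) and the final
    # record, whose RT is empty because no delimiter follows it.
    if ( NR > 1 && RT != "" )
    {
        print $0
    }
}

Run: awk -f a.awk data_file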
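
The script above relies on a regex `RS` and on `RT`, which are GNU awk extensions. Where only a POSIX awk is available, buffering lines between delimiters is one possible alternative. What follows is a minimal sketch, not part of the original answer: the function name `extract_blocks` and the variables `seen`, `printed`, and `buf` are made up for illustration, and it assumes that a delimiter is any line consisting solely of hyphens.

```shell
# extract_blocks [FILE...]: print the text found between delimiter lines.
# Hypothetical helper; assumes a delimiter is a line made only of hyphens.
extract_blocks() {
  awk '
    /^-+$/ {
      if (seen) {                # an earlier delimiter opened this block,
        if (printed) print ""    # so separate it from the previous block
        printf "%s", buf         # and emit what was buffered since then
        printed = 1
      }
      seen = 1                   # from now on, lines belong to a block
      buf = ""
      next
    }
    seen { buf = buf $0 "\n" }   # buffer lines that follow a delimiter
  ' "$@"
}
```

It could be run as `extract_blocks data_file`. Lines after the last delimiter are buffered but never printed, so the script does not need to know in advance which delimiter is the final one. Blocks are separated by one blank line.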

nick

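If you can comfortably fit the entire file into memory, and if Perl is acceptable instead of awk or sed:

perl -0777 -pe 's/\A.*?\n-{10}\n//s;
    s/(.*\n)-{10}\n.*?\Z/$1/s;
    s/\n-{10}\n/\n\n\n/g' file >newfile

The key points here are the -0777 option (slurp mode, which reads the whole file as a single string) and the /s regex flag (which lets the dot match newlines). The three substitutions remove everything up to and including the first delimiter, remove everything from the last delimiter onward, and turn each remaining delimiter into two blank lines.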

tripleee

This might work for you (GNU sed, since the script uses \| and \n inside the regex):

sed '1,/^--*$/d;:a;$!{/\(^\|\n\)--*$/!N;//!ba;s///p};d' file
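
The one-liner is dense, so here is the same script spread over several lines with comments. This is a sketch that assumes GNU sed; the wrapper function name `between_delims` is hypothetical and only there to make the script easy to call.

```shell
# between_delims [FILE...]: the one-liner above, one command per line.
# Hypothetical wrapper; assumes GNU sed.
between_delims() {
  sed '
    # Delete everything from line 1 through the first delimiter line.
    1,/^--*$/d
    :a
    # Unless this is the last input line:
    $!{
      # append the next line if the buffer does not yet end in a delimiter,
      /\(^\|\n\)--*$/!N
      # and loop until it does (the empty regex reuses the previous one).
      //!ba
      # Strip the closing delimiter and print the collected block.
      s///p
    }
    # Discard whatever is left, including text after the last delimiter.
    d
  ' "$@"
}
```

It could be run as `between_delims file`. In this form consecutive blocks are printed back to back, with no blank line between them.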
potong