How to extract lines after finding a specific string

Question

My example text is:

AA BB  CC
DDD
process.get('name1')
process.get('name2')
process.get('name3')
process.get('name4')
process.get('name5')
process.get('name6')
EEE
FFF
...

I want to search for the string "process.get('name1')" first and, if it is found, extract the lines from "process.get('name1')" to "process.get('name6')".

How do I extract the lines using sed?

mauro · Answer 1 · 2016-01-19T14:31:03.877

3

This should work, and it uses sed as the OP requested:

$ sed -n "/^process\.get('name1')$/,/^process\.get('name6')$/p" file
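
To try the command against the sample from the question, a minimal sketch follows. The temporary file and the function name extract_block are assumptions made for illustration. Note that the range prints nothing if the start line never matches, and runs to the end of the file if the end line is missing.

```shell
# Recreate the sample text from the question in a temporary file.
sample=$(mktemp)
cat > "$sample" <<'EOF'
AA BB  CC
DDD
process.get('name1')
process.get('name2')
process.get('name3')
process.get('name4')
process.get('name5')
process.get('name6')
EEE
FFF
EOF

# Print only the lines from the name1 line through the name6 line.
# extract_block is a hypothetical helper name, not part of sed.
extract_block() {
    sed -n "/^process\.get('name1')$/,/^process\.get('name6')$/p" "$1"
}

extract_block "$sample"
```

With the sample above this should print the six process.get lines and nothing else.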

edited Jan 19 '16 at 14:31

answered Jan 19 '16 at 14:22


score 2 · Answer 2 · edited May 23 '17 at 12:15

sed is for simple substitutions on individual lines; for anything more interesting you should be using awk:

$ awk -v beg="process.get('name1')" -v end="process.get('name6')" \
    'index($0,beg){f=1} f; index($0,end){f=0}' file
process.get('name1')
process.get('name2')
process.get('name3')
process.get('name4')
process.get('name5')
process.get('name6')

Note that you could use a range in awk, just like you are forced to in sed:

awk -v beg="process.get('name1')" -v end="process.get('name6')" \
        'index($0,beg),index($0,end)' file

and you could use regexps after escaping metachars in awk, just like you are forced to in sed:

awk "/process\.get\('name1'\)/,/process\.get\('name6'\)/" file

but the first awk version above, which uses strings instead of regexps and a flag variable, is simpler (inasmuch as you don't have to figure out which chars are or aren't RE metacharacters), more robust, and more easily extensible in future.

It's important to note that sed CANNOT operate on strings, just regexps, so when you say "I want to search for a string" you should stop trying to force sed to behave as if it can do that.

Imagine your search strings are passed in to a script as positional parameters $1 and $2. With awk you'd just init the awk variables from them in the expected way:

awk -v beg="$1" -v end="$2" 'index($0,beg){f=1} f; index($0,end){f=0}' file

whereas with sed you'd have to do something like:

beg=$(sed 's/[^^]/[&]/g; s/\^/\\^/g' <<< "$1")
end=$(sed 's/[^^]/[&]/g; s/\^/\\^/g' <<< "$2")
sed -n "/^${beg}$/,/^${end}$/p" file

to deactivate any metacharacters present. See "Is it possible to escape regex metacharacters reliably with sed" for details on escaping RE metachars for sed.
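
To see what that escaping step produces, here is a small sketch. The helper name quote_re is hypothetical, and it pipes through printf instead of using a here-string so that it also runs in shells without `<<<`; the sed expression itself is the one shown above.

```shell
# Wrap every character except ^ in its own bracket expression, then
# backslash-escape each ^, so that sed matches the result literally.
# quote_re is a hypothetical helper name used only for this illustration.
quote_re() {
    printf '%s\n' "$1" | sed 's/[^^]/[&]/g; s/\^/\\^/g'
}

quote_re 'a.b*c'   # prints [a][.][b][*][c]
quote_re 'x^2'     # prints [x]\^[2]
```

The caret needs separate treatment because inside a bracket expression a leading ^ means negation, so `[^]` would not match a literal caret.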

Finally, as mentioned above, you COULD use a range expression with strings in awk:

awk -v beg="$1" -v end="$2" 'index($0,beg),index($0,end)' file

but I personally have never found that useful; there's always some slight requirements change that comes along and makes me wish I'd started out using a flag. See "Is a /start/,/end/ range expression ever useful in awk?" for details on that.
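
As one illustration of the kind of requirements change the flag approach absorbs, here is a sketch that prints only the lines between the two markers and leaves the marker lines out. The function name between_exclusive and the temporary file are assumptions for illustration; only the order of the three awk statements differs from the flag version above.

```shell
# Print the lines strictly between the first line containing $1 and the
# next line containing $2, excluding both marker lines.
# between_exclusive is a hypothetical helper name.
between_exclusive() {
    awk -v beg="$1" -v end="$2" '
        index($0, end) { f = 0 }   # clear the flag before the end line could print
        f                          # print while inside the block
        index($0, beg) { f = 1 }   # set the flag only after the start line is passed
    ' "$3"
}

# Small demonstration on a temporary file.
tmp=$(mktemp)
printf '%s\n' "DDD" "process.get('name1')" "process.get('name2')" \
    "process.get('name3')" "process.get('name6')" "EEE" > "$tmp"

between_exclusive "process.get('name1')" "process.get('name6')" "$tmp"
```

Doing the same with a range expression would need extra conditions to suppress the first and last lines of the range.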

because index() operates on strings instead of regexps so you don't need to escape RE metacharacters to use it so it's simpler and far more robust for a case like this where the OP clearly wants to just search for literal strings. — Ed Morton, Jan 19 '16 at 14:28
@Ed Morton Wouldn't this do? `awk '/name1/ {f=1}; f; /name6/ {f=0}' file` — user2138595, Jan 19 '16 at 22:05
@user2138595 No. In addition to what I said above about the OP just wanting to use strings, not regexps, imagine what your example would do if `name123` or `surname654` existed in the file. — Ed Morton, Jan 19 '16 at 22:22
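
To make the objection in the last comment concrete, here is a small sketch comparing the two approaches on a file that contains `name123` and `surname654`. The file contents and the function names regex_version and string_version are invented for illustration.

```shell
# A hypothetical input where partial regex matches cause trouble.
demo=$(mktemp)
cat > "$demo" <<'EOF'
name123
process.get('name1')
process.get('name2')
surname654
process.get('name6')
EOF

# The regex version from the comment: /name1/ also matches name123,
# and /name6/ also matches surname654.
regex_version() {
    awk '/name1/ {f=1}; f; /name6/ {f=0}' "$1"
}

# The literal-string version from the answer.
string_version() {
    awk -v beg="process.get('name1')" -v end="process.get('name6')" \
        'index($0,beg){f=1} f; index($0,end){f=0}' "$1"
}

echo "regex version:";  regex_version "$demo"
echo "string version:"; string_version "$demo"
```

On this input the regex version starts one line too early (at name123) and stops at surname654, so it never prints the process.get('name6') line, while the string version prints from process.get('name1') through process.get('name6').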
