I am new to writing Perl scripts, so I am asking for help with this. Below is a sample of my input file:

start pattern1

line1
Matching pattern can be here
line2
Matching Pattern can be here
line3
line4
...
end pattern1
.
start pattern1
line1
line2
start pattern1

start pattern1
line1
Matching pattern can be here
line2
start pattern1

From Perl, I need to extract the lines between start pattern1 ... end pattern1. For this I am calling awk:

 $cmd = q(awk '/start pattern1/,/end pattern1/' x.file);
 $n1 = system($cmd);

This works fine. Below is the output:

start pattern1
line1
**Matching pattern can be here**
line2
**Matching Pattern can be here**
...
end pattern1

But the files contain 1000s of lines like this, so I need only the blocks that contain the matching pattern, i.e. only those runs from the start pattern line to the end pattern line that have a matching line inside.

For this I tried:

 $cmd = q(awk '/start pattern1/,/end pattern1/' x.file | grep '$n2\|line4');
 $n1 = system($cmd);

But when I use the above command I don't see any output. Here $n2 contains a pattern that was grepped from another file.

If I put the matched pattern directly in place of $n2 it works fine. Why can't I use $n2 here?

Note: I am using this in a Perl script.

From the awk command I get all the lines between start pattern1 ... end pattern1, but I have 1000s of such blocks, so I need only those blocks from start pattern1 to end pattern1 that contain the matching pattern.

The expected output is:

start pattern1
line1
Matching pattern can be here
line2
Matching Pattern can be here
line3
line4
...
end pattern1 

 start pattern1
    line1
    Matching pattern can be here
    line2
    start pattern1
RCV
  • Could you please state the expected output clearly, along with how it is obtained, and let us know? – RavinderSingh13 Nov 30 '18 at 12:04
  • From the awk command I get all the lines between start pattern1 ... end pattern1, but I have 1000s of such blocks, so I need only those blocks from start pattern1 to end pattern1 that contain the matching pattern. Ex: my file has lines like these: start pattern1 line1 Matching pattern can be here line2 Matching Pattern can be here line3 line4 ... end pattern1 . start pattern1 line1 line2 start pattern1 start pattern1 line1 Matching pattern can be here line2 start pattern1 – RCV Nov 30 '18 at 12:09
  • If you're trying to do this from within a perl script then there is no need to involve awk or grep. Just use built-in perl functionality. – Tom Fenech Nov 30 '18 at 12:11
  • @RamCharan, comments are NOT meant for explanation of question, please update your question with proper details and let us know then. – RavinderSingh13 Nov 30 '18 at 12:18
  • @RavinderSingh Sorry about that, I have updated the question with the expected details. Thanks for pointing that out. – RCV Nov 30 '18 at 12:21
  • @Tom Fenech Can you help me with how to call the built-in functions? – RCV Nov 30 '18 at 12:22
  • The answer to the question you asked is that `q()` doesn't interpolate variables, but this whole code is terrible and should be thrown away. – melpomene Nov 30 '18 at 12:32
  • But grepping this way makes my work easier and saves a lot of time @melpomene – RCV Nov 30 '18 at 12:38
  • @RamCharan if you need to update a perl script, you need to learn perl. Calling system() from any language just because you don't know the syntax is bad practice. – Corentin Limier Nov 30 '18 at 13:01
  • awk | grep is an anti-pattern. Invoking awk inside perl is an anti-pattern. Invoking awk piped to grep inside perl is anti-matter and may lead to the destruction of the known universe. – William Pursell Nov 30 '18 at 13:47
  • It's amazing how often some version of this question pops up. I think I've near-answered it a few times, lol. Search the site, man. :) [Here](https://stackoverflow.com/questions/34976613/search-for-text-between-two-patterns-with-multiple-lines-in-between)'s the first hit I found. Check the manuals for [`sed`](https://www.gnu.org/software/sed/manual/sed.html) & [`awk`](https://www.gnu.org/software/gawk/manual/html_node/index.html#SEC_Contents) and definitely for [`perl`](https://perldoc.perl.org/index-tutorials.html). – Paul Hodges Nov 30 '18 at 14:27
  • "Invoking awk piped to grep inside perl is anti-matter and may lead to the destruction of the known universe." - [William Pursell](https://stackoverflow.com/users/140750/william-pursell), you are my new favorite person. :) – Paul Hodges Nov 30 '18 at 14:28
  • "Invoking awk piped to grep inside perl..." - **William Pursell**, and wrap all into a Python script for good measure. – karakfa Nov 30 '18 at 16:17
  • @PaulHodges Wow! That's high praise. Thank you. – William Pursell Nov 30 '18 at 19:07
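
To illustrate the `q()` point raised in the comments, here is a minimal sketch. The value assigned to `$n2` is a made-up placeholder (in the question it comes from another file), and `$single` / `$double` are names chosen only for this example.

```perl
use strict;
use warnings;

# Hypothetical value; in the question $n2 is read from another file.
my $n2 = 'Matching pattern';

# q() behaves like single quotes: $n2 is kept as literal text.
my $single = q(grep '$n2');

# qq() behaves like double quotes: $n2 is replaced by its value.
my $double = qq(grep '$n2');

print "$single\n";   # prints: grep '$n2'
print "$double\n";   # prints: grep 'Matching pattern'
```

With `q()`, the shell receives the literal text `$n2`, which the single quotes in the shell command keep from being expanded, so grep looks for the wrong string. Switching to `qq()` would interpolate the variable, but any `$` or `@` meant for awk or the shell would then need escaping, which is one more reason to stay inside Perl.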

1 Answer

No need to summon awk from within perl since perl is much more powerful than awk.

It's not clear to me whether you want every line between start pattern1 and end pattern1 if there is at least one match inside, or just the matching lines.

If you want every line between start and end whenever the block contains a match:

my @blocks = join("",<>)=~/start pattern1\s*(.*?)end pattern1/gsi;
print grep /matching pattern/i, @blocks;

If you want every line INCLUDING the start and end pattern1 lines:

my @blocks = join("",<>)=~/(start pattern1.*?end pattern1\s*)/gsi;
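
Since the pattern in the question lives in a variable (`$n2`), a self-contained sketch of the same idea might look like the following. The sub name `matching_blocks`, the sample text, and the value of `$n2` are assumptions made for illustration; `\Q...\E` is used so the variable is matched as literal text rather than as a regex.

```perl
use strict;
use warnings;

# Return every start..end block (delimiters included) that contains $needle.
sub matching_blocks {
    my ($text, $needle) = @_;
    # /g in list context returns one captured block per match; /s lets . match newlines.
    my @blocks = $text =~ /(start pattern1.*?end pattern1\n?)/gs;
    # \Q...\E quotes regex metacharacters in $needle; /i ignores case.
    return grep { /\Q$needle\E/i } @blocks;
}

# Made-up sample input with two blocks; only the first contains the pattern.
my $text = <<'END';
start pattern1
line1
Matching pattern can be here
end pattern1
start pattern1
line1
line2
end pattern1
END

my $n2 = 'Matching pattern';   # hypothetical value
my @hits = matching_blocks($text, $n2);
print @hits;
```

Only the first block should be printed here, because the second one has no matching line.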

If you want just the lines with /matching pattern/ between start and end:

print grep { /start pattern1/i../end pattern1/i and /matching pattern/i } <>;

Put that inside a file program.pl and run:

perl program.pl inputfile > outputfile

Some explanation might be needed: join("",<>) returns the whole input file as one multi-line string. The /gsi modifiers mean: g matches globally, so that the @blocks array will contain what is matched by the parentheses, one array element for each match (without g, @blocks would get just the first block of lines); s means that . also matches newline characters, which it otherwise wouldn't; and i ignores case (it sees no difference between a-z and A-Z). The question mark in .*? means non-greedy matching: match up to the next end pattern1, not the last one. The <> returns the lines of the input file (the args after perl program.pl) as a list of strings. The .. is the flip-flop operator, which becomes true when the left side is true, stays true until the right side is true, and is then false until the left side is true again, and so on.
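
For large files, the flip-flop operator can also be used to collect whole blocks line by line instead of slurping the file. The following is only a sketch under a few assumptions: every block is closed by an end pattern1 line, and the names `blocks_with_match`, `@buffer`, and `@kept` are invented for this example. It relies on the documented behavior that the value returned by `..` on the closing line ends in `E0`.

```perl
use strict;
use warnings;

# Collect start..end blocks line by line and keep those that contain $needle.
sub blocks_with_match {
    my ($needle, @lines) = @_;
    my (@kept, @buffer);
    for my $line (@lines) {
        # Scalar-context flip-flop: false outside a block, a sequence number
        # inside it, and a value ending in "E0" on the closing line.
        my $state = ($line =~ /start pattern1/) .. ($line =~ /end pattern1/);
        next unless $state;
        push @buffer, $line;
        if ($state =~ /E0$/) {
            # Block finished: keep it only if some line matches $needle.
            push @kept, @buffer if grep { /\Q$needle\E/i } @buffer;
            @buffer = ();
        }
    }
    return @kept;
}

# Made-up sample input, split into lines with their newlines kept.
my @lines = split /^/, <<'END';
start pattern1
line1
Matching pattern can be here
end pattern1
start pattern1
line1
end pattern1
END

my @kept = blocks_with_match('matching pattern', @lines);
print @kept;
```

Note that the flip-flop keeps its state between calls to the sub, so this sketch expects each call to end on a closed block.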

Kjetil S.
  • Thanks Kjetil, that info was very helpful. One more question: instead of passing the input file on the command line, can we set it inside the code above? – RCV Dec 20 '18 at 07:17
  • @RCV → Yes, just add `open my $fh, "filename.txt" or die;` as the first line. And then replace `<>` with `<$fh>`. (*fh* is short for filehandle) – Kjetil S. Dec 27 '18 at 22:25
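
Putting that last comment together with the answer, a sketch with the file name set inside the program could look like this. It uses the three-argument form of `open` rather than the two-argument form from the comment; the sub name `blocks_from_file` is an assumption made for this example, and `filename.txt` is the placeholder name from the comment.

```perl
use strict;
use warnings;

# Read $path and return the start..end blocks that contain a matching line.
sub blocks_from_file {
    my ($path) = @_;
    # Three-argument open with an explicit read mode; $! holds the system error.
    open my $fh, '<', $path or die "Cannot open $path: $!";
    my $text = do { local $/; <$fh> };   # undefined $/ makes <$fh> read the whole file
    close $fh;
    my @blocks = $text =~ /(start pattern1.*?end pattern1\s*)/gsi;
    return grep { /matching pattern/i } @blocks;
}

my $file = 'filename.txt';   # placeholder name from the comment above
print blocks_from_file($file) if -e $file;
```

The `-e` check is only there so the sketch does nothing when the placeholder file is missing; in a real script you would probably let `die` report the problem.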