1

I am trying to copy to a text file any 'if' blocks from a C++ code base whose condition matches a certain pattern. Is this possible using some combination of grep, awk, sed, etc.?

Example

If I have files that contain code like:

//File1.cpp
if(/*matching-expression-1*/)
{
    //Code
}

//File2.cpp
if(/*non-matching-expression*/)
{
    if(/*matching-expression-2*/)
    {
        //Code
    }
}

//File3.cpp
if((/*matching-expression-3*/)
{
    if(/*non-matching-expression*/)
    {
        //Code
    }
}

I would like to get a result like:

//OutputFile.txt

File1.cpp:
if(/*matching-expression-1*/)
{
    //Code
}

File2.cpp:
if(/*matching-expression-2*/)
{
    //Code
}

File3.cpp:
if((/*matching-expression-3*/)
{
    if(/*non-matching-expression*/)
    {
        //Code
    }
}

I'm okay with the //Code block containing other matching or non-matching if blocks, even if that leads to repeated entries, and the tab indentation does not need to be preserved.

I have no trouble using grep to match the expressions I want, but that only gives me the lines containing the start of each 'if' block. That's a good start, but I am unsure how to proceed.

Any help at all would be appreciated!

AMackie

4 Answers

2

Assuming all your code is formatted the same way as in your question and there are no stray brackets (say, in strings or something), this should work:

perl -ne 'if(/if\(STRING\)/){$_.=<>;$b+=/{/g;}if($b > 0){print;$b+=/{/g;$b-=/}/g}' file

Replace STRING with the expression you want to search for.

123
  • This did it! Not familiar with perl so I'll have to parse this properly, but I think I see what it's doing. I suspected I'd have to track each { and } until I found the closing brace. Is there a straightforward way to also print the filename once at the start of each match `if` expression? – AMackie Apr 04 '17 at 14:01
  • @AMackie Yeah it basically just adds or subtracts from a counter if a bracket is encountered and prints if the counter is above 0. Does changing `print` to `print "$ARGV:$_"` do what you want? – 123 Apr 04 '17 at 14:10
1

In awk:

$ awk '/\*matching-expression/{f=1}f{c+=sub(/{/,"{");if(sub(/}/,"}") && --c==0)f=0;print $0}' file
if(/*matching-expression-1*/)
{
    //Code
}
    if(/*matching-expression-2*/)
    {
        //Code
    }
if((/*matching-expression-3*/)
{
    if(/*non-matching-expression*/)
    {
        //Code
    }
}

Explained:

/\*matching-expression/ { f=1 }   # flag up at match
f {                               # when flag is up
    c+=sub(/{/,"{")               # { increments counter
    if(sub(/}/,"}") && --c==0)    # } decrements counter; when it hits 0
        f=0                       # flag down
    print $0                      # print when flag is up
}

It expects each { and } to be on its own line. Well, there can be other stuff on that line, but only one { or }. @123's caveat about stray brackets applies here too; handling them would require parsing quotes around brackets, I assume. Probably still doable, I reckon.
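The one-brace-per-line restriction can be relaxed by counting braces with gsub, which returns the number of replacements it made. Below is a minimal sketch of that idea, not a tested drop-in: the function name extract_if_blocks and the variable names are made up for illustration, and the pattern is the placeholder from the question. It also prints the file name before each match. Stray braces in strings or comments will still confuse it.

```shell
# Sketch: print every if-block whose opening line matches PATTERN,
# preceded by the name of the file it came from.
# extract_if_blocks is a hypothetical helper name, not an existing tool.
# usage: extract_if_blocks PATTERN FILE...
extract_if_blocks() {
    pat=$1; shift
    awk -v pat="$pat" '
        FNR == 1 { inblock = 0 }                 # new file: reset state
        !inblock && $0 ~ pat {                   # opening line of a matching block
            inblock = 1; depth = 0; opened = 0
            print FILENAME ":"
        }
        inblock {
            print
            line = $0                            # count on a copy, keep $0 intact
            opens  = gsub(/[{]/, "", line)       # number of { on this line
            closes = gsub(/[}]/, "", line)       # number of } on this line
            if (opens > 0) opened = 1
            depth += opens - closes
            if (opened && depth <= 0) {          # outermost brace closed
                inblock = 0
                print ""                         # blank line between blocks
            }
        }
    ' "$@"
}
```

Called as extract_if_blocks 'if *[(].*[*]matching-expression' *.cpp > OutputFile.txt, it should produce output close to what the question asks for. The [*] in the pattern keeps /*non-matching-expression*/ from matching. A matching if nested inside an already matching block is printed as part of the outer block and is not reported a second time.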

James Brown
  • could you please explain this? – Ashish K Apr 04 '17 at 13:27
  • This won't work, you never decrement c(well only if it is 1, which it won't be for embedded ifs). Any embedded if's will cause it to write the rest of the file out. – 123 Apr 04 '17 at 13:33
  • The `sub` was originally `c-=sub`. Back to the drawing table. BRB. Make that BBIW, I seem to have a meeting... – James Brown Apr 04 '17 at 13:35
  • Missing a semicolon after f=0, but yeah looks like it should – 123 Apr 04 '17 at 13:46
  • This worked as well as @123 's answer. I'm inclined to mark this as the accepted answer since you broke it down so well and I did mention awk specifically in the question. – AMackie Apr 04 '17 at 14:20
0

With a for loop and sed the following would work:

for var in *.cpp; do echo "$var:"; sed -n '/\*matching-expression/,/}/p' "$var"; printf '\n\n'; done > outputfile

This takes each file, prints the file name followed by ":", and then uses sed to print the section of code from the matching expression to the FIRST }, writing the result to outputfile.

The only issue is that, for nested blocks, the output may be missing closing braces.

In order to overcome this, you could add:

left=$(grep -c "{" outputfile)     # number of lines containing {
right=$(grep -c "}" outputfile)    # number of lines containing }
diff=$((left - right))
varb=""; for ((i=0; i<diff; i++)); do varb=$varb"}"; done
echo "$varb" >> outputfile

Here we count the lines containing a left brace and store the number in left, count the lines containing a right brace and store it in right, and place the difference between the two in diff. This diff variable is then used to build a string (varb) with the missing closing braces, which is appended to outputfile so that the braces balance. Note that the braces are appended once at the end of outputfile, not after each block.

-2

Look at this answer, it might help you along.

For example (without using Perl regex):

grep -zo "if\\s*(condition)\\s*{[^}]*}" File1.cpp
Community
ivanhoe