3

I need to write a shell script that inserts a string into my .c files after a fairly complicated pattern.

The pattern is: )\n{,

without any tabs/spaces between the \n and the {. That's to say that I want to match with a { located in the first column of my file (this in order to ONLY match with )\n{ following function declarations in C, not the ones following loops or conditions).

void some_function_declaration(char var1, char var2)
{

I have read manuals and forums and still cannot figure out how to write the pattern or find the corresponding regex. The desired output would be:

void some_function_declaration(char var1, char var2)
{
time_exe(__func__, cl(clock()));
rest of the function...

What follows is what I've come up with so far; none of it works.

Try 1

sed -i '' '/)$\n{/a time_exe(__func__, cl(clock()));' >> list_func2.c

Try 2

sed -i '' '/)\n{ /a time_exe(__func__, cl(clock()));' >> list_func2.c

Try 3

sed -i '' '/)/\n{/a time_exe(__func__, cl(clock()));' >> list_func2.c

I would be very glad to hear your recommendations on this matter.

aboitier
  • You can't do this with a simple sed regex pattern because sed is line-oriented -- the file is split on newlines so, *without doing some sed programming*, you'll never see a newline in the sed pattern space. And, unless you're willing to endure it, sed programming can be painful. – glenn jackman Jan 16 '19 at 16:05

4 Answers

3

You should avoid using sed here: it is a line-based tool and doesn't handle this kind of task well.

If you insist on using sed and have GNU sed, you can use the -z / --null-data option, which reads NUL-separated records rather than linefeed-separated ones. For a text file without NUL bytes, that means the whole file is read as a single record, so the )\n{ pattern works as you would expect:

$ { echo "line1)"; echo "{line2"; } | sed -z 's/)\n{/X/g'
line1Xline2

As this requires loading the whole file into memory, expect poor performance on huge files.


If you like unmaintainable gibberish, you can solve this problem using sed's lesser-known P, t and D commands:

sed '/)$/{N;s/)\n{/) {\n\ttime_exe(__func__, cl(clock()));/;t;P;D}'


This works by loading an additional line into the pattern space (N) whenever a line ending in ) is encountered, then trying the substitution against the two-line pattern space. If the substitution succeeds, t branches to the end of the script and the result is printed. If it fails, P prints the first line and D removes it from the pattern space, leaving the second line to be used as the first line of the next iteration.

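Using /first line pattern/{N;s/whole pattern/replacement/} is often good enough, but it can fail because N consumes a line that is then never tested against the first-line pattern. For example, when two consecutive lines end in ) and a { follows the second one, the second line is swallowed as the "next line" of the first and the match is missed.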
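To make that failure mode concrete, here is a small sketch comparing the simple N form with the P/D form on a made-up input where a line ending in ) directly precedes the declaration. The INIT_MODULE(demo) line and the function f are hypothetical, and the \n and \t escapes in the replacement, as well as the one-line t;P;D} syntax, assume GNU sed as in the command above.

```shell
# Hypothetical sample: a line ending in ")" sits right before the declaration.
sample=$(printf '%s\n' 'INIT_MODULE(demo)' 'void f(void)' '{' '}')

# Simple form: while handling "INIT_MODULE(demo)", N swallows "void f(void)",
# so that line is never tested against /)$/ and the match is missed.
simple=$(printf '%s\n' "$sample" |
  sed '/)$/{N;s/)\n{/) {\n\ttime_exe(__func__, cl(clock()));/;}')

# P/D form: on a failed substitution, P prints the first line and D drops it,
# so "void f(void)" becomes the first line of the next cycle and is tested.
rolling=$(printf '%s\n' "$sample" |
  sed '/)$/{N;s/)\n{/) {\n\ttime_exe(__func__, cl(clock()));/;t;P;D}')

printf 'simple:\n%s\n\nrolling:\n%s\n' "$simple" "$rolling"
```

With this input, the simple form should leave the text unchanged, while the P/D form should rewrite the declaration.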

Aaron
  • Would you have recommended using awk instead ? The -z is not recognized on my distribution, UNIX Mac OS. Thank you for your insightful and detailed answer. – aboitier Jan 16 '19 at 16:16
  • I don't think `awk` would be great, it's also line-based (or at least record-based) and doesn't help much with regex use. I'd recommend using another regex search/replace tool, I think glenn jackman's suggestion to use `perl` is a good one if available and files aren't huge. Otherwise maybe your text editor or the IDE you use would include a PCRE regex engine (for instance I might use Notepad++ on Windows), and maybe specialized regex tools might optimize for huge files. – Aaron Jan 16 '19 at 16:20
  • If I correctly remember, sed (the good old default sed, not GNU one) has a notion of pattern space and hold space, and command to exchange them. That means that it should be possible to process lines by rolling pairs, at the expense of an unreadable and unmaintainable sed script. I used to commit those kind of things when Python was not available. – Serge Ballesta Jan 16 '19 at 16:22
  • @SergeBallesta right, and I think using `D` should make it possible without even having to use the hold space. I'll see if I can produce one of those good old lines of worse-than-perl gibberish ;) – Aaron Jan 16 '19 at 16:29
  • @SergeBallesta it's not even as terrible as I thought it'd be ! – Aaron Jan 16 '19 at 16:41
2

I agree with @Aaron, but if you still want exactly sed, look at this:

$  cat /tmp/del.txt
void some_function_declaration(char var1, char var2)
{
  ...
}
void enother_function_declaration(char var1, char var2)
{
  ...
}

And applying sed:

$ cat /tmp/del.txt | sed ':a;N;$!ba;s/)\n{/)\{\ntime_exe(__func__, cl(clock()));/g'
void some_function_declaration(char var1, char var2){
time_exe(__func__, cl(clock()));
  ...
}
void enother_function_declaration(char var1, char var2){
time_exe(__func__, cl(clock()));
  ...
}

I think this is what you wanted.

UPD

Let me explain.

A more cross-platform-compatible syntax is:

sed -e ':a' -e 'N' -e '$!ba' -e 's/)\n{/)\n{ ... ;/g'

where

  • ':a' - creates a branch label (named a) that can be jumped to later
  • 'N' - appends the next line to the pattern space (with \n in between)
  • '$!ba' - jumps back to label a unless the current line is the last line of the input
  • 's/)\n{/)\n{ ... ;/g' - performs a global substitution on the single buffer composed of all the lines and their \ns
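As a rough sketch of those four steps applied to the actual task, the following runs the slurp loop on a small hypothetical file. It writes the newline in the replacement as a backslash followed by a real line break, which is the form POSIX specifies for sed and should therefore also behave on seds that do not expand \n in the replacement. The file content and the variable names are made up for the demonstration.

```shell
# Hypothetical input file, created only for this demonstration.
tmp=$(mktemp)
printf '%s\n' 'void f(char a)' '{' '  if (a)' '  { g(); }' '}' > "$tmp"

# Slurp the whole file into the pattern space, then substitute once.
# The backslash at the end of the line escapes a literal newline in the
# replacement, so no \n escape is needed on the right-hand side.
slurped=$(sed -e ':a' -e 'N' -e '$!ba' -e 's/)\n{/)\
{ time_exe(__func__, cl(clock()));/g' "$tmp")

printf '%s\n' "$slurped"
rm -f "$tmp"
```

The indented { after if (a) is left alone because there are spaces between the newline and the brace, which matches the requirement that only a { in the first column counts.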
viktorkho
  • Ah, I thought for a moment I was wrong about the behaviour of `N`, but you're just reimplementing `-z`, reading the whole file before applying the substitution. Still, that's useful if OP is using an implementation that doesn't have `-z` – Aaron Jan 16 '19 at 15:34
  • Honestly, I don't know any solution other than reading the entire file when I want to match newlines with sed. – viktorkho Jan 16 '19 at 15:39
  • Sometimes you're able to test a first line and consume only the next (few) lines if you match the beginning of the pattern, but as I've explained in the last § of my answer it would probably be a bad idea in this case. I thought you had gone for this solution and was surprised to test it successfully until I properly read and understood your `sed` command. – Aaron Jan 16 '19 at 15:42
  • Thank you for your detailed answer. It definitely helped me grasp some more fundamentals about sed. Hope this will also help other people. – aboitier Jan 16 '19 at 16:35
0

This might work for you (GNU sed):

sed ':a;/)$/!b;n;/^{/!ba;c{ time_exe(__func__, cl(clock()));' file

If the current line does not end with ), skip any further processing of it. Otherwise, print the current line and read in the next. If that line does not begin with a {, restart the check from the top with that line; otherwise change it to the desired format.

The desired format may also be appended or inserted, see below:

sed ':a;/)$/!b;n;/^{/!ba;a{ time_exe(__func__, cl(clock()));' file

Or,

sed ':a;/)$/!b;n;s/^{/& time_exe(__func__, cl(clock()));/;Ta' file
potong
  • This fails when the pattern is preceded by a line that ends in `)`, check [this test](https://ideone.com/4wVL5t) or my answer's last § for a detailed explanation. – Aaron Jan 16 '19 at 16:10
  • Thanks for your answer and precise comments. I should have been more explicit on the fact that using sed was not compulsory because I don't have GNU sed. Also I tried your answer and it didn't work in all cases. Would sometimes add `n...` after my function declaration. Hope this will help other people though. – aboitier Jan 16 '19 at 16:34
  • @Aaron you are right, added a check for such a condition which redoes the initial match. Thanks – potong Jan 17 '19 at 10:59
  • @aboitier In these cases the solutions provided do not depend on GNU sed. When I provide an answer, I use the sed that is present on my machine and so as to avoid potential misunderstandings (especially when answers have to be ameliorated because of after thoughts) I document it with `GNU sed`. – potong Jan 17 '19 at 11:04
0

Perl can be good for this:

$ cat file.c
void some_function_declaration(char var1, char var2)
{
    if (a)
    { b; }
}
void func2()
{
    //
}
$ perl -0777 -pe 's {\)\n\{} {$& some other stuff;\n}g' file.c
void some_function_declaration(char var1, char var2)
{ some other stuff;

    if (a)
    { b; }
}
void func2()
{ some other stuff;

    //
}

Same caveats as Aaron's answer, since this reads the whole file into memory.

glenn jackman