Can sed replace words inside a pattern-matched substring in one line?

Question

Original line in file sed.txt:

outer_string_PATTERN_string(PATTERN_And_PATTERN_PATTERN_i)PATTERN_outer_string(i_PATTERN_inner)_outer_string

I only need to replace PATTERN with pattern inside the brackets. It is not just lowercasing; the replacement could be any other word.

Expected result:

outer_string_PATTERN_string(pattern_And_pattern_pattern_i)PATTERN_outer_string(i_pattern_inner)_outer_string

I can use the pattern ([^)]*) to find the substrings in which some words should be replaced. But I can't use this pattern to restrict the substitution to those substrings: the command below replaces every PATTERN in the whole line with pattern.

:/tmp$ sed 's/([^)]*)/---/g' sed.txt 
outer_string_PATTERN_string---PATTERN_outer_string---_outer_string

:/tmp$ sed '/([^)]*)/s/PATTERN/pattern/g' sed.txt 
outer_string_pattern_string(pattern_And_pattern_pattern_i)pattern_outer_string(i_pattern_inner)_outer_string

I also tried to use the regex group in sed to capture and replace the words, but I can't figure out the command.

Can sed implement that? And how to achieve that? THX.

I don't understand why someone voted this question down; so weird. I solved it myself, and in the correct way. Thanks to https://stackoverflow.com/questions/1251999/how-can-i-replace-a-newline-n-using-sed – Victor Lee, Sep 17 '21 at 13:55

anubhava · Answer 1 · 2021-09-17T09:53:50.070

0

As an alternative, it is easier to do this in GNU awk with an RS that matches each (...) substring:

awk -v RS='\\([^)]+)' '{gsub(/PATTERN/, "pattern", RT); ORS=RT} 1' file

outer_string_PATTERN_string(pattern_i_pattern_pattern_i)PATTERN_outer_string(i_pattern_inner)_outer_string

Steps:

RS='\\([^)]+)' matches each (...) string as the record separator; GNU awk stores the matched text in RT
gsub then replaces PATTERN with pattern in the matched text, i.e. RT
ORS=RT sets ORS to the modified RT
1 prints each record to stdout

Another alternative uses a lookahead assertion in a Perl regex:

perl -pe 's/PATTERN(?=[^()]*\))/pattern/g' file

edited Sep 17 '21 at 09:53

answered Sep 17 '21 at 08:29

anubhava

Thanks. There are other ways to solve this, like awk, but I am just trying to use sed. – Victor Lee Sep 18 '21 at 06:59
You can use sed, but sed is not suitable for problems like this. Just see how simple, maintainable and efficient these 2 solutions are compared to sed – anubhava Sep 18 '21 at 07:48

score 0 · Answer 2 · answered Sep 17 '21 at 09:20

Can sed implement that?

Yes. But you do not want to do it in sed. Use another programming language, like Python, Perl, or awk.

how to achieve that?

Implementing non-greedy matching is not simple in sed. In general, it consists of:

take a chunk of the input
process the chunk
put it in hold space
shuffle hold space with pattern space: separate what has already been processed from what has not
repeat
shuffle with hold space
output

Anyway, the following script:

#!/bin/bash
sed <<<'outer_string_PATTERN_string(PATTERN_i_PATTERN_PATTERN_i)PATTERN_outer_string(i_PATTERN_inner)_outer_string' '
    :loop;
    /\([^(]*\)\(([^)]*)\)\(.*\)/{
        # Lowercase the second part.
        s//\1\L\2\E\n\3/;
        # Mix with hold space.
        G;
        s/\(.*\)\n\(.*\)\n\(.*\)/\3\1\n\2/;
        # Put processed stuff into hold space.
        h; s/\n.*//; x;
        # Process the other stuff again.
        s/.*\n//;
        bloop;
    };
    # Is hold space empty?
    x; /^$/!{
        # Pattern space has trailing stuff - add it.
        G; s/\n//;
        # We will print it.
        h;
        # Clear hold space
        s/.*//
    };x;
'

outputs:

outer_string_PATTERN_string(pattern_i_pattern_pattern_i)PATTERN_outer_string(i_pattern_inner)_outer_string

Thanks, I need some time to digest this. Yes, maybe I do not want to do it in `sed` if this is the only way to achieve that. – Victor Lee, Sep 17 '21 at 09:31
Does this only suit the example string? I have updated the question; maybe there is something you didn't consider? – Victor Lee, Sep 17 '21 at 09:50
Sure I didn't. And still, the principle will be the same, just a bit more tokenization and hold space shuffling. Instead of `s//\1\L\2\E\n\3/;` there has to be: first put `\([^(]*\)\)` into hold space, then remember `\(.*\)` in hold space as well, replace `PATTERN` with `pattern` in pattern space, restore `\(.*\)` and continue. – KamilCuk, Sep 17 '21 at 10:16
Hmm, I think this is hardcoded; it isn't general and doesn't solve the question. – Victor Lee, Sep 17 '21 at 10:47

score 0 · Answer 3 · answered Sep 17 '21 at 13:59

Solved by this:

:/tmp$ sed 's/(/\n(/g' sed.txt | sed 's/)/)\n/g' | sed '/([^)]*)/s/PATTERN/pattern/g' | sed ':a;N;$!ba;s/\n//g'
outer_string_PATTERN_string(pattern_And_pattern_pattern_i)PATTERN_outer_string(i_pattern_inner)_outer_string

put each (...) group on its own line
on the lines that contain (...), replace PATTERN with pattern
merge the lines back into one line
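
To see what the steps above do, here is a small sketch, assuming GNU sed (needed for "\n" in the replacement). It prints the intermediate lines after the split and then runs the whole pipeline. The helper names split_groups and replace_in_parens are made up for illustration.

```shell
# Sample line taken from the question.
line='outer_string_PATTERN_string(PATTERN_And_PATTERN_PATTERN_i)PATTERN_outer_string(i_PATTERN_inner)_outer_string'

# Hypothetical helper: only the first two sed stages, so every (...) group
# ends up on a line of its own.
split_groups() {
    sed 's/(/\n(/g' | sed 's/)/)\n/g'
}

# Hypothetical helper: the full four-stage pipeline from the answer,
# reading stdin instead of sed.txt.
replace_in_parens() {
    sed 's/(/\n(/g' | sed 's/)/)\n/g' | sed '/([^)]*)/s/PATTERN/pattern/g' | sed ':a;N;$!ba;s/\n//g'
}

printf '%s\n' "$line" | split_groups
# prints:
# outer_string_PATTERN_string
# (PATTERN_And_PATTERN_PATTERN_i)
# PATTERN_outer_string
# (i_PATTERN_inner)
# _outer_string

printf '%s\n' "$line" | replace_in_parens
```

One thing to keep in mind: the last stage deletes every newline, including the ones that were already in the input, so a file with several lines comes out as a single line. The pipeline as written only suits one-line input.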

Thanks to "How can I replace a newline (\n) using sed?"

answered Sep 17 '21 at 13:59

Victor Lee

That 4 step process is arguably not as efficient as doing it in a single command – anubhava Sep 17 '21 at 14:36

urznow · Accepted Answer · 2021-09-19T08:48:40.633

0

Can sed implement that?

It can be done using GNU sed and basic regular expressions (BRE):

sed '
s/)/)\n/g
:1
s/\(([^)]*\)PATTERN\([^)]*)\n\)/\1pattern\2/
t1
s/\n//g
' < file

where

1st s inserts a newline after each )
2nd s replaces the last (* is greedy) PATTERN inside ()s with pattern
t loops back if a substitution was made
3rd s strips all inserted newlines
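
The stages can be watched one at a time with a sketch like the following, assuming GNU sed as in the answer. The helper names split_after_paren and one_pass are made up for illustration. one_pass leaves out the t loop on purpose, so only a single PATTERN changes.

```shell
# Sample line taken from the question.
line='outer_string_PATTERN_string(PATTERN_And_PATTERN_PATTERN_i)PATTERN_outer_string(i_PATTERN_inner)_outer_string'

# Hypothetical helper: only the 1st s command, so each ")" ends a line.
split_after_paren() {
    sed 's/)/)\n/g'
}

# Hypothetical helper: the 1st and 2nd s commands without the t loop.
# Because [^)]* is greedy, the PATTERN that changes is the last one
# in the first (...) group.
one_pass() {
    sed -e 's/)/)\n/g' -e 's/\(([^)]*\)PATTERN\([^)]*)\n\)/\1pattern\2/'
}

printf '%s\n' "$line" | split_after_paren
# prints:
# outer_string_PATTERN_string(PATTERN_And_PATTERN_PATTERN_i)
# PATTERN_outer_string(i_PATTERN_inner)
# _outer_string

printf '%s\n' "$line" | one_pass
# prints:
# outer_string_PATTERN_string(PATTERN_And_PATTERN_pattern_i)
# PATTERN_outer_string(i_PATTERN_inner)
# _outer_string
```

Each further trip through the t loop replaces the next PATTERN to the left, and once a group has none left the match moves on to the next group.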

EDIT

2nd substitute command edited according to OP's suggestion since there is no need to match \n inside ().

edited Sep 19 '21 at 08:48

answered Sep 17 '21 at 21:33

urznow

Maybe in the basic regular expression's group the '\n' isn't needed and should be replaced with ')'; that would suit this question's case better. – Victor Lee Sep 19 '21 at 07:41
@VictorLee: You're right :) I edited my reply. – urznow Sep 19 '21 at 08:49
Thanks. One more thing: could I have a one-line version of your command? – Victor Lee Sep 19 '21 at 09:02
@VictorLee: `sed -e 's/)/)\n/g' -e ':1' -e 's/\(([^)]*\)PATTERN\([^)]*)\n\)/\1pattern\2/' -e 't1' -e 's/\n//g' < file` – urznow Sep 19 '21 at 09:21
