Match a string that contains a newline using sed

Question

I have a string like this one:

    #
    pap

which basically translates to \t#\n\tpap, and I want to replace it with:

    #
    pap
    python

which translates to \t#\n\tpap\n\tpython.

I tried this with sed in a lot of ways, but it's not working, maybe because sed handles newlines differently. I tried:

sed -i "s/\t#\n\tpap/\t#\tpython\n\tpap/" /etc/freeradius/sites-available/default

...and many other variations, with no result. Any idea how I can do the replacement in this situation?

sed is an excellent tool for simple substitutions on a single line. It is NOT to be used on any problem that involves matching REs across multiple lines. The sed language constructs for that became obsolete in the mid-1970s when awk was invented. – Ed Morton, May 25 '14 at 13:27

score 12 · Accepted Answer · answered May 25 '14 at 00:23

Try this line with gawk:

awk -v RS="\0" -v ORS="" '{gsub(/\t#\n\tpap/,"yourNEwString")}7' file
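The trailing `7` in that one-liner is a pattern with no action: any non-zero number is true, so awk falls back to its default action, which is to print the record. A minimal sketch of just that idiom, using the default line-by-line record separator and a made-up two-line input:

```shell
# Hypothetical sample input; with the default RS, awk sees one record per line.
# gsub() edits the record in place, and the bare 7 (always true) prints it.
awk_out=$(printf 'pap\nother\n' | awk '{ gsub(/pap/, "python") } 7')
printf '%s\n' "$awk_out"
```

Any other non-zero constant (commonly `1`) would behave the same way.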

If you want to let sed handle newlines, you have to read the whole file first:

sed ':a;N;$!ba;s/\t#\n\tpap/NewString/g' file
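To see what the individual commands do, here is a small annotated sketch. It assumes GNU sed (for `\t` in the regex and `\n` in the replacement); the sample input and the `&\n\tpython` replacement are illustrative choices, not part of the answer above.

```shell
# :a    define a label named "a"
# N     append the next input line to the pattern space, joined by \n
# $!ba  if this is not the last line, branch back to label "a"
# s///  only now, with the whole input in the pattern space, can the
#       regex see the embedded \n between "\t#" and "\tpap"
slurp_out=$(printf 'before\n\t#\n\tpap\nafter\n' |
  sed ':a;N;$!ba;s/\t#\n\tpap/&\n\tpython/g')
printf '%s\n' "$slurp_out"
```

Because the loop accumulates every line before substituting, memory use grows with file size, which is the trade-off the streaming approaches elsewhere on this page avoid.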

Kent


Here's sed without reading the whole file first: `sed -e ':b; /^\t#$/ { N; s/\n\tpap$/&\n\tpython/; te; P; D; }; :e'` – that other guy May 25 '14 at 00:30

+1 for the awk solution. I'm ignoring the sed one so I can +1 the awk one :-). BTW I recently discovered that there are people parsing text files that contain `NUL` chars, so using `RS='\0'` doesn't work for them. I've therefore switched to using `RS='^$'` by default and stating that it's gawk-only. `^$` works because those 2 chars match the start and end of a string, and gawk treats an input file as a string to be split into records, so `RS='^$'` can only match an empty string/file and cannot occur in a file with any content. If they can't get gawk, then `RS='\0'` is next, with the caveat I mentioned. – Ed Morton May 25 '14 at 13:31

What is the `7` for after the end of the `gsub`? – Liron Yahdav Jun 08 '18 at 01:37

@LironYahdav `print` – Kent Jun 08 '18 at 08:48
Unfortunately, there is no explanation of why this works better. – trapicki Jul 17 '18 at 17:53
@trapicki I don't follow you, and I don't understand the downvote either. – Kent Jul 18 '18 at 07:56
The answer seems to work, but I do not understand why this works, what it does in detail. An explanation of your solution would increase my knowledge. This way I can only copy the solution and still have the problem with the next challenge. – trapicki Jul 23 '18 at 12:17

@trapicki The OP wrote sed code in the question, so I assume the OP can understand what I wrote in my answer. Whether it's awk or sed, you need to know what the basic commands mean, like N, s/.../.../, ba, etc. Those explanations are in sed's man page, and the same goes for awk. I don't have to repeat the man page here. – Kent Aug 21 '18 at 08:10

score 6 · Answer 2 · answered May 25 '14 at 08:45

This might work for you (GNU sed):

sed '/^\t#$/{n;/^\tpap$/{p;s//\tpython/}}' file

If a line contains only \t#, print it; then, if the next line contains only \tpap, print it too, replace that line with \tpython, and print that.
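A quick way to check the behavior is to feed the command a few lines from printf. The sketch below assumes GNU sed and uses made-up surrounding lines ("before" and "after") for illustration:

```shell
# n    prints the "\t#" line and loads the next line into the pattern space
# p    prints "\tpap" explicitly
# s//  reuses the last regex (/^\tpap$/) and turns the line into "\tpython",
#      which the end-of-cycle autoprint then emits
stream_out=$(printf 'before\n\t#\n\tpap\nafter\n' |
  sed '/^\t#$/{n;/^\tpap$/{p;s//\tpython/}}')
printf '%s\n' "$stream_out"
```

One case to be aware of: because `n` consumes the line after a `\t#`, an input with two consecutive `\t#` lines followed by `\tpap` would pass through unchanged.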

potong


+1 - clever. Was puzzled by the `p` at first, given that `n` normally prints the newly loaded line, until I realized that your `s` command effectively deletes the line by not referring to it in the replacement string; in other words: `p;s//\tpython/` is the equivalent of: `s//&\n\tpython/`. – mklement0 May 25 '14 at 15:06

mklement0 · Answer 3 · 2014-05-25T15:44:38.327

A GNU sed solution that doesn't require reading the entire file at once:

sed '/^\t#$/ {n;/^\tpap$/a\\tpython'$'\n''}' file

/^\t#$/ matches comment-only lines (matching \t# exactly), in which case (only) the entire {...} expression is executed:
- n loads and prints the next line.
- /^\tpap$/ matches that next line against \tpap exactly.
- in case of a match, a\\tpython will then output \n\tpython before the following line is read - note that the spliced-in newline ($'\n') is required to signal the end of the text passed to the a command (you can alternatively use multiple -e options).

(As an aside: with BSD sed (OS X), it gets cumbersome, because:

- Control characters such as \n and \t aren't directly supported and must be spliced in as ANSI C-quoted literals.
- Leading whitespace is invariably stripped from the text argument to the a command, so a substitution approach must be used: s//&\'$'\n\t'python'/ replaces the pap line with itself plus the line to append:
```
sed '/^'$'\t''#$/ {n; /^'$'\t''pap$/ s//&\'$'\n\t'python'/;}' file
```

)

An awk solution (POSIX-compliant) that also doesn't require reading the entire file at once:

awk '{print} /^\t#$/ {f=1;next} f && /^\tpap$/ {print "\tpython"} {f=0}' file

- {print}: prints every input line.
- /^\t#$/ {f=1;next}: sets flag f (for 'found') to 1 if a comment-only line (matching \t# exactly) is found and moves on to the next line.
- f && /^\tpap$/ {print "\tpython"}: if a line is preceded by a comment line and matches \tpap exactly, outputs the extra line \tpython.
- {f=0}: resets the flag that indicates a comment-only line.
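The flag logic can be exercised with a short test input. In this sketch the "before" and "after" lines are made up for illustration; the awk program itself is the one from the answer:

```shell
# The "\t#" line sets f and skips the reset via next; the following
# "\tpap" line sees f=1, so "\tpython" is printed right after it.
flag_out=$(printf 'before\n\t#\n\tpap\nafter\n' |
  awk '{print} /^\t#$/ {f=1;next} f && /^\tpap$/ {print "\tpython"} {f=0}')
printf '%s\n' "$flag_out"
```

Since only one flag is carried between lines, the file is processed as a stream and never held in memory as a whole.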

mklement0 · Answer 4 · 2014-05-25T15:11:08.090

A couple of pure bash solutions:

Concise, but somewhat fragile, using parameter expansion:

in=$'\t#\n\tpap\n' # input string

echo "${in/$'\t#\n\tpap\n'/$'\t#\n\tpap\n\tpython\n'}"

Parameter expansion only supports patterns (wildcard expressions) as search strings, which limits the matching abilities:
- Here the assumption is made that pap is followed by \n, whereas no assumption is made about what precedes \t#, potentially resulting in false positives.
- If the assumption could be made that \t#\n\tpap is always enclosed in \n, echo "${in/$'\n\t#\n\tpap\n'/$'\n\t#\n\tpap\n\tpython\n'}" would work robustly; otherwise, see below.

Robust, but verbose, using the `=~` operator for regex matching:

The =~ operator supports extended regular expressions on the right-hand side and thus allows more flexible and robust matching:

in=$'\t#\n\tpap' # input string 

# Search string and string to append after.
search=$'\t#\n\tpap'
append=$'\n\tpython'

out=$in # Initialize output string to input string.
if [[ $in =~ ^(.*$'\n')?("$search")($'\n'.*)?$ ]]; then # perform regex matching
    out=${out/$search/$search$append} # replace match with match + appendage
fi

echo "$out"

score 0 · Answer 5 · answered Oct 21 '22 at 15:18

You can translate the \n character to another one, apply sed, and then apply the reverse translation. If tr is used, the substitute must be a 1-byte character, for instance \v (vertical tab, almost unused nowadays).

cat FILE|tr '\n' '\v'|sed 's/\t#\v\tpap/&\v\tpython/'|tr '\v' '\n'|sponge FILE

or, without sponge:

cat FILE|tr '\n' '\v'|sed 's/\t#\v\tpap/&\v\tpython/'|tr '\v' '\n' >FILE.bak && mv FILE.bak FILE
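Note that after the first tr the whole file is a single line as far as sed is concerned, so a plain `s///` replaces only the first occurrence. The sketch below assumes GNU sed (for `\t` and `\v` escapes) and adds the `g` flag, which is my addition rather than part of the commands above, to handle a made-up input with two occurrences:

```shell
# Newlines become \v, sed edits the single resulting "line", and the
# second tr restores the newlines. The g flag covers every occurrence.
tr_out=$(printf '\t#\n\tpap\nmid\n\t#\n\tpap\n' |
  tr '\n' '\v' |
  sed 's/\t#\v\tpap/&\v\tpython/g' |
  tr '\v' '\n')
printf '%s\n' "$tr_out"
```

This only works if the file itself contains no vertical tab characters, since they would be turned into newlines by the final tr.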
