I need a general solution, if possible using sed, to find a multiline block of text. The text is not known in advance and may contain special characters, so I cannot escape them by hand. The block must be treated as raw text.

Then I need to insert another block of text into the file after it. That block may also contain special characters that are not known in advance.

Here is an example. Original file which contains several qq§$<>ui lines:

line1
line2
qq§$<>ui
klfd</de>
qq§$<>ui
line gg
qq§$<>ui
line aaa
qq§$<>ui
line bbb
lastButOneLine
lastLine

Text to search:

qq§$<>ui
klfd</de>

Text to add after it:

qq§$<>ui
another2ndLine</de>combination

The result:

line1
line2
qq§$<>ui
klfd</de>
qq§$<>ui
another2ndLine</de>combination
qq§$<>ui
line gg
qq§$<>ui
line aaa
qq§$<>ui
line bbb
lastButOneLine
lastLine
Alexandr
  • Welcome to Stack Overflow. [SO is a question and answer page for professional and enthusiast programmers](https://stackoverflow.com/tour). Please add your own code to your question. You are expected to show at least the amount of research you have put into solving this question yourself. – Cyrus Sep 20 '20 at 09:05
  • I'm not sure how you get the text, and why does it have to be multiline text? Try checking here:- https://stackoverflow.com/questions/15559359/insert-line-after-first-match-using-sed – Daniel Sep 20 '20 at 09:05

3 Answers

Assume ip.txt is the input file, f1 has the string to search for and f2 has the string to be added.

With perl (this worked for the given sample; I'm not sure whether some other Unicode characters can cause issues):

a="$(< f1)" b="$(< f2)" perl -0777 -pe 's/\Q$ENV{a}\E\K/\n$ENV{b}/g' ip.txt

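\Q and \E prevent the search text from being interpreted as regex metacharacters. \K keeps the matched text in place, so only the newline and the contents of f2 are added after it.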



With GNU sed, assuming input doesn't have ASCII NUL characters.

$ # escape all BRE metacharacters
$ # replace literal newlines with \n
$ sed -z 's#[[^$*.\/]#\\&#g; s/\n/\\n/g' f1
qq§\$<>ui\nklfd<\/de>\n

$ # escape all replacement section metacharacters for f2
$ # and add trailing \ for literal newlines
$ sed 's:[\\/&]:\\&:g;$!s/$/\\/' f2
qq§$<>u$i\
another2ndLine<\/de>combination

With that working, you can then use sed -z again for actual modification:

$ search="$(sed -z 's#[[^$*.\/]#\\&#g; s/\n/\\n/g' f1)"
$ repl="$(sed 's:[\\/&]:\\&:g;$!s/$/\\/' f2)"
$ sed -z 's/'"$search"'/&'"$repl"'\n/g' ip.txt
line1
line2
qq§$<>ui
klfd</de>
qq§$<>u$i
another2ndLine</de>combination
qq§$<>ui
line gg
qq§$<>ui
line aaa
qq§$<>ui
line bbb
lastButOneLine
lastLine


With ripgrep:

rg -N --passthru -UF "$(< f1)" -r '$0'$'\n'"$(sed 's/\$/$$/g' f2)" ip.txt
  • -N to prevent line numbers in output
  • --passthru to allow all input lines to be printed, whether or not they match the search condition
  • -UF enable multiline match and fixed string match
  • "$(< f1)" input string to search, note that trailing newline will be removed
  • -r '$0'$'\n'"$(sed 's/\$/$$/g' f2)" replacement string
    • $0 the string that was matched
    • $'\n' to add trailing newline that was removed earlier
    • "$(sed 's/\$/$$/g' f2)" content of f2 with $ escaped as $$

See my blog post for more details about search and replacement with the rg command.

Sundeep

@Sundeep, @stevesliva, thank you for your effort. Both solutions are a bit complex for me, and I needed something less complicated.

If we treat the solution as a function, then the simplest one should accept only 3 parameters:

  1. file path,
  2. search block of text,
  3. insert block of text.

As a consumer/client, I do not want to know HOW the problem is solved, only WHAT to pass to the solution.

Regular expressions are a very powerful tool, but if you do not work with them regularly, they take more time to use and maintain.

I've created a small Java application. For my environment, running Java is not an issue. Here is how it could be invoked:

java -jar insert-unique-after.jar \
  --path some.txt \
  --insert-after "line1
  line2
    line3" \
  --insert-text "line4
  line5
  line6"

It is simple and clear. And that was my choice.

For those who want to try it, here is a git project: insert-after and a built executable jar file: insert-unique-after.jar

I am sure that it is also very simple to implement in any modern programming language, without requiring a JRE to be installed.
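As a rough illustration of that three-parameter interface without a JRE, here is a minimal sketch in shell and awk. The function name insert_after and the environment variable names SEARCH and INSERT are made up for this example, and this is not how the jar is implemented. It matches literally with index(), so nothing needs escaping. It assumes the blocks are passed without a trailing newline, requires the search block to end at a line end (it may still start mid-line), and inserts after every occurrence.

```shell
# insert_after FILE SEARCH_BLOCK INSERT_BLOCK
# Prints FILE with INSERT_BLOCK added after each literal SEARCH_BLOCK.
insert_after() {
  # The blocks travel through the environment: awk does not process
  # backslash escapes in ENVIRON values, unlike values given with -v.
  SEARCH="$2" INSERT="$3" awk '
    { content = content $0 "\n" }        # read the whole file into memory
    END {
      s   = ENVIRON["SEARCH"] "\n"       # search block plus its line end
      ins = ENVIRON["INSERT"] "\n"       # inserted block plus its line end
      out = ""
      # index() is a plain substring search, so no character is special.
      while ((i = index(content, s)) > 0) {
        out     = out substr(content, 1, i + length(s) - 1) ins
        content = substr(content, i + length(s))
      }
      printf "%s", out content
    }' "$1"
}

# Usage (prints the modified text; redirect to a new file to keep it):
# insert_after some.txt "$search_block" "$insert_block" > some.txt.new
```

Because the whole file is held in memory, this suits configuration-sized files rather than very large ones.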

Alexandr

I tend to dislike complicated regexes. This isn't simple either, but you can pipe grep results into a simple sed command that does the insertion.

The file to search is file; the lines to add are in add.txt.

First, find all of the second lines, with prior line and line numbers included in output:

$ grep -nFB1 'klfd</de>' file
3-qq§$<>ui
4:klfd</de>

Second, keep only the matches where the first search line directly precedes the second:

$ grep -nFB1 'klfd</de>' file | grep -FA1 -- '-qq§$<>ui'
3-qq§$<>ui
4:klfd</de>

Third, turn that output into a simple sed r command for each matching line:

$ grep -nFB1 'klfd</de>' file | grep -FA1 -- '-qq§$<>ui' | sed -n '/^[0-9]*:/{s/:.*/r add.txt\n/;P}'
4r add.txt

Finally, pipe the generated command into sed -f- file, which makes sed read its commands from stdin and apply them to file:

$ echo ADD THIS > add.txt

$ grep -nFB1 'klfd</de>' file | grep -FA1 -- '-qq§$<>ui' | sed -n '/^[0-9]*:/{s/:.*/r add.txt\n/;P}' | sed -f- file
line1
line2
qq§$<>ui
klfd</de>
ADD THIS
qq§$<>ui
line gg
qq§$<>ui
line aaa
qq§$<>ui
line bbb
lastButOneLine
lastLine
stevesliva