
I would like to insert multiple lines from a text file before a particular piece of text. I would like to use a regex to select that text, which looks like this:

//**insert_yannyann*//

The marker //**insert_yannyann*// is in b.txt, which looks like this:

...

//**insert_yannyann*//

...

a.txt looks like this:

1234
5678
9101

To insert the contents of a.txt before that pattern in b.txt, I tried this sed command in bash on Ubuntu 18.04:

sed -n -i -e '\/\/**insert_yannyann*\/\/ /r a.txt' -e 1x -e '2,${x;p}' -e '${x;p}' b.txt

I also tried another regex pattern:

sed -n -i -e '//?\s*\*[(?=.*\insert_yannyann\b)]*?\*\s*//? /r a.txt' -e 1x -e '2,${x;p}' -e '${x;p}' b.txt

But sed always reports an error about the regex I used.

I want b.txt to end up like this:

...

1234
5678
9101
//**insert_yannyann*//

...

I have checked with some online regex tools that both of these regexes are correct, but I don't understand why sed reports an error for them:

\/\/**insert_yannyann*\/\/

//?\s*\*[(?=.*\insert_yannyann\b)]*?\*\s*//?

I'm not sure whether regex rules are the same across different programming languages and tools. Could somebody explain why these are not correct?

YannYann
  • Possible duplicate of [using sed to insert file content into a file BEFORE a pattern](https://stackoverflow.com/questions/26141347/using-sed-to-insert-file-content-into-a-file-before-a-pattern) – Sundeep Dec 11 '18 at 13:25
  • Regarding `online regex tools`: they are not suitable for `sed` because the regular expression features and syntax are very different. For example, lookaround is not supported by sed. – Sundeep Dec 11 '18 at 13:26
  • Oh, I see. So I had better try another method to achieve the same effect? – YannYann Dec 11 '18 at 13:28
  • There are various ways suggested in the duplicate question mentioned above. I feel `sed -e '\,//\*\*insert_yannyann\*//, { r a.txt' -e 'N}'` is a good choice if you know the line to be matched cannot be the last line in the file. – Sundeep Dec 11 '18 at 13:28
  • Hmm, OK. But I found it is better to use {} to separate the file name from the regex rather than using "/xxx/xxx/". Is that right? – YannYann Dec 11 '18 at 13:33
  • Also, could you please explain what `-e 'N}'` does? – YannYann Dec 11 '18 at 14:07
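
The sed one-liner suggested in the comments above can be tried on small sample files. The following is only a sketch, assuming GNU sed (as shipped with Ubuntu); the temporary directory and the sample lines `first line` and `last line` are made up for illustration. The `\,...,` form picks a comma as the address delimiter so the slashes need no escaping, `{ ... }` groups the commands that run on the matching line, and `N` reads the next input line, which makes sed flush the file queued by `r` before the marker line itself is printed.

```shell
# Hypothetical sample files mirroring the question, created in a temp directory.
workdir=$(mktemp -d)
cd "$workdir"
printf '%s\n' 1234 5678 9101 > a.txt
printf '%s\n' 'first line' '//**insert_yannyann*//' 'last line' > b.txt

# \,...,   comma as address delimiter, so "/" is an ordinary character
# \*       a literal asterisk (an unescaped * would be a quantifier)
# r a.txt  queue a.txt for output; the file name runs to the end of the -e chunk
# N        append the next input line; reading it flushes the queued file first
sed_out=$(sed -e '\,//\*\*insert_yannyann\*//,{r a.txt' -e 'N}' b.txt)
printf '%s\n' "$sed_out"
```

As noted in the comment, this relies on the marker not being the last line of b.txt, because `N` needs a following line to read.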

1 Answer


Perl may not be an option for you, but it's worth a try. With Perl you can say:

perl -0777 -ne 'if ($. == 1) {$replace = $_; next} s#(?=//\*\*insert_yannyann\*//)#$replace#g; print' a.txt b.txt > b_new.txt

Then b_new.txt holds:

...

1234
5678
9101
//**insert_yannyann*//

...

Explanations:

  • The -0777 option causes Perl to slurp each file whole, as a single record.
  • The Perl variable $. holds the input record number, which in slurp mode is equivalent to the input file number. With this value we can switch between the processing for a.txt and b.txt.
  • The $replace = $_ statement assigns the content of a.txt to the variable $replace.
  • The most important part is the substitution s#(?=//\*\*insert_yannyann\*//)#$replace#g. Perl regex supports a lookahead assertion with the (?=pattern) notation. Because a lookahead matches a position without consuming any text, the content is inserted just before the specified pattern.

Hope this helps.

EDIT

With AWK, you can do a similar thing:

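awk 'NR==FNR {replace = replace $0 RS; next}   # 1st file: collect a.txt
    {text = text $0 RS}                        # 2nd file: collect b.txt
    END {
        # text already ends with RS, so printf avoids an extra trailing newline
        printf "%s", gensub(/\/\/\*\*insert_yannyann\*\/\//, replace "&", "g", text)
    }' a.txt b.txt > b_new.txt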

The point is that the replacement string (the 2nd argument to gensub()) is a concatenation of replace, which holds the content of a.txt, and &, which represents the regex-matched string. Putting the variable replace before & inserts the content ahead of the matched pattern.
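
Since gensub() is a GNU awk extension, it may be unavailable where awk is a different implementation such as mawk. The following is a minimal sketch of a line-oriented alternative that avoids it, assuming the marker sits on a line of its own as in the question: it prints the content of a.txt before every line that contains the marker, rather than at the exact position of the match within the line. The sample files and the temporary directory are made up for illustration.

```shell
# Hypothetical sample files mirroring the question, created in a temp directory.
workdir=$(mktemp -d)
cd "$workdir"
printf '%s\n' 1234 5678 9101 > a.txt
printf '%s\n' 'first line' '//**insert_yannyann*//' 'last line' > b.txt

# 1st rule: while reading a.txt (NR==FNR), collect its lines into replace.
# 2nd rule: on a line of b.txt containing the marker, emit replace first.
# 3rd rule: print every line of b.txt unchanged.
awk_out=$(awk 'NR==FNR {replace = replace $0 RS; next}
    /\/\/\*\*insert_yannyann\*\/\// {printf "%s", replace}
    {print}' a.txt b.txt)
printf '%s\n' "$awk_out"
```

Unlike the gensub() version, this does not hold all of b.txt in memory, but it cannot insert text in the middle of a line.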

tshiono
  • Oh, I see. But I have one more question: do awk and grep also support lookaround, or can only Perl do that? – YannYann Dec 12 '18 at 05:18
  • Many other languages (PHP, Ruby, Python ...) and some tools (grep, find ...) support `lookaround in regex`. Although AWK does not support it, we can do the same thing. I've added an example to my answer. As a matter of fact, the `lookahead assertion` capability is not essential for my script above because it slurps all lines *before* applying the pattern matching with the regex. In the strict sense, `lookahead` in line-by-line processing would mean making a decision depending on the next (unread) line, which is not what my usage does. – tshiono Dec 12 '18 at 06:45
  • Oh, I got it! Sorry for my late reply; I have been busy with work recently. So under some strict conditions it is safer to use the "same pattern" method than the lookahead method. Thank you for teaching me this! – YannYann Dec 15 '18 at 03:23
  • Note that `gensub()` is a GNU awk extension. – jarno Jul 09 '20 at 21:55