How can I express this regex with sed?

Question

I have a regex that I would like to use with sed, because I need to batch-process a few thousand files and my editor cannot handle that.

Find: "some_string":"ab[\s\S\n]+"other_string_

Replace: "some_string":"removed text"other_string_

The Find pattern matches everything between some_string and other_string_, including line breaks and special characters such as , ; - or _, and the replacement swaps it for a notice that the text was removed.

I tried combining the character classes [[:space:]] and [[:alnum:]], but that did not work.

See [Sed: replacing newlines with “-z”?](https://stackoverflow.com/questions/52538158/sed-replacing-newlines-with-z) — Wiktor Stribiżew, Feb 18 '21 at 11:23
I get an error saying that z is an illegal option. Does this not work on Mac? — User12547645, Feb 18 '21 at 11:23
So, you have a FreeBSD sed, try `sed -e '1h;2,$H;$!d;g' -e 's/"some_string":"ab.*"other_string_/"some_string":"removed text"other_string_/g' file`. Is there any more text on the *same line* after `other_string_`? — Wiktor Stribiżew, Feb 18 '21 at 11:26
Try: `perl -i -0777 -pe 's/(?s)("some_string":)"ab.+"(other_string_)/$1"removed text"$2/g' file` — anubhava, Feb 18 '21 at 11:30
@WiktorStribiżew works with `sed -e -I '' ...` Could you post an answer so I can give you an upvote for it? — User12547645, Feb 18 '21 at 11:32

Wiktor Stribiżew · Accepted Answer · 2021-02-18T11:38:15.960

With the FreeBSD sed that ships with macOS, you can use

sed -i '' -e '1h;2,$H;$!d;g' -e 's/"some_string":"ab.*"other_string_/"some_string":"removed text"other_string_/g' file

The 1h;2,$H;$!d;g part collects the whole file in the pattern space (by way of the hold space) so that all line breaks are exposed to the regex. Then "some_string":"ab.*"other_string_ matches the text from "some_string":"ab through the last occurrence of "other_string_, and the match is replaced with the RHS text.
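To see what each of the four commands does, here is a minimal sketch that runs the same script without -i on a throwaway file. The two-line sample content is made up for illustration; only the sed script itself comes from the answer.

```shell
# Hypothetical sample input: the value to remove spans two lines.
sample=$(mktemp)
printf '%s\n' '{"some_string":"ab first line,' 'second; line - with_chars"other_string_":1}' > "$sample"

# 1h     copy line 1 into the hold space
# 2,$H   append every later line (preceded by a newline) to the hold space
# $!d    on every line except the last, delete the pattern space and start the next cycle
# g      on the last line, copy the accumulated hold space back into the pattern space
result=$(sed -e '1h;2,$H;$!d;g' \
  -e 's/"some_string":"ab.*"other_string_/"some_string":"removed text"other_string_/' "$sample")

printf '%s\n' "$result"
rm -f "$sample"
```

Because the substitution only runs once the last line has been read, the .* can cross the line break, and the two input lines collapse into the single line {"some_string":"removed text"other_string_":1}.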

You need to use -i '' with FreeBSD sed to edit the file in place without keeping a backup copy (the empty string is the backup extension).
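For the batch run over many files, one possible sketch avoids -i altogether and writes through a temporary file, so the same loop runs under both FreeBSD/macOS sed and GNU sed. The helper name strip_text and the *.json file pattern are assumptions for illustration, not something stated in the question.

```shell
# Hypothetical helper: rewrite one file via a temporary copy instead of -i.
strip_text() {
  tmp=$(mktemp) || return 1
  if sed -e '1h;2,$H;$!d;g' \
       -e 's/"some_string":"ab.*"other_string_/"some_string":"removed text"other_string_/' \
       "$1" > "$tmp"; then
    cat "$tmp" > "$1"   # overwrite the contents, keeping the original file's permissions
  fi
  rm -f "$tmp"
}

# Assumed layout: the files to process are the *.json files in the current directory.
for f in ./*.json; do
  if [ -f "$f" ]; then
    strip_text "$f"
  fi
done
```

Each file is processed by its own sed invocation, so $ always refers to the last line of that file. It is worth trying this on a copy of the data first, since the original contents are overwritten.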

By the way, if you decide to use perl, you can use the -0777 option to slurp the whole file, together with the s modifier (which makes . match any char, including line break chars), and use something like

perl -i -0777 -pe 's/"some_string":"\Kab.*(?="other_string_)/removed text/gs' file

Here,

"some_string":" - matches the literal text
\K - drops the text matched so far from the match, so it is not replaced
ab - matches ab
.* - matches any zero or more chars, as many as possible (greedy)
OR .*? - matches any zero or more chars, as few as possible (lazy)
(?="other_string_) - a positive lookahead that requires "other_string_ immediately on the right without adding it to the match value.
