
I have a regex that I would like to use with sed. I want sed because I need to batch-process a few thousand files, and my editor cannot handle that.

Find: "some_string":"ab[\s\S\n]+"other_string_

Replace: "some_string":"removed text"other_string_

Find matches everything between some_string and other_string_, including special characters such as , ; - or _, and Replace puts a warning in its place saying that the text was removed.

I tried combining the character classes [[:space:]] and [[:alnum:]], but that did not work.

Wiktor Stribiżew
User12547645

1 Answer


With the FreeBSD sed that ships with macOS, you can use

sed -i '' -e '1h;2,$H;$!d;g' -e 's/"some_string":"ab.*"other_string_/"some_string":"removed text"other_string_/g' file

The 1h;2,$H;$!d;g part collects the whole file in the hold space and copies it back into the pattern space on the last line, so that all line breaks are exposed to the regex. Then "some_string":"ab.*"other_string_ matches the text from "some_string":"ab up to the last occurrence of "other_string_ and replaces it with the RHS text.
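
To watch the slurp step work before modifying anything, here is a small sketch that runs the same two expressions without -i, so the result goes to standard output instead of back into the file. The sample file and its JSON-like content are invented for illustration.

```shell
# Hypothetical three-line sample; the content is made up for this demo.
tmp=$(mktemp)
printf '%s\n' '{"some_string":"abc,' 'more; text - here_' 'end"other_string_":1}' > "$tmp"

# 1h    : copy line 1 into the hold space
# 2,$H  : append every later line to the hold space
# $!d   : on every line except the last, discard and start the next cycle
# g     : on the last line, copy the hold space (the whole file) back
result=$(sed -e '1h;2,$H;$!d;g' \
    -e 's/"some_string":"ab.*"other_string_/"some_string":"removed text"other_string_/g' "$tmp")

printf '%s\n' "$result"
# prints: {"some_string":"removed text"other_string_":1}

rm -f "$tmp"
```

Because nothing is written back, this form is a safe way to check the pattern against one file first.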

FreeBSD sed requires an argument after -i; the empty string in -i '' tells it to edit the file in place without keeping a backup copy.
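
For the batch case from the question, one possible approach is to let find hand the files to sed in groups. This is only a sketch: the throwaway directory, the *.txt filter and the sample contents are assumptions. It uses the attached-suffix form -i.bak rather than -i '', since that spelling should be accepted by both FreeBSD and GNU sed, and it keeps a backup of every file it touches.

```shell
# Work in a throwaway directory so the demo touches no real files.
dir=$(mktemp -d)
printf '%s\n' '"some_string":"abc' 'x;y"other_string_' > "$dir/a.txt"
printf '%s\n' '"some_string":"ab-1_' '"other_string_' > "$dir/b.txt"

# "{} +" passes many file names to each sed invocation; with -i, sed
# handles every file separately, so line 1 and $ are per file.
find "$dir" -type f -name '*.txt' -exec \
    sed -i.bak -e '1h;2,$H;$!d;g' \
    -e 's/"some_string":"ab.*"other_string_/"some_string":"removed text"other_string_/g' {} +

a_out=$(cat "$dir/a.txt")          # edited file
b_out=$(cat "$dir/b.txt")          # edited file
a_backup=$(cat "$dir/a.txt.bak")   # untouched original

printf '%s\n' "$a_out" "$b_out"
rm -rf "$dir"
```

Once the results look right, the .bak files can be deleted, or the backup can be skipped by going back to -i '' on FreeBSD sed.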

By the way, if you decide to use perl, you can use the -0777 option to slurp the whole file, together with the s modifier (which makes . match any char, including line break chars), and use something like

perl -i -0777 -pe 's/"some_string":"\Kab.*(?="other_string_)/removed text/gs' file

Here,

  • "some_string":" - matches literal text
  • \K - drops the text matched so far from the match, so it is kept in the output rather than replaced
  • ab - matches ab
  • .* - any zero or more chars, as many as possible, so the match runs up to the last "other_string_ in the file
  • .*? (an alternative to .*) - any zero or more chars, as few as possible, so the match stops at the first "other_string_
  • (?="other_string_) - a positive lookahead that requires "other_string_ immediately on the right without adding it to the match value.
Wiktor Stribiżew