2

I'm having some trouble understanding how to use grep for an apparently simple task. I want to match a substring that appears in a lot of my files, but I want to ignore the cases where this substring is preceded by a letter or a number.

For example I have a bunch of files with lines like:

{ some word: ['bar-something', 'bar-somthing-else'] },
{ some text: ['bar-fab', 'bar-fab-foo', 'bar-eggs'] },
<bar-sometext>Hello World!</bar-sometext>
'bar-foobar-foo'
'bar-foo'

and I want to replace every occurrence of bar- with ket-, but only if bar- isn't preceded by a letter or number. For example,

'bar-foobar-foo'

should be changed to

'ket-foobar-foo'

but I'm having trouble because grep doesn't seem to be consistent with its own rules.

Let me explain.

The command:

git grep -l 'bar-' | xargs sed -i '' -e 's/bar-/ket-/g' 

almost works; the only problem is that it also changes a bar- that is preceded by letters or numbers:

'bar-foobar-foo' to 'ket-fooket-foo'

To test this before making the replacements, I'm only matching with grep. I expected that the command

grep -E '[^a-zA-Z0-9]bar-' a.file

would do the trick, but it also matches whatever special character precedes bar-. For example, it matches

<bar-

'bar-

\bar-

(I removed the rest of the text for simplicity; the above is what gets highlighted as the matched text) instead of only matching bar-. Why does it do that? When I wasn't excluding letters or numbers, grep wasn't matching these preceding special characters.

How can I replace only bar-, without matching anything else, while ignoring any case where this substring is preceded by a letter or number? My expected output for the example I gave is:

{ some word: ['ket-something', 'ket-somthing-else'] },
{ some text: ['ket-fab', 'ket-fab-foo', 'ket-eggs'] },
<ket-sometext>Hello World!</ket-sometext>
'ket-foobar-foo'
'ket-foo'

BTW, I'm using a Mac and I'm having trouble doing the replacements. The command

git grep -l 'bar-' | xargs sed -i '' -e 's/bar-/ket-/g'

works pretty well on my Mac in a terminal with oh-my-zsh, so I'd appreciate any answer that closely resembles the above command.

Thanks in advance.

RobertoH
  • You can run the sed command twice to fix this, and you don't need grep: `sed -i -e "s/'bar-/'ket-/g" ` `sed -i -e "s/` – Karthikeyan Sep 26 '20 at 09:45
  • Hi @Karthikeyan, that almost works (for this short example it only requires handling the backslash case), but I assume that I need to add a new line for each special case, for example a command like `sed -i -e "s/\bar-/\ket-/g" `. Now I need to figure out how to apply it to all my files, because there are thousands of them. The wildcard * produces this error message: `sed: in-place editing only works for regular files`. But thanks, your answer is surely an excellent starting point for building an answer that works for my intended case – RobertoH Sep 26 '20 at 10:19

2 Answers

2

Perhaps you should use another tool, one that supports lookbehind assertions.

perl -pi.bak -e 's/(?<![\p{L}\d])bar/ket/g' file.txt
  • -p runs the code on each input line, then prints the line,
  • -i.bak activates in-place editing; the original file.txt is backed up as file.txt.bak,
  • -e means that the next argument is Perl code (a one-liner), not the name of a Perl script file,
  • (?<! opens a negative lookbehind assertion,
  • \p{L} is any letter, and \d is any digit.

Inspired by https://stackoverflow.com/a/6995010/6632736.

Alexander Mashin
  • Note it is more efficient to join two single-char matching patterns into a character class, `(?<!\p{L}|\d)` => `(?<![\p{L}\d])`, to avoid extra backtracking at the start. – Wiktor Stribiżew Sep 26 '20 at 13:28
1

With GNU sed:

sed 's/\([^[:alnum:]]\)bar/\1ket/g' file

This is a sed substitution of the form 's/pattern/replacement/g', where g means globally (every match on a line, not just the first).

The matching pattern means: one non-alphanumeric character followed by "bar". The replacement is the character that was matched (\1) followed by ket. Whatever is enclosed in the escaped parentheses in the pattern can be reused in the replacement as \1, \2 and so on, up to \9.
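
One consequence of requiring a character before "bar" is that a "bar" at the very start of a line has nothing in front of it and is skipped. A possible variant, sketched here on the assumption that your sed accepts -E for extended regular expressions (current GNU and BSD/macOS versions should), adds ^ as an alternative inside the group. The variable names skipped and fixed are only for illustration.

```shell
line="bar-foo 'bar-foobar-foo'"

# Original pattern: the line-initial "bar" has no preceding character, so it stays.
skipped=$(printf '%s\n' "$line" | sed 's/\([^[:alnum:]]\)bar/\1ket/g')

# Variant: the group matches either the start of the line or one non-alphanumeric character.
fixed=$(printf '%s\n' "$line" | sed -E 's/(^|[^[:alnum:]])bar/\1ket/g')

printf '%s\n' "$skipped"   # bar-foo 'ket-foobar-foo'
printf '%s\n' "$fixed"     # ket-foo 'ket-foobar-foo'
```

Keep in mind that the captured character is consumed by the match. If the pattern itself ends in a non-alphanumeric character (bar- rather than bar), that character cannot also serve as the preceding character of the next match, so bar-bar- would become ket-bar-. A lookbehind does not have this limitation.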

You can do it in place, as in your example command (with any macOS-specific adjustments). Also, grep is not used for replacements; it only extracts text, and there is usually no reason to use it along with awk or sed.
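
A minimal sketch of an in-place run over many files, under a few assumptions: grep -rl stands in for git grep -l so that it also runs outside a repository, the scratch directory and file names are hypothetical, and -i.bak is used because an attached backup suffix should be accepted by both GNU and BSD/macOS sed (unlike a bare -i or -i ''). The pattern uses -E and adds ^ as an alternative so that "bar" at the start of a line is covered too.

```shell
# Scratch directory with two made-up files, so nothing real is touched.
dir=$(mktemp -d)
mkdir -p "$dir/sub"
printf '%s\n' "'bar-foobar-foo'" > "$dir/a.txt"
printf '%s\n' "bar-foo" "<bar-sometext>" > "$dir/sub/b.txt"

# List files containing "bar" (NUL-separated, so odd file names are safe)
# and let sed edit each one in place, keeping a .bak copy.
grep -rl --null 'bar' "$dir" | xargs -0 sed -i.bak -E 's/(^|[^[:alnum:]])bar/\1ket/g'

# Remove the backups once the result has been checked.
find "$dir" -name '*.bak' -delete

cat "$dir/a.txt" "$dir/sub/b.txt"
# 'ket-foobar-foo'
# ket-foo
# <ket-sometext>
```

Inside a git repository, git grep -lz 'bar' piped into xargs -0 should serve the same purpose as the grep -rl --null line.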

thanasisp
  • Hi @thanasisp, I used grep to apply the replacements to all my files in directories and subdirectories, recursively. Because I have thousands of files, I need a way to apply the changes to all of them. For some reason, your solution is not matching the case where a line starts with – RobertoH Sep 27 '20 at 07:40
  • You can do this recursively for all the files you want. `find` would do it, or a glob file argument. For example: `find ... | xargs sed -i ...` – thanasisp Sep 27 '20 at 07:43
  • Also I see it matches lines starting with `-`, if you have an example line that didn't match correctly, please update with this example. – thanasisp Sep 27 '20 at 07:58
  • Sorry, yesterday I edited my comment with the right example, but apparently it wasn't updated. The cases that are not being matched are: `bar-somthing` `bar-another-thing`; that is, if 'bar' starts the word, the above solution skips those cases. Also, I would like to match the cases where the substring `bar` is preceded by a blank space. For example, I would like to replace `someTex moretext barSomeWord` with `someTex moretext ketSomeWord` – RobertoH Sep 27 '20 at 19:06
  • Sorry I have no time now to follow any changes in this question, good luck and have a nice day! – thanasisp Sep 27 '20 at 19:28