
I am trying to use sed to replace a few thousand strings I have.

I have strings like ('app','model.whatever.id') or ('app','model.whatever.whateveragain.status') or ('app','model.whatever.type').

I need to rewrite all of these strings so they look like this:

('app','model.id')
('app','model.status')
('app','model.type')

A few notes: I only need to match strings that start with model. or whatevermodel. The middle can contain one or more chunks, and I need to retain the final piece of information (id, status, etc.).

The code I currently have is:

find /var/www/html/test2 -type f -print0 | xargs -0 sed -i '/.*model\..*\./{s//model./g}' 

This seems to work for most examples, but in the case of ('app','model.whatever.type'). the final full stop outside the parentheses causes an issue: the match runs on to that full stop, so the closing parenthesis is removed. I have instances where the full stop occurs 350 characters later, so large chunks of the lines are being removed.

Forgive me, as regex is not my strong suit. I have attempted to use the following, but I am not getting the desired result. It was meant to match the last occurrence of a full stop before a parenthesis.

find /var/www/html/test2 -type f -print0 | xargs -0 sed -i '/model\..*(?:(?!^.:[ ])[\s\S])*\\)/{s//model./g}'

Can anyone point me in the right direction? I feel I'm a few tweaks away from what I need.

The Humble Rat
  • Are you sure your sed implementation supports look-around assertions? – choroba Feb 19 '16 at 11:04
  • @choroba that's a great point. The thought never even occurred to me. Looking here http://stackoverflow.com/questions/12176026/whats-wrong-with-my-lookahead-regex-in-linux-sed it seems you are correct that sed doesn't actually have this capability. Feel free to place this as an answer as it seems I will need to use perl. – The Humble Rat Feb 19 '16 at 11:10

1 Answer

I don't know of any sed implementation that supports look-around assertions.

But it seems you don't need them. I'm getting the expected output with a much simpler regex:

sed -e 's/model\.[^'\'']*\./model./'

or

sed -e "s/model\.[^']*\./model./"
sed -E 's/(model\.)[^'\'']*\./\1/'
sed -E "s/(model\.)[^']*\./\1/"
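As a rough check, the commands can be run against the sample strings from the question. This is only a sketch: the variable names and the trailing text on the last sample line are made up for the illustration, and it reads from a pipe instead of editing files in place.

```shell
# Sample lines from the question; "Trailing text." is a made-up stand-in
# for whatever follows the closing parenthesis in the real files.
input="('app','model.whatever.id')
('app','model.whatever.whateveragain.status')
('app','model.whatever.type'). Trailing text."

# Basic regular expression form (double-quoted variant from above).
basic=$(printf '%s\n' "$input" | sed -e "s/model\.[^']*\./model./")

# Extended regular expression form with a capture group.
extended=$(printf '%s\n' "$input" | sed -E "s/(model\.)[^']*\./\1/")

printf '%s\n' "$basic"
# ('app','model.id')
# ('app','model.status')
# ('app','model.type'). Trailing text.
```

Because [^'] cannot cross the closing quote, the full stop after the parenthesis is never reached. Note that without the g flag only the first match on each line is rewritten.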

Explanation of the tricky part:

  • [ starts a character class.
  • ^ negates the class.
  • ' ends the single quoted string.
  • \' the literal quote. The shell will remove the backslash.
  • ' starts the quoted string again.
  • ] closes the class.
  • * repeats the class zero or more times.

So, it's there only to solve shell quoting. What sed gets is the same as with the double-quoted string above.
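One way to see this is to let the shell print both forms before sed ever gets involved. A minimal sketch, with the variable names single and double invented for the illustration:

```shell
# Single-quoted form: close the quote, add an escaped ', reopen the quote.
single='s/model\.[^'\'']*\./model./'

# Double-quoted form: the ' needs no special treatment here.
double="s/model\.[^']*\./model./"

printf '%s\n' "$single"
printf '%s\n' "$double"
# Both lines print: s/model\.[^']*\./model./
```

The shell has already removed the quoting by the time sed runs, so both variables hold the same script.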

choroba
  • Even better. Many thanks, I have just tested it and it works perfectly. I mostly understand what is occurring here, but I'm not 100% on this part `[^'\'']*`; if you have a second to explain, that would be great. Thank you again for your quick response, it really is much appreciated. – The Humble Rat Feb 19 '16 at 11:21
  • I'll just point out, in case anyone is stuck on the last two, that `-E` is the BSD sed option to use Extended RE, equivalent to GNU sed's `-r` option. Recent FreeBSD versions consider these options synonymous, but you'll likely use `-E` in OSX. Oh, and @TheHumbleRat, [this](http://stackoverflow.com/a/9712555/1072112) will help explain the quotes. :) – ghoti Feb 19 '16 at 11:23
  • @TheHumbleRat: explained. – choroba Feb 19 '16 at 11:29
  • @choroba Thanks again for your help and detailed information. Certainly teaching a man to fish. – The Humble Rat Feb 19 '16 at 11:44