
I have this regex that works fine enough for my purposes for identifying emails in CSVs within a directory using grep on Mac OS X:

grep --no-filename -E -o "\b[a-zA-Z0-9.-]+@[a-zA-Z0-9.-]+\.[a-zA-Z0-9.-]+\b" *

I've tried to get this working with sed so that I can replace the emails with foo@bar.baz:

sed -E -i '' -- 's/\b[a-zA-Z0-9.-]+@[a-zA-Z0-9.-]+\.[a-zA-Z0-9.-]+\b/foo@bar.baz/g' *

However, I can't seem to get it to work. Admittedly, sed and regex are not my strong points. Any ideas?

foobar0100

2 Answers


The sed in OS X is broken. Replace it with GNU sed, which you can install with Homebrew (do not use sudo; Homebrew refuses to run as root):

brew install gnu-sed

Homebrew installs it as gsed by default, so use this for the substitution:

gsed -E -i 's/\b[a-zA-Z0-9.-]+@[a-zA-Z0-9.-]+\.[a-zA-Z0-9.-]+\b/foo@bar.baz/g' *

Reference

rock321987
  • "Broken" how? The example you link to shows somebody using incorrect syntax and then concluding that his sed is "broken". – Benjamin W. Apr 03 '16 at 05:33
  • @BenjaminW. I googled for `unterminated substitute in regular expression` and found that link. The same `sed` command works fine on Ubuntu. – rock321987 Apr 03 '16 at 05:37
  • It's not because it's "broken", it's because they use different dialects of the `sed` scripting language. *BSD (and thus OSX) is closer to the original and to POSIX, whereas GNU `sed` has a large number of nonstandard extensions. Not supporting those extensions is not "broken"; if anything, a script which requires those extensions is. – tripleee Apr 03 '16 at 06:09
  • @tripleee Feel free to edit my answer. I don't know anything about OSX `sed`. I thought it was a standard error. – rock321987 Apr 03 '16 at 06:12

You seem to assume that grep and sed support the same regex dialect, but that is not necessarily, or even usually, the case.
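As a rough illustration of the dialect issue, here is a sketch that stays within POSIX extended regular expressions. `\b` is not part of POSIX ERE, so whether a given sed accepts it depends on the implementation; this version simply drops it. It also avoids `-i`, whose argument syntax differs between BSD and GNU sed, by writing to a temporary file. The directory, file name, and sample contents are made up for the demo, and it assumes your sed accepts `-E` (both current BSD and GNU sed do).

```shell
# Sample data standing in for one of the CSVs (name and contents are made up).
workdir=$(mktemp -d)
printf 'id,email\n1,alice@example.com\n2,bob.smith@mail.example.org\n' > "$workdir/sample.csv"

# Same pattern as in the question, minus the \b anchors; POSIX ERE only.
pattern='[a-zA-Z0-9.-]+@[a-zA-Z0-9.-]+\.[a-zA-Z0-9.-]+'

# Write to a temporary file and move it into place instead of relying on -i.
sed -E "s/$pattern/foo@bar.baz/g" "$workdir/sample.csv" > "$workdir/sample.csv.tmp" &&
  mv "$workdir/sample.csv.tmp" "$workdir/sample.csv"

result=$(cat "$workdir/sample.csv")
printf '%s\n' "$result"
```

Without the `\b` anchors the match can swallow a trailing `.` or `-` that directly follows an address, so check the output on a copy of your data before running it on the real files.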

If you want a portable solution, you could easily use Perl for this, which however supports yet another regex dialect...

perl -i -p -e 's/\b[a-zA-Z0-9.-]+@[a-zA-Z0-9.-]+\.[a-zA-Z0-9.-]+\b/foo\@bar.baz/g' *

For a bit of an overview of regex dialects, see https://stackoverflow.com/a/11857890/874188

Your regex is also fairly rough (for example, it does not handle valid local-part characters such as `_` and `+`), but I understand that is sort of beside the point here.

tripleee