145

It looks like the new version of OS X no longer supports grep -P and as such has made some of my scripts stop working, for example:

var1=`grep -o -P '(?<=<st:italic>).*(?=</italic>)' file.txt`

I need to capture grep's result to a variable and I need to use zero-width assertions, as well as \K:

var2=`grep -P -o '(property:)\K.*\d+(?=end)' file.txt`

Any alternatives would be greatly appreciated.

Matthias Braun
kugyousha
  • 8
    how about installing gnu grep? – Kent May 20 '13 at 21:06
  • Are you sure it's the `-P`? Mine has it. – Kevin May 20 '13 at 21:20
  • 6
    @Kevin It was removed in 10.8. – Lri May 21 '13 at 17:08
  • @LauriRanta I have 10.8... Interestingly, it's still in the usage but actually using it doesn't work – Kevin May 21 '13 at 17:43
  • Cannot install anything on these machines unfortunately. – kugyousha May 22 '13 at 14:24
  • @Kent care to elaborate on how one might do that? – drevicko Mar 28 '14 at 04:13
  • [It really seems to have been removed](http://www.dirtdon.com/?p=1452), what a dick move of Apple if this happened intentionally. – Adrian Frühwirth Mar 28 '14 at 08:47
  • 11
    @AdrianFrühwirth OS X's `grep` actually changed from `grep (GNU grep) 2.5.1` in 10.7 to `grep (BSD grep) 2.5.1-FreeBSD` in 10.8. I guess it was because of GPL. The FreeBSD `grep` is also based on GNU `grep` and both versions of `grep` are from 2002. `--label` and `-u` / `--unix-byte-offsets` were also removed in 10.8. `-z` / `--decompress`, `-J` / `--bz2decompress`, `--exclude-dir`, `--include-dir`, `-S`, `-O`, and `-p` were added in 10.8. `-Z` changed from `--null` to `--decompress`. – Lri Apr 03 '14 at 13:41
  • @LauriRanta Thanks for the info, that explains it...much appreciated. I don't have an OS X/*BSD installation handy but read that `BSD grep` is way slower than `GNU grep`, can you confirm if this is still the case on 10.8 (compared to `GNU grep` installed via homebrew, for example)? I'm just curious. – Adrian Frühwirth Apr 03 '14 at 13:58
  • 3
    The FreeBSD `grep` that comes with OS X is from 2002, and https://wiki.freebsd.org/BSDgrep still says that "the only TODO item is improving performance", so yeah. `time grep aa /usr/share/dict/words>/dev/null` takes about 0.09 seconds with OS X's grep and about 0.01 seconds with a new GNU grep on repeated runs on my iMac. – Lri Apr 03 '14 at 17:17

13 Answers

141

If your scripts are for your use only, you can install grep from homebrew-core using brew:

brew install grep 

Then it's available as ggrep (GNU grep). It doesn't replace the system grep; for that, you need to put the installed grep before the system one on the PATH.

The version installed by brew includes the -P option, so you don't need to change your scripts.

If you need to use these commands with their normal names, you can add a "gnubin" directory to your PATH from your bashrc like:

PATH="/usr/local/opt/grep/libexec/gnubin:$PATH"

Add this line to your ~/.bashrc or ~/.zshrc to keep it for new sessions.

Please see here for a discussion of the pros and cons of the old --with-default-names option and its (recent) removal.
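If a script has to run on machines with and without the Homebrew grep, one option is to probe for a grep that accepts -P at run time. This is only a sketch: `pick_pcre_grep` is a made-up helper name, and it assumes GNU grep is installed under the name ggrep as described above.

```shell
# Hypothetical helper: print the name of the first grep on PATH that accepts -P.
pick_pcre_grep() {
  for candidate in ggrep grep; do
    if command -v "$candidate" >/dev/null 2>&1 &&
       printf 'ab\n' | "$candidate" -oP 'a\Kb' >/dev/null 2>&1; then
      printf '%s\n' "$candidate"
      return 0
    fi
  done
  return 1
}

if GREP=$(pick_pcre_grep); then
  # Same pattern as in the question, run against a made-up sample line.
  printf 'property:42end\n' | "$GREP" -oP '(property:)\K.*\d+(?=end)'
else
  echo "no grep with -P support found" >&2
fi
```

On Apple Silicon Macs Homebrew's default prefix is /opt/homebrew rather than /usr/local, so there the gnubin directory is `$(brew --prefix)/opt/grep/libexec/gnubin`.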

drevicko
  • 3
    @pepper what didn't work? Likely the path isn't set properly - what's the output of `which grep`? Should be `/usr/local/bin/grep`. It's a bit mean to downvote before you've checked carefully that there is a problem! – drevicko May 07 '14 at 02:28
  • indeed that is it! But it did not put the installed grep before the system one on the PATH as you said. I'm happy to upvote, I really do not understand the fuss about the point system on this website (can't you just hack around that?). Glad you pointed this out though, I'm setting up an alias for grep ASAP! – pepper May 07 '14 at 03:40
  • 2
    probably better to add `/usr/local/bin` to the front of your PATH. Brew is supposed to set that up I believe? Did you use `--default-names`? Anyway, glad it works (: Not sure about hacking around it, but I think the point system is one of the reasons this site is such a good resource. – drevicko May 07 '14 at 04:23
  • 1
    yes I did use --default-names and brew. Not sure if putting /usr/local/bin in the front of your path is better than an alias, just an alternative – pepper May 07 '14 at 17:01
  • Great answer, the only way that made it work. It would be great if you could add how to set up an alias to your answer. – He Hui Mar 18 '15 at 04:20
  • 12
    an alternative to `--with-default-names` is to add `alias grep='ggrep'` to your bash profile and let brew dupes keep their prefix – rymo Sep 01 '15 at 19:37
  • 4
    `--with-default-names` is removed from brew. I had to `brew install grep` to get ggrep and then do as @rymo says and do `alias grep='ggrep'` . – Henge Jul 26 '19 at 11:23
  • 1
    An alias may not work if your script runs in a sub-shell. In that case you need to `brew install grep` and then add it to your path: `export PATH="/usr/local/opt/grep/libexec/gnubin:$PATH"` – NicoKowe Nov 13 '19 at 15:18
85

If you want to do the minimal amount of work, change

grep -P 'PATTERN' file.txt

to

perl -nle'print if m{PATTERN}' file.txt

and change

grep -o -P 'PATTERN' file.txt

to

perl -nle'print $& while m{PATTERN}g' file.txt

So you get:

var1=`perl -nle'print $& while m{(?<=<st:italic>).*(?=</italic>)}g' file.txt`
var2=`perl -nle'print $& while m{(property:)\K.*\d+(?=end)}g' file.txt`

In your specific case, you can achieve simpler code with extra work.

var1=`perl -nle'print for m{<st:italic>(.*)</italic>}g' file.txt`
var2=`perl -nle'print for /property:(.*\d+)end/g' file.txt`
ikegami
  • 1
    This works great, but it returns all matches, whereas the grep I used only returned the first match. Any idea how to return just the first match? – kugyousha May 22 '13 at 14:21
  • 1
    @ironintention: add `| head -1` to the end of the pipeline. – Peter Dec 10 '13 at 21:34
  • `grep` always returns all matching lines (unless you use one of the options where it prints none at all). Anyway, `if (/.../) { print $1; last; }` will cause it to only print the first match. – ikegami Dec 11 '13 at 01:40
  • I used this to get out the urls of a sitemap - thanks mate, would not have made it without your post! perl -nle'print $1 if m{(.*)}' sitemap.xml – Christian Dec 23 '13 at 21:08
  • 2
    @Christian, Would only take 3 lines to do it with a proper XML parser such as XML::LibXML. (Key line: `say $_->textContent for $doc->findnodes('//loc');`) – ikegami Dec 23 '13 at 23:32
  • @Ikegami I needed this only one time for a specific use case. The result will be trashed anyway. I am happy with it right now. Anyway, thanks for letting me know about libxml. There are times I regret my lack of perl-fu. – Christian Dec 24 '13 at 09:42
  • Adjusted to handle multiple matches per line like `grep -o` – ikegami Dec 24 '18 at 23:54
13

Install ack and use it instead. Ack is a grep replacement written in Perl. It has full support for Perl regular expressions.

Michael Carman
  • I'd like to check this out but this is for work computers so we cannot install anything – kugyousha May 22 '13 at 14:23
  • @ironintention: If you can install Perl modules, you're good. Even if you can't add to the local Perl installation you can always use local::lib. – Michael Carman May 22 '13 at 18:58
  • `ack` is designed to be self-contained; you don't need to actually install it. If you can save a file, mark it as executable, and update your `PATH` if necessary, you are good to go. – tripleee Mar 02 '14 at 08:23
  • Can you please add the ack syntax that replaces the above? – William Entriken Jun 23 '16 at 14:24
  • @FullDecent: It's almost identical: `ack -o '(property:)\K.*\d+(?=end)' file.txt` (`-o` means the same thing, but you don't need the `-P` with ack) – Michael Carman Jun 24 '16 at 14:43
11

OS X tends to provide BSD rather than GNU tools. It does come with egrep however, which is probably all you need to perform regex searches.

example: egrep 'fo+b?r' foobarbaz.txt

A snippet from the OS X grep man page:

grep is used for simple patterns and basic regular expressions (BREs); egrep can handle extended regular expressions (EREs).
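EREs have no lookaround and no \K, so the patterns from the question cannot be carried over unchanged. One workaround that needs nothing installed is to match the whole thing and keep only a capture group with sed. This is a sketch on a made-up input line; it assumes a sed that accepts -E (both BSD and GNU sed do) and prints at most one result per line, unlike grep -o.

```shell
# Made-up sample line; in a script this would come from file.txt.
line='x property:abc 123end y'

# -n suppresses default output; the p flag prints only lines where the substitution succeeded.
var2=$(printf '%s\n' "$line" | sed -nE 's/.*property:(.*[0-9]+)end.*/\1/p')
printf '%s\n' "$var2"
```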

nebulous
  • 9
    Direct invocation as egrep is deprecated. The same ability is also available as grep -E. It's... a sad shadow of Perl, lacking lookaround assertions, most of the backslash escapes, options, conditionals, etc :( Power users will hate it, but it does at least do the job. – Dewi Morgan Oct 11 '16 at 16:59
  • 2
    Thanks. `grep -E` instead of `grep -P` was exactly what I needed. – asmaier Sep 25 '19 at 13:08
8

use perl;

perl -ne 'print if /regex/' files ...

If you need more grep options (I see you would like -o at least) there are various pgrep implementations floating around the net, many of them in Perl.

If "almost Perl" is good enough, PCRE ships with pcregrep.

tripleee
6

There is another alternative: pcregrep.

Pcregrep is a grep with Perl-compatible regular expressions. It has exactly the same usage as grep -P, so it will be compatible with your scripts.

It can be installed with homebrew:

brew install pcre

Daniel Baird
Gabor Marton
4

How about using the '-E' option? It works fine for me. For example, to check for the php_zip, php_xml, and php_gd2 extensions in the output of php -m, I use:

php -m | grep -E '(zip|xml|gd2)'
zx485
ZenC
  • 1
    this works. Mac uses FreeBSD grep and Linux uses GNU grep...so this fix worked on my macOS sierra – jimh Jun 21 '17 at 19:15
3

Equivalent to the accepted answer, but without requiring the -P switch, which was not present on either of the machines I had available.

find . -type f -exec perl -nle 'print $& if m{\r\n}' {} ';' -exec perl -pi -e 's/\r\n/\n/g' {} '+'
marknuzz
2

This one worked for me:

    awk  -F":" '/PATTERN/' file.txt
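The awk command above prints whole matching lines, like plain grep. To get something closer to grep -o with the question's second pattern, awk's match() can be combined with substr(). A sketch on a made-up line, with the prefix and suffix lengths hard-coded for this one pattern:

```shell
# Made-up sample line; in a script this would come from file.txt.
line='x property:abc 123end y'

var2=$(printf '%s\n' "$line" | awk 'match($0, /property:.*[0-9]+end/) {
  # RSTART/RLENGTH describe the whole match; skip "property:" (9 chars) and drop "end" (3 chars).
  print substr($0, RSTART + 9, RLENGTH - 12)
}')
printf '%s\n' "$var2"
```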
petegam
0

Another Perl solution for -P

var1=$(perl -nle 'print $1 if m#<st:italic>([^<]+)</st:italic>#' file.txt)
Rory Hunter
0

Use a Perl one-liner and pipe the input into it. I used a lookbehind for src=" (to get the src links in HTML) and a lookahead for ", and piped the output of curl (the HTML) to it.

bash-3.2# curl stackoverflow.com | perl -0777 -ne '$a=1;while(m/(?<=src\=\")(.*)(?=\")/g){print "Match #".$a." "."$&\n";$a+=1;}'
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  239k  100  239k    0     0  1911k      0 --:--:-- --:--:-- --:--:-- 1919k
Match #1 //ajax.googleapis.com/ajax/libs/jquery/1.12.4/jquery.min.js
Match #2 //cdn.sstatic.net/Js/stub.en.js?v=fb6157e02696
Match #3 https://ssum-sec.casalemedia.com/usermatch?s=183712&amp;cb=https%3A%2F%2Fengine.adzerk.net%2Fudb%2F22%2Fsync%2Fi.gif%3FpartnerId%3D1%26userId%3D
Match #4 //i.stack.imgur.com/817gJ.png" height="16" width="18" alt="" class="sponsor-tag-img">elasticsearch</a> <a href="/questions/tagged/elasticsearch-2.0" class="post-tag" title="show questions tagged &#39;elasticsearch-2.0&#39;" rel="tag">elasticsearch-2.0</a> <a href="/questions/tagged/elasticsearch-dsl" class="post-tag" title="show questions tagged &#39;elasticsearch-dsl&#39;" rel="tag
Match #5 //i.stack.imgur.com/817gJ.png" height="16" width="18" alt="" class="sponsor-tag-img">elasticsearch</a> <a href="/questions/tagged/sharding" class="post-tag" title="show questions tagged &#39;sharding&#39;" rel="tag">sharding</a> <a href="/questions/tagged/master" class="post-tag" title="show questions tagged &#39;master&#39;" rel="tag
Match #6 //i.stack.imgur.com/tKsDb.png" height="16" width="18" alt="" class="sponsor-tag-img">android</a> <a href="/questions/tagged/linux" class="post-tag" title="show questions tagged &#39;linux&#39;" rel="tag">linux</a> <a href="/questions/tagged/camera" class="post-tag" title="show questions tagged &#39;camera&#39;" rel="tag
Match #7 //i.stack.imgur.com/tKsDb.png" height="16" width="18" alt="" class="sponsor-tag-img">android</a> <a href="/questions/tagged/firebase" class="post-tag" title="show questions tagged &#39;firebase&#39;" rel="tag"><img src="//i.stack.imgur.com/5d55j.png" height="16" width="18" alt="" class="sponsor-tag-img">firebase</a> <a href="/questions/tagged/firebase-authentication" class="post-tag" title="show questions tagged &#39;firebase-authentication&#39;" rel="tag
Match #8 //i.stack.imgur.com/tKsDb.png" height="16" width="18" alt="" class="sponsor-tag-img">android</a> <a href="/questions/tagged/ios" class="post-tag" title="show questions tagged &#39;ios&#39;" rel="tag">ios</a> <a href="/questions/tagged/in-app-purchase" class="post-tag" title="show questions tagged &#39;in-app-purchase&#39;" rel="tag">in-app-purchase</a> <a href="/questions/tagged/piracy-protection" class="post-tag" title="show questions tagged &#39;piracy-protection&#39;" rel="tag
Match #9 //i.stack.imgur.com/tKsDb.png" height="16" width="18" alt="" class="sponsor-tag-img">android</a> <a href="/questions/tagged/unity3d" class="post-tag" title="show questions tagged &#39;unity3d&#39;" rel="tag">unity3d</a> <a href="/questions/tagged/vr" class="post-tag" title="show questions tagged &#39;vr&#39;" rel="tag
Match #10 http://pixel.quantserve.com/pixel/p-c1rF4kxgLUzNc.gif" alt="" class="dno
bash-3.2# date
Mon Oct 24 20:57:11 EDT 2016
0

I suddenly had this same problem with grep on a Docker rebuild. I found the solution here: https://github.com/firehol/firehol/issues/325

I just replaced -oP with -oE:

echo "$some_var" | grep -oE '\b[0-9a-f]{5,40}\b' | head -1

Jijo John
-1

Some more options; these also set the correct exit status:

  • equivalent to grep -P PATTERN FILE :

    perl -e'while(<>){if( (m!PATTERN!) ){$ok++;print}};if(!($ok)){exit 1}' FILE

  • equivalent to grep -P -i PATTERN FILE :

    perl -e'while(<>){if( (m!PATTERN!i) ){$ok++;print}};if(!($ok)){exit 1}' FILE

  • equivalent to grep -v -P PATTERN FILE :

    perl -e'while(<>){if( !(m!PATTERN!) ){$ok++;print}};if(!($ok)){exit 1}' FILE

For a cleaner solution, use this gist. The implemented switches are -A, -B, -v, -P, and -i: https://gist.github.com/torson/bd6931bda0035c4884b2a8c4c64a33b2
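The one-liners above can also be wrapped in a small shell function so the pattern is passed as an argument instead of being pasted into the Perl code. This is a sketch, not the code from the gist: `perl_grep` and the `PERL_GREP_PATTERN` environment variable are names made up here, and only the plain grep -P PATTERN FILE case is covered.

```shell
# Hypothetical wrapper: usage is perl_grep PATTERN [FILE...]
perl_grep() {
  perl_grep_pattern=$1
  shift
  # The pattern travels through the environment so it needs no shell quoting tricks.
  PERL_GREP_PATTERN=$perl_grep_pattern perl -ne '
    BEGIN { $re = qr/$ENV{PERL_GREP_PATTERN}/ }
    if (/$re/) { $found = 1; print }
    END { $? = $found ? 0 : 1 }   # exit 0 if any line matched, 1 otherwise
  ' "$@"
}

# Usage on a made-up two-line file.
tmp=$(mktemp)
printf '%s\n' 'foo' 'bar' > "$tmp"
perl_grep 'f(?=oo)' "$tmp" && echo "matched"
perl_grep 'zzz' "$tmp" || echo "no match, status $?"
rm -f "$tmp"
```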

torson
  • Probably lose the [useless uses of `cat`](https://stackoverflow.com/questions/11710552/useless-use-of-cat) – tripleee Dec 27 '22 at 13:06