7

I was using the standard \b word boundary. However, it doesn't quite deal with the dot (.) character the way I want it to.

So the following regex:

\b(\w+)\b

will match cats and dogs in cats.dogs if I have a string that says: cats and dogs don't make cats.dogs.

I need a word boundary alternative that will match a whole word only if:

  1. it does not contain the dot (.) character
  2. it is surrounded by at least one space character on each side

Any ideas?!

P.S. I need this for PHP

hippietrail
ObiHill

2 Answers

6

You could try using (?<=\s) before and (?=\s) after the word in place of \b, which ensures there is whitespace on each side. You might also want to allow for the word being at the start or end of the string, with (?<=\s|^) and (?=\s|$).

This will automatically exclude "words" with a . in them, but it would also exclude a word at the end of a sentence since there is no space between it and the full stop.
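
As a rough sketch of how that could look in PHP, assuming the sample sentence from the question as input (the variable names $str and $matches are only illustrative):

```php
<?php
// Sample input from the question.
$str = "cats and dogs don't make cats.dogs";

// \w+ must be preceded by whitespace or the start of the string,
// and followed by whitespace or the end of the string.
preg_match_all('/(?<=\s|^)\w+(?=\s|$)/', $str, $matches);

// $matches[0] holds the full matches: cats, and, dogs, make
print_r($matches[0]);
```

Note that don't is skipped as well, because the apostrophe is not a \w character, so neither half of it has whitespace on both sides.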

Niet the Dark Absol
  • Thanks. Any way I could include words at the beginning and end of a sentence as well?! I might not need it, but it might just be good to know. – ObiHill Dec 28 '12 at 23:26
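
One possible way to also pick up words at the end of a sentence is to let the lookahead accept a single punctuation mark before the whitespace or end of string. This is only a sketch under that assumption; the punctuation set [.,;:!?] and the sample sentence are made up for illustration:

```php
<?php
// Hypothetical sample sentence with words at sentence boundaries.
$str = "I like cats. Dogs chase cats.dogs though.";

// Same lookbehind as before; the lookahead now tolerates one optional
// punctuation character, as long as whitespace or end of string follows it.
preg_match_all('/(?<=\s|^)\w+(?=[.,;:!?]?(?:\s|$))/', $str, $matches);

// $matches[0]: I, like, cats, Dogs, chase, though
print_r($matches[0]);
```

cats.dogs is still rejected: the dot after cats is followed by a letter rather than whitespace, and dogs is preceded by the dot.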
2

What you are trying to match can be done easily with array and string functions.

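// Split on single spaces; consecutive spaces produce empty strings.
$parts = explode(' ', $str);
// Keep only non-empty tokens that contain no dot.
$res = array_filter($parts, function ($e) {
    return $e !== '' && strpos($e, '.') === false;
});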

I recommend this method because it saves time. Spending a few hours hunting for a good regex solution is quite unproductive.
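
For illustration, here is a variant of the same idea run on the sentence from the question. It is a sketch that swaps explode for preg_split so that tabs and newlines also count as separators, and wraps the result in array_values to get consecutive keys; both changes are assumptions on my part, not a requirement of the approach:

```php
<?php
// Sample input from the question.
$str = "cats and dogs don't make cats.dogs";

// Split on any run of whitespace and drop empty pieces.
$parts = preg_split('/\s+/', $str, -1, PREG_SPLIT_NO_EMPTY);

// Keep tokens without a dot; array_filter preserves keys, so reindex.
$res = array_values(array_filter($parts, function ($e) {
    return strpos($e, '.') === false;
}));

// $res: cats, and, dogs, don't, make
print_r($res);
```

Unlike a \w+ based pattern, this keeps don't, since the only test applied to each token is whether it contains a dot.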

Shiplu Mokaddim