
Apologies for the rookie question.

I have a small journal where I let users comment on what I post, and I convert certain character sequences to smileys.

So :) becomes the image <img src='\smiley\smile.png' />, :d becomes <img src='\smiley\big-smile.png' />, and so on.

Recently, one of my friends posted an educational link that had a :d in the URL, and my smiley regex jumped at the link and broke it into pieces with a big-smile image.

You get the idea.

So I changed my regex from :d to \b:d\b, expecting it to match only when :d stands on its own as a whole word. Guess what? The regex picks up NOTHING now.

Here is a sample demonstration of what I am talking about.

How do I get the regex to match :d only when it is on its own? Thanks.

LocustHorde

2 Answers


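That's because \b matches only at a word boundary, that is, at a position between a word character and a non-word character. The \b after the d works because d is a word character. But : is not a word character, so the \b in front of it matches only when the preceding character is a word character; after whitespace or at the start of the string there is no boundary at that position. Fix it with a lookbehind for whitespace or the start of the string: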

(?<=^|\s):d\b
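To see why the original \b:d\b attempt behaves the way it does, here is a minimal sketch. It assumes Python's built-in re module (the thread does not say which regex engine is in use), and the sample strings are made up for illustration.

```python
import re

# \b matches only where a word character sits on exactly one side of the position.
naive = re.compile(r"\b:d\b")

# Standalone smiley: the space and ':' are both non-word characters,
# so there is no boundary in front of ':' and the pattern fails.
standalone_hit = naive.search("nice :d")

# Inside a URL: 'a' is a word character and ':' is not,
# so the leading \b succeeds and the unwanted match comes back.
url_hit = naive.search("http://example.com/a:d/b")

print(standalone_hit)  # None
print(url_hit is not None)  # True
```

In other words, \b:d\b matches exactly the case that should be excluded and skips the case that should match.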

Edit: as Bob Vale pointed out, the same problem occurs at the end of a smiley like :/, because / is not a word character and does not trigger a word boundary. You have to do the same thing there, but with a lookahead:

(?<=^|\s):d(?=$|\s)
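A quick way to check the lookaround version is sketched below, again assuming Python, which the thread does not specify. One caveat: Python's re module only accepts fixed-width lookbehinds, so it rejects (?<=^|\s). The sketch swaps in the negative lookarounds (?<!\S) and (?!\S), which express the same condition ("start of string or whitespace before, end of string or whitespace after"). The sample strings are hypothetical.

```python
import re

# (?<!\S): not preceded by a non-whitespace character
#          (so: start of string, or whitespace before).
# (?!\S):  not followed by a non-whitespace character
#          (so: end of string, or whitespace after).
standalone = re.compile(r"(?<!\S):d(?!\S)")

samples = [
    "so funny :d",               # smiley at the end of a comment
    ":d",                        # smiley on its own
    "http://example.com/a:d/b",  # ':d' buried inside a URL
    "word:d",                    # ':d' glued to a word
]

results = {text: bool(standalone.search(text)) for text in samples}
for text, matched in results.items():
    print(repr(text), matched)
```

Only the first two samples match; the URL and the glued-on case are left alone.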
Håvard

You will need lookbehind and lookahead assertions that match the beginning or end of the string or whitespace, because the characters you are trying to match won't necessarily trigger the usual word-boundary rules.

Use (?<=^|\s):d(?=$|\s). This pattern should work for all of your smileys, e.g. (?<=^|\s):\)(?=$|\s)
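Extending this to every smiley at once could look like the sketch below. It is only an illustration under a few assumptions: Python as the language, the hypothetical names SMILEYS and render_smileys, and the image paths copied from the question. Because Python's re module does not accept the variable-width lookbehind (?<=^|\s), the equivalent (?<!\S) and (?!\S) are used instead.

```python
import re

# Hypothetical smiley table; the paths are the ones quoted in the question.
SMILEYS = {
    ":)": r"\smiley\smile.png",
    ":d": r"\smiley\big-smile.png",
}

# re.escape keeps characters such as ')' literal; longer smileys go first
# so a short one never shadows a longer one that starts the same way.
alternatives = "|".join(
    re.escape(s) for s in sorted(SMILEYS, key=len, reverse=True)
)
PATTERN = re.compile(r"(?<!\S)(?:" + alternatives + r")(?!\S)")


def render_smileys(text):
    """Replace standalone smileys with <img> tags; leave everything else as is."""
    # A function replacement avoids escape processing of the backslashes in the paths.
    return PATTERN.sub(
        lambda m: "<img src='" + SMILEYS[m.group(0)] + "' />", text
    )


print(render_smileys("hi :) and :d"))
print(render_smileys("see http://example.com/a:d/b"))
```

The second call returns the URL unchanged, since the :d inside it is neither preceded nor followed by whitespace.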

Bob Vale