
I need to capture all (English) words except abbreviations, whose pattern is:

"_any-word-symbols-including-dash." 

(so there is an underscore at the beginning, a dot at the end, and any letters and dashes in the middle)

I tried something like this:

/\b([A-Za-z-^]+)\b[^\.]/g

but it seems that I don't understand how to work with negative matches.

UPDATE:

I need not just to match the words but to wrap them in some tags:

For "a some words _abbr-abbr. a here" I should get:

<w>a</w> <w>some</w> <w>words</w> _abbr-abbr. <w>a</w> <w>here</w>

So I need to use replace with the correct regex:

test.replace(/correct regex/, '<w>$1</w>')
WHITECOLOR

1 Answer


Negative lookahead is written (?!...).

So you can use:

/\b([^_\s]\w*(?!\.))\b/g
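The trailing \b matters here: without it, the engine backtracks and matches a shortened prefix of the word in front of the dot. A minimal sketch of that effect, using a simplified [a-z]+ pattern and a sample string chosen only for illustration:

```javascript
// Without the trailing \b the engine backtracks one character and
// still matches a prefix of the word that precedes the dot.
const loose = "abbr.".match(/[a-z]+(?!\.)/);

// With the trailing \b every shortened prefix is rejected, because
// none of them ends at a word boundary.
const strict = "abbr.".match(/[a-z]+(?!\.)\b/);

console.log(loose && loose[0]); // "abb"
console.log(strict);            // null
```
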

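Unfortunately, JavaScript engines before ES2018 have no lookbehind, so you can't use a similar trick for "not prefixed by _". The [^_\s] at the start, together with \b, covers that case instead: there is no word boundary between _ and the letter after it, so a match cannot start inside the abbreviation.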

Example:

> a = "a some words _abbr. a here"
> a.replace(/\b([^_\s]\w*(?!\.))\b/g, "<w>$1</w>")
"<w>a</w> <w>some</w> <w>words</w> _abbr. <w>a</w> <w>here</w>"

Following your comment about abbreviations containing -, the updated regex is:

/\b([^_\s\-][\w\-]*(?!\.))\b/g

> "abc _abc-abc. abc".replace(/\b([^_\s\-][\w\-]*(?!\.))\b/g, "<w>$1</w>")
"<w>abc</w> _abc-abc. <w>abc</w>"
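A different technique that avoids lookarounds altogether is to match the abbreviation first and only wrap whatever the second alternative captures. The sketch below is an alternative, not the regex above; the helper name wrapWords is hypothetical, and it assumes that words consist of ASCII letters only, so a hyphenated word would be wrapped as two separate words:

```javascript
// Hypothetical helper: wraps plain words in <w> tags and leaves
// abbreviations of the form _letters-and-dashes. untouched.
function wrapWords(text) {
  // The first alternative consumes a whole abbreviation without capturing;
  // the second alternative captures an ordinary word in group 1.
  const pattern = /_[A-Za-z-]+\.|([A-Za-z]+)/g;
  return text.replace(pattern, (match, word) =>
    // Group 1 is undefined when the abbreviation alternative matched.
    word === undefined ? match : `<w>${word}</w>`
  );
}

console.log(wrapWords("a some words _abbr-abbr. a here"));
// <w>a</w> <w>some</w> <w>words</w> _abbr-abbr. <w>a</w> <w>here</w>
```

Unlike the lookahead version, this one also wraps a word that is followed by a sentence-ending dot, e.g. "here." becomes "<w>here</w>.".
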
mishik
  • Thanks, it seems to work better. The problem I cannot resolve is abbreviations that contain a dash, like "_abbr-abbr". Can you help? – WHITECOLOR Jul 12 '13 at 16:37
  • Ah, OK. I just forgot to put a dot when I was testing abbreviations with a dash. Thank you again for the solution. – WHITECOLOR Jul 12 '13 at 16:46