Capture words not followed by symbol

Question

I need to capture all (English) words except abbreviations, which follow this pattern:

"_any-word-symbols-including-dash."

(so there is an underscore at the beginning, a dot at the end, and any letters and dashes in the middle)

I tried something like this:

/\b([A-Za-z-^]+)\b[^\.]/g

but it seems that I don't understand how to work with negative matches.

UPDATE:

I need not just to match the words but to wrap them in tags:

For "a some words _abbr-abbr. a here" I should get:

<w>a</w> <w>some</w> <w>words</w> _abbr-abbr. <w>a</w> <w>here</w>

So I need to use replace with the correct regex:

test.replace(/correct regex/, '<w>$1</w>')

@Anton Thanks for the link, fine tool, before I used another for testing. — WHITECOLOR, Jul 11 '13 at 13:15
@WHITECOLOR If you found a solution, you are encouraged to post it as an answer and accept it to mark your question as solved. — ajp15243, Jul 11 '13 at 13:25

score 2 · Accepted Answer · edited May 23 '17 at 11:56


So you can use:

/\b([^_\s]\w*(?!\.))\b/g

Unfortunately, JavaScript engines without ES2018 support have no lookbehind, so there you can't use a similar trick for "not prefixed by _".

Example:

> a = "a some words _abbr. a here"
> a.replace(/\b([^_\s]\w*(?!\.))\b/g, "<w>$1</w>")
"<w>a</w> <w>some</w> <w>words</w> _abbr. <w>a</w> <w>here</w>"

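Another way around a missing lookbehind, sketched here as an alternative and not taken from the regex above, is to let the pattern match a whole abbreviation first and hand it back unchanged, wrapping only what the second alternative captures. The function name `wrapExceptAbbr` and the letters-only word pattern `[A-Za-z]+` are assumptions made for illustration.

```javascript
// Hypothetical helper; the name and the word pattern are assumptions.
function wrapExceptAbbr(text) {
  // Alternative 1: a whole abbreviation (_letters-and-dashes.), matched but not captured.
  // Alternative 2: an ordinary word made of letters, captured as group 1.
  const token = /_[A-Za-z-]+\.|([A-Za-z]+)/g;
  return text.replace(token, (match, word) =>
    // When alternative 1 matched, group 1 is undefined: return the text untouched.
    word === undefined ? match : `<w>${word}</w>`
  );
}

console.log(wrapExceptAbbr("a some words _abbr-abbr. a here"));
// <w>a</w> <w>some</w> <w>words</w> _abbr-abbr. <w>a</w> <w>here</w>
```

Because the abbreviation is consumed whole, a word at the end of a sentence such as "end." is still wrapped here, whereas the lookahead-based regex above skips any word directly followed by a dot. A hyphenated ordinary word is split into separately wrapped parts by this sketch.
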
Following your comment about dashes, the updated regex is:

/\b([^_\s\-][\w\-]*(?!\.))\b/g

> "abc _abc-abc. abc".replace(/\b([^_\s\-][\w\-]*(?!\.))\b/g, "<w>$1</w>")
"<w>abc</w> _abc-abc. <w>abc</w>"

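Engines that implement ES2018 do support lookbehind, so there the "not prefixed by _" check can be written directly. The following is a minimal sketch under that assumption, not the answer's method: `wrapWords` is a hypothetical name, and the pattern treats every token that starts with `_` as an abbreviation, with or without the trailing dot.

```javascript
// Hypothetical helper; requires an engine with ES2018 lookbehind support.
function wrapWords(text) {
  // (?<![\w-])  the word must not be preceded by a letter, digit, underscore or dash,
  //             so nothing inside "_abbr-abbr." can start a match
  // [A-Za-z]+(?:-[A-Za-z]+)*  letters, optionally joined by single dashes
  // (?![\w-])   the match must run to the end of the token
  const word = /(?<![\w-])[A-Za-z]+(?:-[A-Za-z]+)*(?![\w-])/g;
  // $& inserts the whole matched substring.
  return text.replace(word, "<w>$&</w>");
}

console.log(wrapWords("a some words _abbr-abbr. a here"));
// <w>a</w> <w>some</w> <w>words</w> _abbr-abbr. <w>a</w> <w>here</w>
```

Since this variant does not rely on a "not followed by a dot" lookahead, a sentence-final word such as "end." is wrapped too, and a hyphenated ordinary word stays together as one wrapped token.
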
edited May 23 '17 at 11:56 by Community

answered Jul 12 '13 at 15:35 by mishik

Thanks, it seems to work better. The problem I cannot resolve is with abbreviations that contain a dash, like "_abbr-abbr". Can you help? – WHITECOLOR Jul 12 '13 at 16:37
Ah, OK. I just forgot to put a dot when I was testing abbreviations with a dash. Thank you again for the solution. – WHITECOLOR Jul 12 '13 at 16:46
