
I am trying to find every word in an HTML paragraph and wrap each one in a span.

Currently, I have a function like this:

p.html(function(index, oldHtml) {
    return oldHtml.replace(/\b(\w+?)\b/g, '<span>$1</span>');
});

But it only matches words without accents. I tested it on regex101.com: https://www.regex101.com/r/jS5gW6/1

Any ideas?

jcbaudot
  • You need to use Unicode. – Avinash Raj Dec 23 '14 at 14:14
  • Do you need to find *only* words containing accents, or *all* words? Note that part of your problem is that `\w` does not recognize accented characters as 'word' characters, and another part is that `\b` internally uses the definition of `\w` to scan for word boundaries. So, even adding `é` and `ç` to a class with `\w` does not solve everything. – Jongware Dec 23 '14 at 14:19
  • @Jongware I need to find all words. – jcbaudot Dec 23 '14 at 14:53
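
The point raised in the comments about `\w` and `\b` can be checked with a short sketch. The sample string `'café garçon'` is an assumption chosen for illustration, not taken from the question. It shows that `\w` only covers `[A-Za-z0-9_]`, and that `\b` therefore reports a word boundary next to every accented letter.

```javascript
// Illustrative sample text (not from the original question).
const sample = 'café garçon';

// \w is ASCII-only in JavaScript, so accented letters end the match early.
const asciiWords = sample.match(/\w+/g);
console.log(asciiWords); // [ 'caf', 'gar', 'on' ]

// \b is defined in terms of \w, so a boundary is seen between "f" and "é".
// The pattern from the question therefore wraps only the ASCII fragments.
const wrapped = sample.replace(/\b(\w+?)\b/g, '<span>$1</span>');
console.log(wrapped); // <span>caf</span>é <span>gar</span>ç<span>on</span>
```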

1 Answer


Use a character class:

oldHtml.replace(/([\wàâêëéèîïôûùüç]+)/gi, '<span>$1</span>');

Trying it:

var oldHtml = 'kjh À ùp géçhj ùù Çfg';
var res = oldHtml.replace(/([\wàâêëéèîïôûùüç]+)/gi, '<span>$1</span>');

gives

"<span>kjh</span> <span>À</span> <span>ùp</span> <span>géçhj</span> <span>ùù</span> <span>Çfg</span>"
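
As a possible alternative that avoids listing every accented letter by hand, here is a minimal sketch using Unicode property escapes. It is not part of the answer above: it assumes an engine that supports `\p{...}` with the `u` flag (ES2018 or later), and `wrapWords` is a hypothetical helper name.

```javascript
// Hypothetical helper: wraps every run of letters, marks, digits or "_" in a span.
function wrapWords(text) {
  // \p{L} = letters in any script, \p{M} = combining marks, \p{N} = numbers.
  // The "u" flag is required for \p{...} to be recognized.
  return text.replace(/[\p{L}\p{M}\p{N}_]+/gu, '<span>$&</span>');
}

const result = wrapWords('kjh À ùp géçhj ùù Çfg');
console.log(result);
```

Like the character-class version, this treats its input as plain text; applied to a string that already contains markup, it would also wrap the tag names.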

Toto
  • How odd -- but thanks for clarifying! I assumed `i` would not 'ignore the case' *because* JS's regex does not support accented characters. I suppose it uses the general `toLowerCase()` underneath, then. – Jongware Dec 23 '14 at 15:22
  • Thank you, it works... Regex is complicated :-) – jcbaudot Dec 23 '14 at 15:26