Good day,

I'm currently working on a search bar component written in JavaScript. I'd like to find every result string in which at least one word starts with the value coming from an input.

Here is an example:

  • "This is an example" will match these kinds of input: "this", "is", "ex"...

After some research, I found a simple way to do this, using the \b metacharacter:

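let inputValue = "full", // value read from the input; assumed to contain no regex special characters
    _regex = new RegExp('\\b(' + inputValue + ')', 'gi'),
    _match = _regex.exec("My Full Sentence");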

To be honest, it works really well as long as the sentences do not contain accented characters. However, when a word begins with an accented character, the \b metacharacter doesn't work as intended.

For example:

  • "léviter" will properly match with "léviter"
  • "éviter" will oddly match with "léviter"
  • "éviter" will oddly not match with "éviter"
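The three cases above can be reproduced with a minimal sketch. The helper name matchesWordPrefix and the results object are hypothetical and only serve the illustration; the sketch assumes the input contains no regex special characters and that the first quoted string in each case is the input. The behavior follows from \b being defined in terms of \w, which in JavaScript only covers [A-Za-z0-9_], so "é" is treated as a non-word character.

```javascript
// Hypothetical helper: builds the same kind of pattern as the snippet above.
// Assumption: `input` contains no regex special characters.
function matchesWordPrefix(input, sentence) {
  const regex = new RegExp('\\b(' + input + ')', 'i');
  return regex.test(sentence);
}

// \b is a boundary between a \w character and a non-\w character.
// "é" is not in \w, so a boundary appears between "l" and "é" in "léviter",
// and no boundary exists at the start of "éviter".
const results = {
  leviterInLeviter: matchesWordPrefix('léviter', 'léviter'),
  eviterInLeviter: matchesWordPrefix('éviter', 'léviter'),
  eviterInEviter: matchesWordPrefix('éviter', 'éviter'),
};

console.log(results);
```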

I have created a JSFiddle with more examples: https://jsfiddle.net/9L7vee46/46/

Thus, I'd like to know whether there is a way to get the correct behavior with the \b metacharacter.

Thanks for your help.

klutt
Sackey

1 Answer

Normalise both your text and your search string to use the same type of accented character.

In Unicode, for historical reasons, some accented characters can be encoded in two different ways: as a single code point (precomposed), or as multiple code points (a base letter followed by a combining mark). Your regular expression library treats them differently because they are, in fact, different sequences. Before searching, pick one form and convert every character that has two encodings to that form.

In ES6, you can use "".normalize() to do this.
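As a minimal sketch of what normalize() does with the two encodings: the helper name toCanonical and the choice of the NFC form are assumptions made for illustration, and any one form works as long as both the text and the search string use it.

```javascript
const composed = '\u00E9viter';    // "é" as a single code point (U+00E9)
const decomposed = 'e\u0301viter'; // "e" followed by a combining acute accent (U+0301)

// The two strings render the same but are different sequences.
const equalBefore = composed === decomposed;
console.log(equalBefore);                        // false
console.log(composed.length, decomposed.length); // 6 7

// Hypothetical helper: bring any string to one canonical form (NFC here).
function toCanonical(text) {
  return text.normalize('NFC');
}

const equalAfter = toCanonical(composed) === toCanonical(decomposed);
console.log(equalAfter); // true
```

Applying the same helper to both the sentence and the input value before building the regular expression keeps the two sides in the same encoding.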

wizzwizz4