15

Is it possible to check whether a string contains a substring with locale support?

'Ábc'.contains('A') should be true.

JavaScript now has String.prototype.localeCompare() for string comparison with locale support, but I cannot find a localeContains() counterpart.

David Casillas
  • But they are not the same character, so why should any standard JS tool evaluate them as equal? You might consider setting up a hash table. [This](http://stackoverflow.com/a/287173/4543207) can be a nice start for you. – Redu Sep 17 '16 at 15:10
  • 3
    It's true that they are not equal, but it is a needed function for any decent string filtering. Nobody wants to type an 'e' in a table filter and not get the 'José' value listed. Once locale support is available and we know that 'a', 'A' and 'Á' can be sorted together, it's not a big step to offer the option of treating them as the same in a substring search. The link is fine, but it is a lost battle to manually handle all characters in every language. – David Casillas Sep 17 '16 at 15:18
  • 1
    I guess you either have to do it manually or use a library like [Javascript Unicode Library](https://github.com/reyesr/javascript-unicode) which can assist you to get Latin equivalents of your strings with accented characters as seen [here](https://github.com/reyesr/javascript-unicode#user-content-example-2) – Redu Sep 17 '16 at 15:31

4 Answers

6

You can do this to check for a single character:

String.prototype.contains = function contains(charToCheck) {
  return this.split('').some(char => char.localeCompare(charToCheck, 'en', {sensitivity: 'base'}) === 0)
}

console.log('Ábc'.contains('A')) // true
console.log('Ábc'.contains('B')) // true
console.log('Ábc'.contains('b')) // true
console.log('Ábc'.contains('u')) // false
console.log('coté'.contains('e')) // true

See the documentation on localeCompare. The sensitivity value "base" means:

"base": Only strings that differ in base letters compare as unequal. Examples: a ≠ b, a = á, a = A.
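As a small illustration of the quoted rule, the sketch below compares single letters with an `Intl.Collator` configured with the same `'en'` locale and `sensitivity: 'base'` option as above. The `sameBaseLetter` helper name is made up for this example.

```javascript
// Collator configured like the localeCompare() call above.
const baseCollator = new Intl.Collator('en', { sensitivity: 'base' });

// Hypothetical helper: true when two strings differ only by accent or case.
function sameBaseLetter(x, y) {
  return baseCollator.compare(x, y) === 0;
}

console.log(sameBaseLetter('a', '\u00e1')); // true  (a = á)
console.log(sameBaseLetter('a', 'A'));      // true  (a = A)
console.log(sameBaseLetter('a', 'b'));      // false (a ≠ b)
```

Creating the collator once and reusing it is usually cheaper than passing the options to localeCompare() on every call.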

Georgy
6

There is a faster alternative to a locale-aware contains() check

Stripping diacritics and then comparing the strings natively seems to be much faster: on my machine it is almost 10 times faster than the solutions by @chickens or @dag0310 (check yours here). It returns true for an empty search string, to be consistent with String.prototype.includes.

String.prototype.localeContains = function(sub) {
  if(sub==="") return true;
  if(!sub || !this.length) return false;
  sub = ""+sub;
  if(sub.length>this.length) return false;
  let ascii = s => s.normalize("NFD").replace(/[\u0300-\u036f]/g, "").toLowerCase();
  return ascii(this).includes(ascii(sub));
}

var str = "142 Rozmočených Kříd";
console.log(str.localeContains("kŘi")); // true
console.log(str.localeContains(42)); // true
console.log(str.localeContains("")); // true
console.log(str.localeContains(false)); // false
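To see why the `ascii` helper works, it may help to look at its two steps separately. The sketch below follows the same approach as above; `stripDiacritics` is a name chosen for this example, and it only covers combining marks in the U+0300 to U+036F range.

```javascript
// Step 1: NFD splits a precomposed letter into base letter + combining mark.
const decomposed = '\u00e9'.normalize('NFD'); // "é"
console.log(decomposed.length); // 2 ("e" followed by U+0301)

// Step 2: remove the combining marks, leaving only the base letters.
function stripDiacritics(s) {
  return s.normalize('NFD').replace(/[\u0300-\u036f]/g, '');
}

console.log(stripDiacritics('K\u0159\u00edd')); // "Krid" (from "Kříd")
console.log(stripDiacritics('Jos\u00e9').includes('e')); // true (from "José")
```

Letters that have no canonical decomposition, such as 'ø' or 'ß', pass through unchanged, so this is a fast approximation rather than a full substitute for locale-aware comparison.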
Jan Turoň
3

If you are looking for more than one character, here is a working but not very efficient option:

const localeContains = (a,b) => !!a.split('').filter((v,i)=>a.slice(i,b.length).localeCompare(b, "en", { sensitivity: 'base' })===0).length
const a = "RESERVE ME";
const b = "réservé";

console.log(localeContains(a,b));
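Note that `a.slice(i, b.length)` takes `b.length` as an absolute end index, so every window after index 0 is too short to match. A possible variant that slides a window of the search string's length across the whole text is sketched below. The name `localeContainsAnywhere` is made up here, and the sketch normalizes both strings to NFC on the assumption that matching substrings then have the same length, which may not hold for every locale.

```javascript
// Sketch: locale-aware substring test using a sliding window.
function localeContainsAnywhere(haystack, needle, locale = 'en') {
  // Normalize so that accented letters occupy one code unit where possible.
  const text = String(haystack).normalize('NFC');
  const sub = String(needle).normalize('NFC');
  if (sub === '') return true; // consistent with String.prototype.includes

  const collator = new Intl.Collator(locale, { sensitivity: 'base' });
  for (let i = 0; i + sub.length <= text.length; i++) {
    // Candidate of sub.length code units starting at index i.
    const candidate = text.slice(i, i + sub.length);
    if (collator.compare(candidate, sub) === 0) return true;
  }
  return false;
}

console.log(localeContainsAnywhere('RESERVE ME', 'r\u00e9serv\u00e9')); // true
console.log(localeContainsAnywhere('RESERVE ME', '\u00e9serv\u00e9'));  // true
console.log(localeContainsAnywhere('RESERVE ME', 'xyz'));               // false
```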
chickens
1

chickens' answer does not work if the searched string is not at the beginning of the main string.

Use this package instead: https://www.npmjs.com/package/locale-includes

localeIncludes('RESERVE ME', 'éservé', {usage: 'search', sensitivity: 'base'});
// true

To make it even nicer to use as a string prototype function:

String.prototype.localeIncludes = function(str) {
  return localeIncludes(this, str, {usage: 'search', sensitivity: 'base'});
};

'RESERVE ME'.localeIncludes('éservé');
// true
dag0310