I've been looking for a solution to my regular expression problem for hours, even days.

Here is an example string in which I'm trying to capitalize the first letter of each word:

test-de'Maëly dUIJSENS

With /\b[a-zA-Z]/g I can isolate the first letter of each word, but accented letters cause problems: the letter that follows an accented letter is always capitalized too:

Test-De'MaëLy Duijsens

My expected result is as follows:

Test-De'Maëly Duijsens

Here's my attempt:

function testcapital() {
  // Actual: "Test-De'MaëLy Duijsens", expected: "Test-De'Maëly Duijsens"
  var xxx = capitalizePhrase("test-de'Maëly dUIJSENS");
}

function capitalizePhrase(phrase) {
  // Accented characters that should not trigger capitalization (not used yet)
  var accentedCharacters = "àèìòùÀÈÌÒÙáéíóúýÁÉÍÓÚÝâêîôûÂÊÎÔÛãñõÃÑÕäëïöüÿÄËÏÖÜŸçÇßØøÅåÆæœ";

  phrase = phrase.toLowerCase();
  var reg = /\b[a-zA-Z]/g;
  function replace(firstLetters) {
    return firstLetters.toUpperCase();
  }
  var capitalized = phrase.replace(reg, replace);
  return capitalized;
}

How can I prevent a letter from being capitalized when it follows one of the accented characters in this list?

ggorlen
djconcept

1 Answer

You can put the accented characters into a character class and use it in a negative lookbehind, so that a letter preceded by one of them is not treated as the start of a word:

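const capitalizePhrase = phrase => {
  const accentedChars = "àèìòùÀÈÌÒÙáéíóúýÁÉÍÓÚÝâêîôûÂÊÎÔÛãñõÃÑÕäëïöüÿÄËÏÖÜŸçÇßØøÅåÆæœ";
  // \b still matches after an accented letter (it is not a "word" character),
  // so the lookbehind rejects any match preceded by one of accentedChars.
  const reg = new RegExp(`\\b(?<![${accentedChars}])([a-z])`, "g");
  return phrase.toLowerCase().replace(reg, m => m.toUpperCase());
};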

console.log(capitalizePhrase("test-de'Maëly dUIJSENS"));
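
A possible variation, if your runtime supports Unicode property escapes and lookbehind (recent versions of Node.js and the major browsers do): drop the hand-written list and let \p{L} stand for "any letter". This is only a sketch under that assumption. The name capitalizeWords is made up for illustration, and unlike the version above it also capitalizes words that begin with an accented letter.

```javascript
// Sketch: capitalize the first letter of every run of letters.
// Assumes support for the "u" flag, \p{...} escapes and lookbehind.
const capitalizeWords = phrase =>
  phrase
    .toLowerCase()
    // \p{L} is any Unicode letter, \p{M} any combining mark (for decomposed
    // accents). A letter only matches if no letter or mark precedes it.
    .replace(/(?<![\p{L}\p{M}])\p{L}/gu, m => m.toUpperCase());

console.log(capitalizeWords("test-de'Maëly dUIJSENS")); // Test-De'Maëly Duijsens
console.log(capitalizeWords("élodie o'brien"));         // Élodie O'Brien
```

Here the lookbehind replaces \b entirely, since \b only knows about ASCII word characters and is what caused the extra capital after "ë" in the first place.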
ggorlen