
I am trying to title case all of the words in a string using a RegEx.

I am using the following code (derived from another post) right now, but I think there has to be something better.

My processing is as follows:

  1. Convert entire string to Lower Case
  2. Convert the 1st letter of a word following a boundary (non-word character) to Upper Case
  3. Handle any case where the string starts with "mc" following a boundary by converting the 3rd character to Upper Case

The code is:

    let text = "STEPHEN wells-o'shaugnessy mcdonald";
    let result = text.toLowerCase().replace(/\b\w/g,(c) => c.toUpperCase());
    result = result.replace(/\bmc\w/ig,(c) => c.charAt(0).toUpperCase() + c.charAt(1).toLowerCase() + c.charAt(2).toUpperCase());

    // result is: Stephen Wells-O'Shaugnessy McDonald

I thought the following would work for names starting with "mc" but it does not, and I can't figure out why:

    result = result.replace(/b(?:mc)\w/ig, (c) => c.toUpperCase());

My thought was that the "/(?:mc)" would match the characters "mc" following a boundary but ignore the match since it is a non-capturing match (globally and ignoring case), that the "\w" would match the next character, and that this character would then be converted to Upper Case by the (c) => c.toUpperCase() callback.

Any help making this more concise and explaining why the last "replace" doesn't work would be appreciated.

Thanks,

Eric

Eric B
  • `b(?:mc)` matches a letter **b** followed by the letters **mc** – VLAZ Aug 23 '23 at 15:38
  • Also probably relevant: [Falsehoods Programmers Believe About Names](https://www.kalzumeus.com/2010/06/17/falsehoods-programmers-believe-about-names/) + [Falsehoods Programmers Believe About Names – With Examples](https://shinesolutions.com/2018/01/08/falsehoods-programmers-believe-about-names-with-examples/) – VLAZ Aug 23 '23 at 15:39
  • *Handle any case where the string starts with "mc"*: Names with capitals other than at the front aren't limited to names starting with Mc. A good example, from the documentation VLAZ linked to, is **Mackenzie and MacKenzie** which are different names where the case of the K is important. It's generally better to **trust people to accurately tell you their names**. – Quentin Aug 23 '23 at 15:40
  • @VLAZ I suspect that's a typo for `\b(?:mc)` – Barmar Aug 23 '23 at 15:43
  • @Quentin in addition, not all characters after a boundary are capitals. "Vincent van Gogh" or "Eduardo da Silva". Not all that start with a capital continue with lowercase, either "Pope Gregory XIII" – VLAZ Aug 23 '23 at 15:45
  • Yes, the /b was a typo and should have been /\b. I understand that there are limitations to this, but am just trying to hit the majority of the cases that have been raised. The issue is that most of the information that is entered (not just names) is being entered in all lower case and that is "confusing" to users (I don't know why, but they swear it is). My biggest curiosity is why `.replace(/\b(?:mc)\w/gi, (c) => c.toUpperCase());` does not work. – Eric B Aug 23 '23 at 16:06
  • "*My thought was that the "/(?:mc)" would match the characters "mc" following a boundary but ignore the match since it is a non-capturing match*" non-capturing groups are *only* not captured. As the name suggests. They still do match. Capturing vs non-capturing group only matters in that captured patterns are separate sub-results https://jsbin.com/zivasuyina/1/edit?js,console https://jsbin.com/gatuholofe/1/edit?js,console so "non-capturing" doesn't mean "discard this from the result". Just "don't add it as extra sub-result". See also https://stackoverflow.com/q/3512471 – VLAZ Aug 23 '23 at 16:29
  • VLAZ, thanks, I should have known/remembered that. It has been too long since I worked with RegEx. And thanks for the links. – Eric B Aug 23 '23 at 16:47
  • Actually, `McDonald` seems to be [correct](https://en.wikipedia.org/wiki/List_of_family_name_affixes). – Kosh Aug 23 '23 at 16:52
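
The point made in the comments (a non-capturing group is still part of the overall match) can be illustrated with a short sketch. The variable and function names below (`input`, `whole`, `captured`, `lookbehind`, `titleCase`) are hypothetical and not from the question, and the single-pass `titleCase` is only one possible way to condense the two `replace` calls; it inherits the same limitations with real-world names that the comments describe.

```javascript
// Illustrative sketch only; all names here are assumptions, not the asker's code.
const input = "Stephen Wells-O'Shaugnessy Mcdonald";

// (?:mc) is non-capturing, but it is still consumed by the match, so the
// callback receives "Mcd" and upper-cases all three characters.
const whole = input.replace(/\b(?:mc)\w/gi, (m) => m.toUpperCase());
console.log(whole); // Stephen Wells-O'Shaugnessy MCDonald

// Option 1: capture the prefix and the next letter, then rebuild the match.
const captured = input.replace(/\b(mc)(\w)/gi, (_, prefix, next) => "Mc" + next.toUpperCase());
console.log(captured); // Stephen Wells-O'Shaugnessy McDonald

// Option 2: a lookbehind asserts that "mc" precedes the letter without
// consuming it, so only the single letter reaches the callback.
const lookbehind = input.replace(/(?<=\bmc)\w/gi, (c) => c.toUpperCase());
console.log(lookbehind); // Stephen Wells-O'Shaugnessy McDonald

// Possible single-pass version: an optional "mc" group plus the next letter.
function titleCase(text) {
  return text
    .toLowerCase()
    .replace(/\b(mc)?(\w)/g, (_, mc, ch) => (mc ? "Mc" : "") + ch.toUpperCase());
}
console.log(titleCase("STEPHEN wells-o'shaugnessy mcdonald"));
// Stephen Wells-O'Shaugnessy McDonald
```

Note that `\b` and `\w` are ASCII-based in JavaScript, so accented letters would be treated as boundaries in all of these patterns.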
