108

I'd like to know how to replace a capture group with its uppercase in JavaScript. Here's a simplified version of what I've tried so far that's not working:

> a="foobar"
'foobar'
> a.replace( /(f)/, "$1".toUpperCase() )
'foobar'
> a.replace( /(f)/, String.prototype.toUpperCase.apply("$1") )
'foobar'

Would you explain what's wrong with this code?

ErikE
Evan Carroll
  • 1
    @Erik don't remove a component of a question. I want to know why my code is failing too. – Evan Carroll May 26 '11 at 17:59
  • 2
    Evan, I thought I was being respectful of your question. I only removed things that seemed unnecessary. Since you gave the code you were trying, and it obviously wasn't working, then people implicitly knew you needed an explanation of why without you having to say so (and awkwardly). Just trying to help! :) – ErikE May 26 '11 at 19:17
  • 1
    Evan, is that better? I don't mean to annoy. If you rollback again I won't edit again, but could you at least keep the title & tag edits in place? – ErikE May 26 '11 at 19:20
  • Technically, I'm not using Javascript at all, I'm using v8 (ECMAScript). But, I imagine most people searching this will be looking for JavaScript, so I'm good with it. – Evan Carroll May 26 '11 at 20:41
  • 1
    Feel free to add tags back if you think they belong. – ErikE May 27 '11 at 01:17
  • @EvenCarrol and for posterity: It's entirely okay to call v8's implementation of ECMAscript "JavaScript" even if it's technically a misnomer. Making the distinction for every case where it wasn't Netscape or Mozilla's implementation would just take too much time and there's really no pretty way to pronounce "ECMAscript" with less than 5 clunky syllables. Not to mention you can technically support the ECMA spec but still have something fairly different from JavaScript or JScript or whatever Chrome/V8 calls their version so it does assert that we're at least in the same family of ECMAscript. – Erik Reppen Jun 23 '13 at 00:54
  • I disagree. It's confusing and allowing it to continue makes the matter more confusing. Javascript does all kinds of cool stuff Ecmascript doesn't. Ja-Va, Eck-Ma -- not sure the syllable difference there. Confusing the two is akin calling C, a subset of C++, C++. – Evan Carroll Jul 01 '13 at 23:30
  • 1
    A confusing aspect of this question for me was that the question is a special case where the capture group is also the entire matched expression. If this is not the case, the answers have unexpected results, because they treat the whole match. – Colin Feb 14 '18 at 17:58

7 Answers

193

You can pass a function to replace.

var r = a.replace(/(f)/, function(v) { return v.toUpperCase(); });

Explanation

a.replace( /(f)/, "$1".toUpperCase())

In this example you pass a string to the replace function. The expression "$1".toUpperCase() is evaluated first, before replace ever runs, and only its result is passed in as the replacement string. Because the $ and 1 characters have no uppercase form, that result is still "$1", so replace simply substitutes the first capture group back in unchanged ($N refers to the Nth capture).

a.replace( /(f)/, String.prototype.toUpperCase.apply("$1"))

Believe it or not, the semantics of this expression are exactly the same: toUpperCase is applied to the literal string "$1" before replace is called.
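
To make the evaluation order concrete, here is a minimal sketch using the question's own string. The variable names (replacementString, viaApply, unchanged, changed) are illustrative only and do not come from the original post.

```javascript
const a = "foobar";

// The argument expression runs first, so replace() only ever sees "$1".
const replacementString = "$1".toUpperCase();
const viaApply = String.prototype.toUpperCase.apply("$1");

// Both of the question's calls are therefore equivalent to a.replace(/(f)/, "$1").
const unchanged = a.replace(/(f)/, replacementString);

// A function is invoked by replace() for the match and receives the matched text.
const changed = a.replace(/(f)/, (match) => match.toUpperCase());

console.log(replacementString, viaApply, unchanged, changed);
// $1 $1 foobar Foobar
```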

Evan Carroll
ChaosPandion
  • @Evan Carroll: Please see my answer. – Kijewski May 26 '11 at 17:58
  • 4
    Ah, I see what you mean, I'm uppercasing "\$1". Not the result of the voodoo that replace will do that is apparently substituting `$1` for the first capture group. – Evan Carroll May 26 '11 at 18:04
  • @EvanCarroll for a thorough explanation of why your initial code didn't work and how to get it to work, see my answer below. – Joshua Piccari May 11 '14 at 04:07
  • Inconsequential overall, but the capture group isn't needed (`f` rather than `(f)`), since `v` is referencing group 0. A note for people reading this: in a quick benchmark, each capture group slowed the regex down by 5%. – Regular Jo Jun 19 '21 at 00:19
  • See https://stackoverflow.com/a/58951201/1835470 if you have more complex regex and/or capturing groups – jave.web Aug 21 '23 at 09:48
19

I know I'm late to the party but here is a shorter method that is more along the lines of your initial attempts.

a.replace('f', String.call.bind(a.toUpperCase));

So where did you go wrong and what is this new voodoo?

Problem 1

As stated before, you were attempting to pass the result of a called method as the second parameter of String.prototype.replace(), when instead you ought to be passing a reference to a function.

Solution 1

That's easy enough to solve. Simply removing the parameters and parentheses will give us a reference rather than executing the function.

a.replace('f', String.prototype.toUpperCase.apply)

Problem 2

If you attempt to run the code now you will get an error stating that undefined is not a function and therefore cannot be called. This is because String.prototype.toUpperCase.apply is actually a reference to Function.prototype.apply() via JavaScript's prototypal inheritance. So what we are actually doing looks more like this:

a.replace('f', Function.prototype.apply)

Which is obviously not what we intended. How would it know to run Function.prototype.apply() on String.prototype.toUpperCase()?

Solution 2

Using Function.prototype.bind() we can create a copy of Function.prototype.apply with its context specifically set to String.prototype.toUpperCase. We now have the following:

a.replace('f', Function.prototype.apply.bind(String.prototype.toUpperCase))

Problem 3

The last issue is that String.prototype.replace() will pass several arguments to its replacement function. However, Function.prototype.apply() expects its second parameter to be an array but instead gets either a string or a number (depending on whether you use capture groups or not). This would cause an invalid argument list error.
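
Problems 2 and 3 can be observed directly. The following is a rough sketch, not part of the original answer; errorName is a hypothetical helper introduced here only to capture which kind of error is thrown, since the exact error message varies between engines.

```javascript
const a = "foobar";

// Hypothetical helper: runs fn and reports the name of any error it throws.
function errorName(fn) {
  try {
    fn();
    return "no error";
  } catch (e) {
    return e.name;
  }
}

// Problem 2: apply is called with no function as its `this` value.
const unboundApply = errorName(() =>
  a.replace("f", Function.prototype.apply)
);

// Problem 3: apply is bound correctly, but receives the match offset
// (a number) where it expects an array-like list of arguments.
const boundApply = errorName(() =>
  a.replace("f", Function.prototype.apply.bind(String.prototype.toUpperCase))
);

console.log(unboundApply, boundApply);
// TypeError TypeError
```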

Solution 3

Luckily, we can simply substitute in Function.prototype.call() (which accepts any number of arguments, none of which have type restrictions) for Function.prototype.apply(). We have now arrived at working code!

a.replace(/f/, Function.prototype.call.bind(String.prototype.toUpperCase))

Shedding bytes!

Nobody wants to type prototype a bunch of times. Instead we'll leverage the fact that we have objects that reference the same methods via inheritance. The String constructor, being a function, inherits from Function's prototype. This means that we can substitute in String.call for Function.prototype.call (actually we can use Date.call to save even more bytes but that's less semantic).

We can also leverage our variable 'a': since its prototype includes a reference to String.prototype.toUpperCase, we can swap that out for a.toUpperCase. Combining the 3 solutions above with these byte-saving measures gives us the code at the top of this post.
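
As a quick check of the reasoning above, the sketch below compares the long and short forms. The names upper, direct, longForm and shortForm are illustrative and not from the original answer.

```javascript
const a = "foobar";

// upper(x, ...rest) behaves like String.prototype.toUpperCase.call(x, ...rest):
// the first argument becomes `this`, and the extra arguments are ignored.
const upper = Function.prototype.call.bind(String.prototype.toUpperCase);
const direct = upper("f", 0, "foobar");

// The identities that make the shortened version possible.
const sameCall = String.call === Function.prototype.call;
const sameMethod = a.toUpperCase === String.prototype.toUpperCase;

const longForm = a.replace(/f/, upper);
const shortForm = a.replace("f", String.call.bind(a.toUpperCase));

console.log(direct, sameCall, sameMethod, longForm, shortForm);
// F true true Foobar Foobar
```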

Joshua Piccari
  • 5
    You saved 8 characters, while obscuring the code in such a way as to require a page of explanation over the more obvious solution. I'm not convinced this is a win. – Lawrence Dol May 31 '16 at 23:25
  • 1
    Intellectually, this is a great solution, in that it surfaces/teaches a thing or 2 about javascript functions. But I'm with Lawrence that in practice it's too obscure to actually be used. Still cool. – meetamit Jan 05 '17 at 16:34
  • 1
    Yep, don't wanna see code like this in production, but it was really fun when I found out for myself you can do that in JS :D – Alex l. Jul 28 '22 at 08:13
19

Why don't we just look up the definition?

If we write:

a.replace(/(f)/, x => x.toUpperCase())

we might as well just say:

a.replace('f','F')

Worse, I suspect nobody realises that their examples have been working only because they were capturing the whole regex with parentheses. If you look at the definition, the first parameter passed to the replacer function is actually the whole matched pattern and not the pattern you captured with parentheses:

function replacer(match, p1, p2, p3, offset, string)

If you want to use the arrow function notation:

a.replace(/xxx(yyy)zzz/, (match, p1) => p1.toUpperCase())
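
A small sketch of the difference between the whole match and a captured group, using a made-up input string s. Note that whatever the replacer returns replaces the entire match, so keeping the surrounding text requires capturing it as well; that second variant is an assumption about what a reader may want, not something stated in the answer.

```javascript
const s = "xxxyyyzzz";

// The return value replaces the whole match, so xxx and zzz are dropped.
const wholeReplaced = s.replace(/xxx(yyy)zzz/, (match, p1) => p1.toUpperCase());

// Capturing the surrounding text lets the replacer put it back unchanged.
const groupOnly = s.replace(
  /(xxx)(yyy)(zzz)/,
  (match, p1, p2, p3) => p1 + p2.toUpperCase() + p3
);

console.log(wholeReplaced, groupOnly);
// YYY xxxYYYzzz
```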
Bernhard Wagner
  • 1
    IMHO this is the simplest and most elegant solution. – Almir Campos Mar 13 '21 at 15:02
  • 1
    This is the only right answer explaining which arg is what and how does the passing to the function actually work, since it's ***not*** that `(f)` is some function call... and explains how to work with more complex regexes and more capturing groups :-) – jave.web Aug 21 '23 at 09:50
12

This is an old post, but it is worth extending @ChaosPandion's answer to other use cases with a more restrictive RegEx, e.g. ensuring that (f), or any capturing group, is only replaced when it is surrounded by a specific format such as /z(f)oo/:

> a="foobazfoobar"
'foobazfoobar'
> a.replace(/z(f)oo/, function($0,$1) {return $0.replace($1, $1.toUpperCase());})
'foobazFoobar'
// Improve the RegEx so `(f)` will only get replaced when it begins with a dot or new line, etc.

I just want to highlight that the function's two parameters make it possible to find a specific format and replace only the capturing group within that format.
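
One caveat, shown as a hedged sketch with a made-up input: the inner $0.replace($1, ...) replaces the first occurrence of the captured text inside the match, which is not necessarily the captured position. Capturing the surrounding parts as their own groups, as below, is one possible way around that; the parameter names before, group and after are illustrative.

```javascript
const s = "bazffoo";

// The captured "f" is the second one, but the inner replace() hits the first.
const naive = s.replace(/f(f)oo/, ($0, $1) => $0.replace($1, $1.toUpperCase()));

// Rebuilding the match from explicit groups targets the intended position.
const positional = s.replace(
  /(f)(f)(oo)/,
  (match, before, group, after) => before + group.toUpperCase() + after
);

console.log(naive, positional);
// bazFfoo bazfFoo
```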

CallMeLaNN
  • Thank you! The previous posts seem to have answered the problem of _why_ the OP's code didn't work while completely skipping what seemed the real point to me--replacing a match group! – Auspex Feb 19 '19 at 11:57
  • I think you have an error in your replace function, but check me on this. I think it should be `return $0.replace($0, $1.toUpperCase())`, where `$0` is the first argument – Mike Sep 27 '19 at 23:39
  • It was a simple string to string replace. so f to F is correct. – CallMeLaNN Sep 29 '19 at 05:54
  • This is really helpful if you're trying to replace something in brackets! – Diode Dan Aug 12 '20 at 02:46
3

SOLUTION

a.replace(/(f)/,(m,g)=>g.toUpperCase())  

To replace all occurrences of the group, use the /(f)/g regexp. The problem in your code: String.prototype.toUpperCase.apply("$1") and "$1".toUpperCase() both evaluate to "$1" (try it in the console yourself), so they change nothing, and in fact you end up calling a.replace( /(f)/, "$1") twice (which also changes nothing).

let a= "foobar";
let b= a.replace(/(f)/,(m,g)=>g.toUpperCase());
let c= a.replace(/(o)/g,(m,g)=>g.toUpperCase());

console.log("/(f)/ ", b);
console.log("/(o)/g", c);
Kamil Kiełczewski
  • What are `m` and `g`? I guess `g` is `group`? – mikemaccana Feb 24 '23 at 11:29
  • 1
    @mikemaccana `m` means `matched substring` - it is argument for standarized [replacer function](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/String/replace#specifying_a_function_as_the_replacement:~:text=Copy%20to%20Clipboard-,Specifying%20a%20function%20as%20the%20replacement,-You%20can%20specify) – Kamil Kiełczewski Feb 24 '23 at 22:25
0

Given a dictionary of properties and values (an object; in this case, a Map), and using .bind() as described in other answers:

const regex = /([A-Za-z0-9]+)/; // [A-z] would also match [ \ ] ^ _ and `
const dictionary = new Map([["hello", 123]]); 
let str = "hello";
str = str.replace(regex, dictionary.get.bind(dictionary));

console.log(str);
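
One thing to be aware of with the bound Map.prototype.get: when the matched text is not a key, get returns undefined, and replace converts that to the string "undefined". The sketch below shows this and one possible fallback; lookup is a hypothetical helper name, and the character class is written as [A-Za-z0-9] here as an assumption about the intended pattern.

```javascript
const dictionary = new Map([["hello", 123]]);
const wordPattern = /[A-Za-z0-9]+/;

// Hypothetical helper: return the mapped value, or the match itself if absent.
const lookup = (match) =>
  dictionary.has(match) ? String(dictionary.get(match)) : match;

const hit = "hello".replace(wordPattern, lookup);
const miss = "world".replace(wordPattern, lookup);

// Without the fallback, a missing key yields the text "undefined".
const bare = "world".replace(wordPattern, dictionary.get.bind(dictionary));

console.log(hit, miss, bare);
// 123 world undefined
```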

Using a plain JavaScript object with a function defined on it that returns the matched property's value, or the original string if no match is found:

const regex = /([A-Za-z0-9]+)/; // [A-z] would also match [ \ ] ^ _ and `
const dictionary = {
  "hello": 123,
  [Symbol("dictionary")](prop) {
    return this[prop] || prop
  }
};
let str = "hello";
str = str.replace(regex, dictionary[Object.getOwnPropertySymbols(dictionary)[0]].bind(dictionary));

console.log(str);
guest271314
0

To convert a string from CamelCase to snake_case (called bash_case here, e.g. for filenames), use a callback with a ternary operator.

The group captured with () in the regexp given as the first (left) argument of replace is passed to the second (right) argument, which is a callback function. x is the whole match and y is the first captured group; they are identical here because the group spans the entire pattern. index (the third parameter) is the offset of the match within the original string. A ternary operator can therefore be used to avoid placing _ before the first occurrence.

let str = 'MyStringName';
str = str.replace(/([^a-z0-9])/g, (x,y,index) => {
      return index != 0 ? '_' + x.toLowerCase() : x.toLowerCase();
});
console.log(str);
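
To see what the callback actually receives, this sketch records its arguments for each match. The seen array and the parameter names match, group and offset are illustrative, not from the original answer.

```javascript
const seen = [];

const result = "MyStringName".replace(/([^a-z0-9])/g, (match, group, offset) => {
  // match and group are equal because the group covers the whole pattern.
  seen.push([match, group, offset]);
  return (offset !== 0 ? "_" : "") + match.toLowerCase();
});

console.log(result);
// my_string_name
console.log(JSON.stringify(seen));
// [["M","M",0],["S","S",2],["N","N",8]]
```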