348

I have HTML code before and after the string:

name="some_text_0_some_text"

I would like to replace the 0 with something like: !NEW_ID!

So I made a simple regex:

.*name="\w+(\d+)\w+".*

But I don't see how to replace exclusively the captured block.

Is there a way to replace a captured result like ($1) with some other string?

The result would be:

name="some_text_!NEW_ID!_some_text"
Trevor Reid
Nicolas Guillaume

8 Answers

594

A solution is to add captures for the preceding and following text:

str.replace(/(.*name="\w+)(\d+)(\w+".*)/, "$1!NEW_ID!$3")

Explanation

The parentheses are used to create "groups", which then get assigned a base-1 index, accessible in a replace with a $.

  • the first group (.*name="\w+) holds everything up to the digits, and becomes $1
  • the middle part (\d+) is the second group (but gets ignored in the replace)
  • the third group (\w+".*) becomes $3

So when you give the replace string of "$1!NEW_ID!$3", the $1 and $3 are replaced automagically with the first group and third group, allowing the 2nd group to be replaced with the new string while maintaining the text surrounding it.
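
As a quick check of the explanation above, here is a minimal sketch run against a made-up HTML line (the `<input>` markup and the `item_12_end` name are assumptions for illustration, not from the question). It also shows one caveat: because `\w` matches digits too, the greedy `\w+` in the first group keeps all but the last digit of a multi-digit ID. Making it lazy (`\w+?`) is one possible workaround, not the only one.

```javascript
// Hypothetical input line; only the name="..." part comes from the question.
const html = '<input name="some_text_0_some_text" type="text">';

// $1 and $3 put the surrounding text back; group 2 (the digits) is dropped.
const replaced = html.replace(/(.*name="\w+)(\d+)(\w+".*)/, "$1!NEW_ID!$3");
console.log(replaced);
// <input name="some_text_!NEW_ID!_some_text" type="text">

// Caveat: \w also matches digits, so a greedy \w+ keeps the "1" of "12".
const multiDigit = 'name="item_12_end"';
const greedy = multiDigit.replace(/(.*name="\w+)(\d+)(\w+".*)/, "$1!NEW_ID!$3");
console.log(greedy); // name="item_1!NEW_ID!_end"

// A lazy \w+? stops before the first digit instead.
const lazy = multiDigit.replace(/(.*name="\w+?)(\d+)(\w+".*)/, "$1!NEW_ID!$3");
console.log(lazy); // name="item_!NEW_ID!_end"
```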

Raine Revere
Matthew Flaschen
  • 116
    Greetings from the future! Your solution looks really neat. Could you please explain your answer? – Polyducks Mar 29 '16 at 13:24
  • 42
    The parentheses are used to create "groups", which then get assigned a base-1 index, accessible in a replace with a `$`, so the first group `(.*name="\w+)` becomes `$1`, the middle part `(\d+)` is the second group (but gets ignored in the replace), and the third group is `$3`. So when you give the replace string of `"$1!NEW_ID!$3"`, the $1 and $3 are replaced automagically with the first group and third group, allowing the 2nd group to be replaced with the new string, maintaining the text surrounding it. – mix3d Mar 29 '16 at 22:14
  • 4
    That being said, while I understand HOW it works, I was hoping for a more elegant solution >.< Nevertheless, I can move forward with my code now! – mix3d Mar 29 '16 at 22:16
  • 11
    1) You don't even need to capture \d+ 2) Why do you say it's not elegant? Capturing is meant to keep stuff, not throw it away. What you want to keep is what is AROUND \d+, so it really makes sense (and is elegant enough) to capture these surrounding parts. – Sir4ur0n Aug 02 '16 at 09:08
  • 4
    Nice solution. What if we want to replace the capture groups using the capture group as a basis for the transformation? Is there an equally elegant solution to doing this? Currently I store the captured groups in a list, loop them, and replace the capture group with the transformed value at each iteration – sookie Aug 23 '17 at 14:33
  • 2
    bit simpler is `.replace(/\d+/g, "!NEW_ID!");` – Johannes Merz Feb 22 '19 at 14:40
  • It's 2023 and we are still greeting your solution :) – M Fuat Apr 05 '23 at 09:19
43

Now that JavaScript has lookbehind (as of ES2018), newer environments let you avoid groups entirely in situations like this. Instead, use a lookbehind for what comes before the part you were capturing and a lookahead for what comes after, and replace with just !NEW_ID!:

const str = 'name="some_text_0_some_text"';
console.log(
  str.replace(/(?<=name="\w+)\d+(?=\w+")/, '!NEW_ID!')
);

With this method, the full match is only the part that needs to be replaced.

  • (?<=name="\w+) - Lookbehind for name=", followed by word characters (luckily, lookbehinds do not have to be fixed width in JavaScript!)
  • \d+ - Match one or more digits - the only part of the pattern not in a lookaround, the only part of the string that will be in the resulting match
  • (?=\w+") - Lookahead for word characters followed by "

Keep in mind that lookbehind is a relatively recent addition (ES2018). At the time of writing it works in modern versions of V8 (including Chrome, Opera, and Node), but not in every other environment. So while you can reliably use lookbehind in Node and in your own browser (if it runs on a modern version of V8), check browser support before relying on it for arbitrary clients (like on a public website).
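
If older engines matter, one option is to detect lookbehind at runtime and fall back to capture groups. The sketch below is only an illustration under that assumption: `supportsLookbehind` and `replaceId` are hypothetical helper names, and the patterns are built with the `RegExp` constructor so that an engine without lookbehind throws inside the `try` rather than failing to parse the whole script.

```javascript
// Hypothetical helper: returns true if the engine can compile a lookbehind.
function supportsLookbehind() {
  try {
    new RegExp('(?<=a)b');
    return true;
  } catch (e) {
    return false;
  }
}

// Hypothetical helper: replaces the digits inside name="..." with newId.
function replaceId(str, newId) {
  if (supportsLookbehind()) {
    // Same pattern as in the answer, built from a string.
    const re = new RegExp('(?<=name="\\w+)\\d+(?=\\w+")');
    return str.replace(re, () => newId);
  }
  // Fallback: capture the surrounding text and put it back.
  return str.replace(
    /(name="\w+)\d+(\w+")/,
    (match, before, after) => before + newId + after
  );
}

const idResult = replaceId('name="some_text_0_some_text"', '!NEW_ID!');
console.log(idResult); // name="some_text_!NEW_ID!_some_text"
```

Replacer functions are used in both branches so that a `$` inside `newId` is inserted literally instead of being read as a replacement pattern.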

Aaron Dunigan AtLee
CertainPerformance
  • 1
    Just ran a quick timing test, and it's quite impressive how the input matters: https://jsfiddle.net/60neyop5/ – Kaiido Aug 12 '19 at 03:52
  • But if, for example, I want to extract the number, multiply it and "put it back", I'll have to group also `\d+`, right? – Mosh Feu Mar 29 '20 at 14:24
  • 1
    @MoshFeu Use a replacer function and use the whole match, the digits: replace the second parameter with `match => match * 2`. The digits are still the whole match, so there's no need for groups – CertainPerformance Mar 29 '20 at 22:15
  • 4
    thanks for sharing. browser support at ~75%, most notably missing from iOS Safari: https://caniuse.com/js-regexp-lookbehind – Crashalot Feb 07 '21 at 20:45
  • 1
    @Crashalot mid 2022 and still not supported by Safari (Mac or iOS) – Drenai Jul 30 '22 at 20:52
5

A little improvement to Matthew's answer could be a lookahead instead of the last capturing group:

.replace(/(\w+)(\d+)(?=\w+)/, "$1!NEW_ID!");

Or you could split on the digits and join with your new id like this:

.split(/\d+/).join("!NEW_ID!");

Example/Benchmark here: https://codepen.io/jogai/full/oyNXBX
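
The two variants above are not interchangeable when the string contains more than one run of digits. A small sketch, using a made-up input with two numbers (the `row_0_col_7_x` value is an assumption for illustration):

```javascript
// Hypothetical input with two digit runs.
const multiIds = 'name="row_0_col_7_x"';

// split/join replaces every run of digits.
const viaSplit = multiIds.split(/\d+/).join("!NEW_ID!");
console.log(viaSplit); // name="row_!NEW_ID!_col_!NEW_ID!_x"

// The non-global replace touches one run only; because (\w+) is greedy,
// it is the last run that can still be followed by word characters.
const viaLookahead = multiIds.replace(/(\w+)(\d+)(?=\w+)/, "$1!NEW_ID!");
console.log(viaLookahead); // name="row_0_col_!NEW_ID!_x"
```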

Jogai
3

It would also have been possible with two capturing groups. I would also include the two underscores as additional left and right boundaries, before and after the digits, so the modified expression would look like:

(.*name=".+_)\d+(_[^"]+".*)

const regex = /(.*name=".+_)\d+(_[^"]+".*)/g;
const str = `some_data_before name="some_text_0_some_text" and then some_data after`;
const subst = `$1!NEW_ID!$2`;
const result = str.replace(regex, subst);
console.log(result);

If you wish to explore, simplify, or modify the expression, regex101.com explains it in its top-right panel and shows how it matches against sample inputs.


RegEx Circuit

jex.im can be used to visualize regular expressions.

Emma
2

Note that you can pass a transformer function as the second parameter if you need to transform and manipulate the capture groups.

API

replace(
    regex,
    (matched, capture1, capture2, /*...,*/ capture_n, index, input_str) => transformed(/*...*/)
)
replace(
    regex: Regex,
    transformer: (matched: string, capture1: string, capture2: string, /*...,*/ capture_n: string, index: number, input_str: string) => string
) => string

The number of capture arguments depends on how many capture groups your regex uses. index and input_str come after them.

See the examples below and their output to get a better idea of what each argument is.

Doc ref:

https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/String/replace#making_a_generic_replacer

Examples:

// Basic usage: a single capture group
const propsArgs = args.map((arg) =>
  arg.slice(2).replace(/-(.)/g, (matched, captureGroup, index, input) => {
    return captureGroup.toUpperCase();
  })
);


// Usage with multiple capture groups
const propsArgs = args.map((arg) =>
  arg
    .slice(2)
    .replace(/-(.)(.)/g, (matched, capture1, capture2, index, input) => {
      return capture2.toUpperCase();
    })
);

// Usage with multiple capture groups, collecting the arguments with rest syntax
// args[0] matched, args[1] capture 1, ....., args[n] capture n, args[n+1] index, args[n+2] the whole input string.
const propsArgs = args.map((arg) =>
  arg.slice(2).replace(/-(.)(.)/g, (...args) => {
    return args[2].toUpperCase(); // capture 2
  })
);

// Logging the arguments to see what each one holds
const propsArgs = args.map((arg) =>
  arg.slice(2).replace(/-(.)/g, (...args) => {
    console.log(args); // [ '-f', 'f', 6, 'config-file' ]
    return args[1].toUpperCase();
  })
);

// Multiple capture groups and the argument order
/**
 * matched string, then each capture one after another, then index, then the whole input string
 */
const propsArgs = args.map((arg) =>
  arg
    .slice(2)
    .replace(
      /-(.)(.)(.)/g,
      (matched, capture1, capture2, capture3, index, input) => {
        // [ '-wat', 'w', 'a', 't', 3, 'log-watch-compilation' ]
        return capture1.toUpperCase();
      }
    )
);

The core example above converts command-line args to their JavaScript camel-case equivalents.

Transforming this:

[
  '--filename',
  '--config-file',
  '--env-name',
  '--no-swcrc',
  '--ignore',
  '--only',
  '--watch',
  '--quiet',
  '--source-maps',
  '--source-map-target',
  '--source-file-name',
  '--source-root',
  '--out-file',
  '--out-dir',
  '--copy-files',
  '--include-dotfiles',
  '--config',
  '--sync',
  '--log-watch-compilation',
  '--extensions'
]

to

[
  'filename',            'configFile',
  'envName',             'noSwcrc',
  'ignore',              'only',
  'watch',               'quiet',
  'sourceMaps',          'sourceMapTarget',
  'sourceFileName',      'sourceRoot',
  'outFile',             'outDir',
  'copyFiles',           'includeDotfiles',
  'config',              'sync',
  'logWatchCompilation', 'extensions'
]
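
The snippets above assume an existing `args` array. A self-contained sketch of the same idea, using three flags from the list as sample input (`cliFlags`, `seenArgs`, and `toCamelCase` are names introduced here for illustration):

```javascript
// Sample of CLI flags taken from the list above.
const cliFlags = ['--filename', '--config-file', '--log-watch-compilation'];

// Records the raw replacer arguments so their order can be inspected.
const seenArgs = [];

// Strips the leading "--" and upper-cases the character after each dash.
const toCamelCase = (flag) =>
  flag.slice(2).replace(/-(.)/g, (...replacerArgs) => {
    seenArgs.push(replacerArgs);
    return replacerArgs[1].toUpperCase(); // capture 1, without the dash
  });

const camelCased = cliFlags.map(toCamelCase);
console.log(camelCased); // [ 'filename', 'configFile', 'logWatchCompilation' ]
console.log(seenArgs[0]); // [ '-f', 'f', 6, 'config-file' ]
```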
Mohamed Allal
1

Another simple solution is just to replace the value of the matched group with the new value:

const name = 'some_text_0_some_text';
const match = name.match(/\w+(\d+)\w+/);
console.log(name.replace(match[1], "!NEW_ID!"));
// prints some_text_!NEW_ID!_some_text

This works only if the value of the matched group does not occur earlier in the string, because replace with a string pattern replaces the first occurrence it finds.

match[1] is the value of the first matched group, that is, the string matched by (\d+).

match[0] is the whole matched string.
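
To make that limitation concrete, here is a small sketch with a made-up input in which the captured value also appears earlier in the string (`item0_text_0_end` is an assumption for illustration). Capturing the surrounding text instead avoids the problem:

```javascript
// Hypothetical input: "0" appears twice.
const label = 'item0_text_0_end';

// The greedy \w+ makes the group capture the last "0"...
const found = label.match(/\w+(\d+)\w+/);
console.log(found[1]); // 0

// ...but replacing by value hits the first "0" in the string.
const byValue = label.replace(found[1], '!NEW_ID!');
console.log(byValue); // item!NEW_ID!_text_0_end

// Capturing the surrounding text replaces the digits that actually matched.
const byGroups = label.replace(/(\w+)\d+(\w+)/, '$1!NEW_ID!$2');
console.log(byGroups); // item0_text_!NEW_ID!_end
```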

Catalin
0

A simpler option is to just capture the digits and replace them.

const name = 'preceding_text_0_following_text';
const matcher = /(\d+)/;

// Replace with whatever you would like
const newName = name.replace(matcher, 'NEW_STUFF');
console.log("Full replace", newName);

// Perform work on the match and replace using a function
// In this case increment it using an arrow function
const incrementedName = name.replace(matcher, (match) => ++match);
console.log("Increment", incrementedName);
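
The `++match` above relies on implicit string-to-number coercion. A slightly more explicit sketch of the same increment, with a made-up input chosen so the number of digits changes (`original` and `bumped` are names introduced here):

```javascript
// Hypothetical input; the 9 becomes 10 to show the width can change.
const original = 'preceding_text_9_following_text';

// Convert the matched digits to a number, add one, and return a string.
const bumped = original.replace(/\d+/, (digits) => String(Number(digits) + 1));
console.log(bumped); // preceding_text_10_following_text
```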


CTS_AE
0
"some_text_0_some_text".replace(/(?=\w+)\d+(?=\w+)/, '!NEW_ID!')

Result is

some_text_!NEW_ID!_some_text

const regExp = /(?=\w+)\d+(?=\w+)/;
const newID = '!NEW_ID!';
const str = 'some_text_0_some_text';
const result = str.replace(regExp, newID);

console.log(result);

x(?=y) in JS RegExp

Matches "x" only if "x" is followed by "y". For example, /Jack(?=Sprat)/ matches "Jack" only if it is followed by "Sprat". /Jack(?=Sprat|Frost)/ matches "Jack" only if it is followed by "Sprat" or "Frost". However, neither "Sprat" nor "Frost" is part of the match results.
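
One thing worth noting about the pattern in this answer: the leading `(?=\w+)` is a lookahead, not a lookbehind, so it is tested at the position where the digits start and does not check anything before them. The sketch below (with a second, made-up input `0_tail`) suggests it can be dropped without changing the result:

```javascript
const withLeading = /(?=\w+)\d+(?=\w+)/;
const withoutLeading = /\d+(?=\w+)/;

const sample = 'some_text_0_some_text';
const resultWith = sample.replace(withLeading, '!NEW_ID!');
const resultWithout = sample.replace(withoutLeading, '!NEW_ID!');
console.log(resultWith);    // some_text_!NEW_ID!_some_text
console.log(resultWithout); // some_text_!NEW_ID!_some_text

// Hypothetical input with nothing before the digits: it still matches,
// so the leading lookahead does not require preceding word characters.
const noPrefix = '0_tail'.replace(withLeading, '!NEW_ID!');
console.log(noPrefix); // !NEW_ID!_tail
```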
