Is there a way in the JS regular expression API to rewrite an input such that only matched groups are retained (in order), but which works for an arbitrary regex (including ones that have no capturing groups)?

E.g. for the regex /abc([D-F]+)gh([I-K]+)/ and the input abcFEEDghKIKI, I'd want the output "FEEDKIKI", since those are the captured parts.

However, the input regex could have also been /([a-zA-Z]+)/ which for the same input would just return the original text.

And if the regex was /([0-9]+)/ then the empty string would be returned.

The regular expressions will be generated by a tool, but there are lots of them and I don't want special-case code for different numbers of capturing groups.

Ideally, I'd also be able to access the length of the match in the input somehow too.

I assume there's some way to do it via a callback passed to replace() (or similar), but it's not clear how/if I can get at the indices of the captured groups (or whether that's even necessary in JS).

Essentially I am trying to replicate the sort of thing you can do via the MatchResult API in Java, where groups can be iterated over and the start/end indices of captured groups can be found.

Thanks in advance!

Dmitriy Popov
David
  • Just concatenate the capture groups as replacement, what's the actual problem here? Can you please show what actual code you already have? (Also note that if you write "I assume there's some way to do it via a callback to `replace()`" then you can expect "so look it up. Did you [read the docs](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/String/replace#specifying_a_function_as_the_replacement) for `replace()`?" in response. Which you should. Because you're correct) – Mike 'Pomax' Kamermans Aug 11 '23 at 17:03
  • I did look up what `replace()` does but could find no way to get an arbitrary list of replacements (and your snarky tone is not appreciated). How would I, without knowing how many groups were matched, access them? – David Aug 13 '23 at 00:32
  • No snark, whether it's appreciated or not: you were correct, but the docs cover this if you read them. The `The function has the following signature` part tells us exactly how to get all the capture groups, whether we know how many there are or not, because the function's second through last-3rd argument are the capture groups. So instead of using a function with named arguments, we use the standard JS `arguments` variable instead, and slice it up to only the capture groups (e.g. `Array.from(arguments).slice(1, arguments.length - 3)`). – Mike 'Pomax' Kamermans Aug 13 '23 at 06:12
  • You see, that was the bit I was missing. I wasn't aware of the "standard" arguments variable (I'm not super familiar with that sort of thing). The docs "covered it" only if I was also aware of this mechanism. Thanks for explaining. Perhaps next time you could consider this sort of possibility in the answer before jumping to "did you even read the docs" ? – David Aug 17 '23 at 16:22
  • And maybe next time you can answer questions by updating your post instead of going "I didn't like your tone", because nowhere in your post do you mention that you (re)read the documentation (plenty of folks _never bother to_ and go straight for SO) which is part of [writing a good post](/help/how-to-ask). Instead of assuming snark, assume that you forgot details that need to be covered for folks to know what type of answer you need, because if you'd said "I read the docs and capture groups are args [...] but I can't figure out how to work with that", none of this would've happened – Mike 'Pomax' Kamermans Aug 17 '23 at 16:29
  • Actually I tried it with a lambda and it doesn't work. I suspect because: https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Functions/Arrow_functions "Arrow functions don't have their own bindings to this, arguments, or super, and should not be used as methods." Since I've now got this all working fine with `exec()` anyway, I think I'll delete this question (unless there's actually a neater way to do it with lambdas and `replace()`). – David Aug 17 '23 at 16:35
  • JS doesn't have lambdas (that's a Python thing), do you mean an anonymous function (with the `function` keyword) or an arrow function (with the `=>` syntax)? Edit: ah, arrow function. Yeah, don't use arrow functions _unless_ you need to preserve the declare-time `this`. Usually it doesn't matter all that much but in this case it matters a lot: use a normal anonymous function instead. – Mike 'Pomax' Kamermans Aug 17 '23 at 16:37
  • Out of interest, do you have citation for "don't use arrow functions unless you need to preserve `this`"? It always helps to know why a new, shiny feature is something I shouldn't be using in general (I'd naturally assumed arrow functions were an improvement for code, not a hazard). – David Aug 17 '23 at 16:42
  • Just the docs for arrow functions paired with general good programming patterns: arrow functions were added for when code _needs_ to preserve the declare time `this` context. There's a lot of times you need that, but there's also a lot of times when you don't, and writing a normal function in those cases almost always makes you go "wait why am I declaring this inside another function/inside a react component/etc. so that it'll get rebuilt over and over and over when this is clearly something that I can define in broader scope _once_ and reuse?" – Mike 'Pomax' Kamermans Aug 17 '23 at 16:47
  • Interesting observation. My JS isn't all that complex, but I'll bear it in mind. I hadn't appreciated the difference (and I'm coming to JS via TypeScript in reality where bare functions aren't as encouraged - static class methods seem to be preferred there and since they are verbose I was using arrow functions for simple stuff). – David Aug 17 '23 at 16:51
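
For readers following the `replace()` route discussed in the comments above, here is a minimal sketch. It uses rest parameters (`...args`), which work in arrow functions even though `arguments` does not. The helper name `keepCaptures` is hypothetical. One detail to watch: the callback only receives a trailing groups object when the regex has named groups, so the number of non-capture arguments at the end is either 2 or 3, and the sketch checks for that rather than always slicing off 3.

```javascript
// Hypothetical helper: replace each match with the concatenation of its
// captured groups, for a regex with any number of capturing groups.
function keepCaptures(regex, input) {
  return input.replace(regex, (...args) => {
    // args = [match, p1, ..., pN, offset, wholeString, (groups)?]
    // The groups object is only present when the regex has named groups.
    const hasGroupsObject = typeof args[args.length - 1] === "object";
    const trailing = hasGroupsObject ? 3 : 2;
    const captures = args.slice(1, args.length - trailing);
    // Groups that did not participate in the match are undefined.
    return captures.map((c) => c ?? "").join("");
  });
}

console.log(keepCaptures(/abc([D-F]+)gh([I-K]+)/, "abcFEEDghKIKI")); // "FEEDKIKI"
```

Note that `replace()` leaves unmatched text in place, so an input that does not match at all comes back unchanged rather than as the empty string the question asks for.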

1 Answer

Having found "How do you access the matched groups in a JavaScript regular expression?", it looks (at least initially) like using exec() will do what I want.
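
A minimal sketch of what that could look like, assuming an engine that supports the `d` (hasIndices) flag from ES2022, which populates `match.indices` with the start/end offsets of each group. The function name `extractCaptures` and the shape of the returned object are my own choices for illustration, not part of any API.

```javascript
// Hypothetical helper: run the regex once and collect the captured groups,
// their start/end offsets, and the length of the overall match.
function extractCaptures(regex, input) {
  // Rebuild the regex with the `d` flag so that match.indices is available.
  const flags = regex.flags.includes("d") ? regex.flags : regex.flags + "d";
  const match = new RegExp(regex.source, flags).exec(input);
  if (match === null) {
    return { text: "", matchLength: 0, spans: [] };
  }
  const spans = [];
  let text = "";
  // Index 0 is the whole match; capturing groups start at index 1.
  for (let i = 1; i < match.length; i++) {
    if (match[i] === undefined) continue; // group did not participate
    const [start, end] = match.indices[i];
    spans.push({ group: i, start, end });
    text += match[i];
  }
  return { text, matchLength: match[0].length, spans };
}

const result = extractCaptures(/abc([D-F]+)gh([I-K]+)/, "abcFEEDghKIKI");
console.log(result.text);        // "FEEDKIKI"
console.log(result.matchLength); // 13
```

This handles any number of capturing groups, including zero, and returns the empty string when nothing matches. It only looks at the first match; for a regex with the `g` flag, `input.matchAll(regex)` could be iterated in the same way.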

David