
I have a basic string and would like to match only specific characters between the brackets.

Base string: This is a test string [more or less]

This regex, which captures all r's and e's in the string, works just fine:

(r|e) 

=> This is a test string [more or less]

Now I want to combine the following regex with mine so that it matches only the r's and e's between the brackets, but unfortunately this doesn't work:

\[(r|e)\]

Expected result: mo**re** o**r** l**e**ss

Can someone explain?

Edit: the problem is very similar to this one: Regular Expression to find a string included between two characters while EXCLUDING the delimiters

The difference is that I don't want to get the whole string between the brackets.

Follow up problem

base string = 'this is a link:/en/test/äpfel/öhr[MyLink_with_äöü] BREAK äöü is now allowed'

I need a regex that finds the non-ASCII characters äöü so I can replace them, but only inside the link:...] substring, which starts with the word link: and ends with a ] character.

The result string will look like this:

result string = 'this is a link:/en/test/apfel/ohr[MyLink_with_aou] BREAK äöü is now allowed'

The regex /[äöü]+(?=[^\]\[]*])/g from the solution in the comments only delivers the äöü chars between the two brackets.

I know that the regex uses a lookahead with a character list, but I wonder why this one does not work:

/link:([äöü]+(?=[^\]\[]*])/

thanks

mg.vmuc
  • Define "doesn't work." – T.J. Crowder Oct 07 '16 at 06:02
  • *"Expected result should be : mo**re** o**r** l**e**ss"* It's not clear what you mean by that. Do you mean the result of a single invocation of `exec`? A loop? – T.J. Crowder Oct 07 '16 at 06:03
  • It simply does not match. I'm using this regex to substitute all r's and e's in the given string. The first one works fine for the whole string, but the second one, which should only substitute between the brackets, does not work. Checking with regex101.com gives: 'Your regular expression does not match the subject string.' – mg.vmuc Oct 07 '16 at 06:12
  • What do you need to do with the matches? Collect? Replace? Wrap with a tag (if the input is HTML)? – Wiktor Stribiżew Oct 07 '16 at 06:24
  • Since there is no PCRE-like `(*SKIP)(?!)` nor infinite width lookbehind support in JS regex, you can use a usual hack - [`/[re]+(?=[^\][]*])/g`](https://regex101.com/r/4xyqVr/1). If you need more precision, match with `/\[[^\][]*]/g` and then do what you need to the `e`s and `r`s only inside the matches. – Wiktor Stribiżew Oct 07 '16 at 06:37
  • @WiktorStribiżew The match for `[re]+` should probably be lazy, since I'm guessing the OP wants just every character. – Qwerp-Derp Oct 07 '16 at 06:38
  • thanks @WiktorStribiżew , the solution works for me. follow up question: how can i look between a string in the regex? For example, i want something like this: `/[re]+(?=[^\] more ]*])/g` which should result in: " o**r** l**e**ss" ? – mg.vmuc Oct 07 '16 at 08:35
  • You expect too much from a regex in JS. Don't. Without any details on what you need to do, further chatting makes no sense. – Wiktor Stribiżew Oct 07 '16 at 08:41
  • @WiktorStribiżew : thank you, i have added the detailed problem description as follow up problem in the question – mg.vmuc Oct 07 '16 at 09:15

1 Answer


You can use the following solution: match everything from link: up to the closing ], then replace your characters only within each matched substring, using a replace callback:

var hashmap = {"ä":"a", "ö":"o", "ü":"u"};
var s = 'this is a link:/en/test/äpfel/öhr[MyLink_with_äöü] BREAK äöü is now allowed';
var res = s.replace(/\blink:[^\]]*/g, function(m) {  // m = link:/en/test/äpfel/öhr[MyLink_with_äöü (the closing ] is not part of the match)
  return m.replace(/[äöü]/g, function(n) { // n = ä, then ö, then ü, 
    return hashmap[n];                     // each time replaced with the hashmap value
  });
});
console.log(res);

Pattern details:

  • \b - a leading word boundary
  • link: - whole word link with a : after it
  • [^\]]* - zero or more chars other than ] (a [^...] is a negated character class that matches any char/char range(s) but the ones defined inside it).
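The same match-then-replace-in-a-callback idea should also cover the first half of the question. `\[(r|e)\]` only matches the literal three-character strings `[r]` and `[e]`, which is why it finds nothing in the sample string. Below is a minimal sketch, assuming you want to mark each run of r's and e's inside the brackets; the helper name `markInsideBrackets` and the `**` markers are made up for illustration and are not part of the original question.

```javascript
// Why the attempt from the question finds nothing: the pattern demands
// "[" immediately followed by a single r or e and then "]".
var attempt = /\[(r|e)\]/;
console.log(attempt.test("This is a test string [more or less]")); // false
console.log(attempt.test("[r]"));                                  // true

// Hypothetical helper (name and ** markers are illustrative only):
// 1. the outer replace matches each whole [...] chunk,
// 2. the inner replace touches runs of r/e only inside that chunk.
function markInsideBrackets(input) {
  return input.replace(/\[[^\][]*\]/g, function (bracketed) {
    return bracketed.replace(/[re]+/g, function (run) {
      return "**" + run + "**";
    });
  });
}

var marked = markInsideBrackets("This is a test string [more or less]");
console.log(marked); // This is a test string [mo**re** o**r** l**e**ss]
```

The e in "test" and the r in "string" are left alone because they never reach the inner replace.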

Also, see Efficiently replace all accented characters in a string?
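If you would rather not maintain a hashmap, one commonly used alternative is to decompose the characters with `String.prototype.normalize("NFD")` and drop the combining marks. This is only a sketch under the assumption that your environment supports `normalize` (ES2015 and later); the helper name `stripAccentsInLinks` is hypothetical.

```javascript
// Hypothetical helper: strips diacritics only inside each link:... chunk.
function stripAccentsInLinks(input) {
  return input.replace(/\blink:[^\]]*/g, function (link) {
    // NFD splits "ä" into "a" + U+0308 (combining diaeresis);
    // the second replace removes combining marks in U+0300..U+036F.
    return link.normalize("NFD").replace(/[\u0300-\u036f]/g, "");
  });
}

var input = 'this is a link:/en/test/äpfel/öhr[MyLink_with_äöü] BREAK äöü is now allowed';
var out = stripAccentsInLinks(input);
console.log(out);
```

Note that this only helps for characters that decompose into a base letter plus combining marks. Letters such as ß or ø have no such decomposition, so a hashmap like the one above is still needed for them.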

Wiktor Stribiżew