I have a string like this:

this text (an another text) "and a text" and again a text

I want to match only the occurrences of text that are outside parentheses and quotes. The other two occurrences are either inside parentheses or inside quotes.

this text (an another text) "and a text" and again a text
      ^ this is what I want to capture    and this -> ^

How can I do this in a single regex match? I cannot find any solution for both cases in a single match.

The occurrences of text can appear in any order.

Rohit.007
Dennis
  • Do you mean [`(?:\(.*\)|".*")|(text)`](https://regex101.com/r/WEdHLh/1/)? – Paolo Jul 28 '18 at 14:06
  • Can they be in any order? – Eric Jul 28 '18 at 14:15
  • @Eric, yes they can be in any order. – Dennis Jul 28 '18 at 14:28
  • @UnbearableLightness let me try that... Update: it selects every text regardless of context. Tried with the regexr app. – Dennis Jul 28 '18 at 14:28
  • Sorry, please click on the comment. I believe the escape of the brackets got lost due to formatting. – Paolo Jul 28 '18 at 14:36
  • This catches most cases, except when the text is at the end, and it is kind of ugly. Maybe someone can finish it: https://regexr.com/3t4id – Eric Jul 28 '18 at 14:48
  • @Pratha the pattern is `(?:\(.*\)|".*")|(text)` – Paolo Jul 28 '18 at 14:52
  • @UnbearableLightness you understand that "text" could be any text right? And that this could be in any order, right? – Eric Jul 28 '18 at 14:57
  • The question is not clear enough then. Also, your solution below does not match `this -text-`, nor `'this-text'`, nor `this@text@` – Paolo Jul 28 '18 at 15:05
  • There are answers for [matching outside parenthesis](https://stackoverflow.com/a/39565427) and [matching outside quotes](https://stackoverflow.com/a/26609791) available already. You can combine them: [`text(?![^(]*\))(?=[^"]*(?:"[^"]*"[^"]*)*$)`](https://regex101.com/r/DwKrnw/1) – bobble bubble Jul 28 '18 at 15:39
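
The combined lookahead pattern from the last comment can be tried as a single regex, which matters when the pattern has to be handed to another method. Here is a minimal sketch run against the string from the question. The names `rx`, `sample` and `positions` are illustrative only, and the pattern assumes that quotes are balanced and parentheses are not nested.

```javascript
// Pattern quoted from the comment above:
//   (?![^(]*\))                 -> not followed by a ")" before any "("
//   (?=[^"]*(?:"[^"]*"[^"]*)*$) -> an even number of quotes remains up to the end
const rx = /text(?![^(]*\))(?=[^"]*(?:"[^"]*"[^"]*)*$)/g;

const sample = 'this text (an another text) "and a text" and again a text';

// Collect the start index of every match.
const positions = [...sample.matchAll(rx)].map((m) => m.index);
console.log(positions); // [ 5, 53 ]
```

Only the first and the last `text` are reported. The occurrences inside the parentheses and inside the quotes are rejected by the two lookaheads.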

3 Answers

var sTest = 'this text (an another text) "and a text"';
// Parenthesized and quoted spans are matched first and returned unchanged; only a bare "text" is replaced.
document.writeln(sTest.replace(/\([^)]*text[^)]*\)|"[^"]*text[^"]*"|text/g, (sMatch) => (sMatch === 'text' ? 'TEXT' : sMatch)));

This uses "The Best Regex Trick": put the contexts you do not want first in the alternation so that they consume those spans, and whatever the last alternative matches is what you want.
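
When you need the matches themselves rather than a replacement, a variant of the same idea (not the exact code above) is to wrap the wanted alternative in a capture group and keep only the matches where that group is set. This is a minimal sketch, and `findOutside` is a hypothetical helper name.

```javascript
// Hypothetical helper: start indices of `word` outside (...) and "..."
function findOutside(input, word) {
  // Escape regex metacharacters so `word` is matched literally.
  const escaped = word.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
  // Unwanted contexts come first; the wanted word is the only capture group.
  const rx = new RegExp('\\([^)]*\\)|"[^"]*"|(' + escaped + ')', 'g');
  const hits = [];
  for (const m of input.matchAll(rx)) {
    // Group 1 is set only when the last alternative matched.
    if (m[1] !== undefined) hits.push(m.index);
  }
  return hits;
}

const sample = 'this text (an another text) "and a text" and again a text';
console.log(findOutside(sample, 'text')); // [ 5, 53 ]
```

The filtering happens in the caller, so this only helps when you control the code that consumes the matches.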

Alien426

Hopefully it matches all possible combinations. See result here

 ((?!(?:( (\(|")[a-zA-Z ]+(\)|") )))?([\w ]+)(?=(?:( (\(|")[a-zA-Z ]+(\)|" ))))|([\w ]+$))

It may need some tuning, but I believe it is a good start.

Eric
  • Don't know if this is a problem for the OP, but this regex would fail for something like `(an another text (text))` or `"some (text)"` – Wololo Jul 28 '18 at 17:54

I know you requested a single regex, but two passes are much more extensible, easier to read and debug, and should not be noticeably slower.

More bookend types can simply be added as well.

const input = `this text (an another text) "and a text" more text

  ( this is
    text
   )
   
  " and more
    text"
    
 trailing text (should be the third text match)

`

// remove all matching (...) and "..."
const sanitized = input.replace(/\([^)]*\)|"[^"]*"/g, '')
console.log(sanitized)

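// now match every remaining occurrence
const rx = /text/g
let m
while ((m = rx.exec(sanitized)) !== null) {
  // m[0] is the matched word, m.index its position in the sanitized string
  console.log(m)
}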
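
As a sketch of how more bookend types could be added, the removal pattern can be built from a list of delimiter pairs. The names `bookends`, `esc` and `stripBookended` are hypothetical, and nested delimiters are not handled.

```javascript
// Hypothetical list of [open, close] delimiter pairs; extend as needed.
const bookends = [['(', ')'], ['"', '"'], ['[', ']'], ["'", "'"]];

// Escape regex metacharacters so each delimiter is matched literally.
const esc = (s) => s.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');

// First pass: remove every open...close span (no nesting support).
function stripBookended(input, pairs) {
  const pattern = pairs
    .map(([open, close]) => `${esc(open)}[^${esc(close)}]*${esc(close)}`)
    .join('|');
  return input.replace(new RegExp(pattern, 'g'), '');
}

const cleaned = stripBookended(
  'a text (b text) [c text] "d text" \'e text\' f text',
  bookends
);

// Second pass: count the remaining occurrences.
console.log((cleaned.match(/text/g) || []).length); // 2
```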
Steven Spungin
  • I have to pass this regex to another method, so it is not possible to filter as you described above. And I have no access to that part of the code either :( – Dennis Jul 28 '18 at 15:30
  • Could you pass a lambda to your other function instead of a regex? – Steven Spungin Jul 28 '18 at 17:34