How can I get the contents of parentheses in Racket? The contents may themselves contain more parentheses. I tried:

(regexp-match #rx"((.*))" "(check)")

But the output has "(check)" three times rather than once:

'("(check)" "(check)" "(check)")

And I want only "check" and not "(check)".

Edit: for nested parentheses, the whole inner block should be returned, so (a (1 2) c) should return "a (1 2) c".

rnso
  • What should the result be if the input string is `(a (b) c)`? Or even `(a (b c)`? Your question is a little underspecified. If you only want to handle the simple case, you just need to escape the parentheses: `#rx"(\\(.*\\))"`. – Alexis King Sep 13 '16 at 02:41
  • I have added edit and clarified in the question. The code (regexp-match #rx"(\\(.*\\))" "(check)") is returning '("(check)" "(check)") while I want "check" or '("check") only. – rnso Sep 13 '16 at 02:54
  • Remove the outer set of parentheses, then. `#rx"\(.*\)"` – Alexis King Sep 13 '16 at 02:56
  • Sorry, I meant to double-escape the parens. `#rx"\\(.*\\)"`. – Alexis King Sep 13 '16 at 02:58
  • Giving '("(check)") : check is still in parens. – rnso Sep 13 '16 at 02:59
  • Ah, right, you want just the bit inside. You want this, I think: `(second (regexp-match #rx"\\((.*)\\)" str))` – Alexis King Sep 13 '16 at 03:00
  • The example given above of `(a (1 2) c)` suggests to me that you want to match parens. This is not something that classical regexps can do (cf. pumping lemma). – John Clements Sep 13 '16 at 03:18
  • Yes, (second (regexp-match #rx"\\((.*)\\)" str)) works. – rnso Sep 13 '16 at 03:47

1 Answer

Unescaped parentheses in a regexp are capturing groups; they do not match literal parentheses. So #rx"((.*))" captures everything twice. Thus:

(regexp-match #rx"((.*))" "any text")
; ==> ("any text" "any text" "any text")

The first element of the resulting list is the whole match, the second is the outer capturing group, and the third is the group nested inside it. If you want to match literal parentheses, you need to escape them:

(regexp-match #rx"\\((.*)\\)" "any text")
; ==> #f
(regexp-match #rx"\\((.*)\\)" "(a (1 2) c)")
; ==> ("(a (1 2) c)" "a (1 2) c")

The first element is the whole match: a match may start at any position in the string, and since .* is greedy it extends to the last closing parenthesis. The second element is the single capture group, which is the text you want.
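Since regexp-match returns #f when nothing matches, taking the second element directly would raise an error on a string without parentheses. A minimal sketch of a wrapper around the pattern above follows; the name inner-text is made up for illustration and is not part of Racket:

```racket
#lang racket

;; inner-text is a hypothetical helper name, not a Racket built-in.
;; Returns the text between the first "(" and the last ")" of str,
;; or #f when str contains no parenthesized part.
(define (inner-text str)
  (match (regexp-match #rx"\\((.*)\\)" str)
    [(list _ inner) inner]   ; whole match ignored, capture returned
    [_ #f]))                 ; regexp-match returned #f

(inner-text "(check)")      ; => "check"
(inner-text "(a (1 2) c)")  ; => "a (1 2) c"
(inner-text "any text")     ; => #f
```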

This breaks when the string contains more than one top-level set of parentheses, e.g.:

(regexp-match #rx"\\((.*)\\)" "(1 2 3) (a (1 2) c)")
; ==> ("(1 2 3) (a (1 2) c)" "1 2 3) (a (1 2) c")

This is because the expression is not nesting-aware: the greedy .* runs from the first opening parenthesis to the last closing one. Handling nesting requires recursive regular expressions, such as Perl's (?R) syntax and friends, but Racket does not have these (yet?).
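Without recursive regexps, one alternative is to skip the regexp and scan the string while counting nesting depth. The following is only a sketch of that idea, not something the regexp library provides; first-balanced-contents is a hypothetical name, and it assumes you want the contents of the first balanced top-level group:

```racket
#lang racket

;; first-balanced-contents is a hypothetical helper, not a Racket built-in.
;; Returns the text inside the first balanced parenthesized group of str,
;; or #f if there is no "(" or the group is never closed.
(define (first-balanced-contents str)
  (define len (string-length str))
  ;; Index of the first opening parenthesis, or #f if there is none.
  (define start
    (for/first ([i (in-range len)]
                #:when (char=? (string-ref str i) #\())
      i))
  (and start
       (let loop ([i (add1 start)] [depth 1])
         (if (= i len)
             #f                                   ; input ended: unbalanced
             (let ([c (string-ref str i)])
               (cond
                 [(char=? c #\() (loop (add1 i) (add1 depth))]
                 [(char=? c #\))
                  (if (= depth 1)
                      (substring str (add1 start) i) ; matching close found
                      (loop (add1 i) (sub1 depth)))]
                 [else (loop (add1 i) depth)]))))))

(first-balanced-contents "(1 2 3) (a (1 2) c)")  ; => "1 2 3"
(first-balanced-contents "(a (1 2) c)")          ; => "a (1 2) c"
(first-balanced-contents "(a (b c)")             ; => #f
```

To collect every top-level group instead of only the first, the same loop could be restarted from the position just after each matching close.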

Sylwester