I am trying to convert an expression such as [[a], [b]] into list(c(a), c(b)) (basically a Java dictionary into an R list). As a first step, I would like to convert each inner expression [a] into the equivalent c(a). According to "How to replace square brackets with curly brackets using R's regex?", I can use the regular expression "\\[(.*?)\\]" or, alternatively, "\\[([^]]*)\\]".

This works when there is only one pair of [] brackets, but not with nested ones such as [[a], [b]]: the match starts at the first [, resulting in "c([a), c(b])" instead of "[c(a), c(b)]". How can I make sure that I only match the innermost brackets in a string that contains nested pairs like [[], []]?

vec <- c("[a]", "[[a], [b]]")
gsub("\\[(.*?)\\]", "c(\\1)", vec)
#> [1] "c(a)"         "c([a), c(b])"
gsub("\\[([^]]*)\\]", "c(\\1)", vec)
#> [1] "c(a)"         "c([a), c(b)]"

Created on 2021-02-15 by the reprex package (v0.3.0)

Matifou
(Oops, I thought I only had a "vote" to reopen. @WiktorStribiżew, I was hoping for a more democratic solution than just overriding your closure, though I still think it's the right move. Thoughts?) – r2evans Feb 15 '21 at 22:03

1 Answer

While "Remove any text inside square brackets in r" shows how to deal with the regex itself, it doesn't address the "nested" component of the problem.

You can run it multiple times until there are no more changes.

vec <- c("[a]", "[[a], [b]]")
(vec2 <- gsub("\\[([^][]*)\\]", "c(\\1)", vec))
# [1] "c(a)"         "[c(a), c(b)]"
(vec3 <- gsub("\\[([^][]*)\\]", "c(\\1)", vec2))
# [1] "c(a)"          "c(c(a), c(b))"

The change is to disallow both the opening [ and the closing ] inside the captured group (the bracket expression [^][]), so the pattern can only match an innermost pair, i.e. one with no brackets inside it.
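To see which parts of the string the pattern picks up, here is a small check using base R's gregexpr() and regmatches(). The variable names x and innermost are only for illustration:

```r
x <- "[[a], [b]]"

# Same pattern as above, without the capture group: an opening bracket,
# any run of non-bracket characters, then a closing bracket.
innermost <- regmatches(x, gregexpr("\\[[^][]*\\]", x))[[1]]
innermost
# [1] "[a]" "[b]"
```

The outer pair is not matched on this pass because it still contains brackets; it only becomes a candidate once the inner pairs have been rewritten.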

It should be feasible to nest this in a while loop that exits as soon as no change is detected.
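A minimal sketch of such a loop, assuming the input strings have balanced brackets. The function name replace_brackets and the max_iter safeguard are hypothetical and not part of the original answer:

```r
# Rewrite innermost [...] pairs as c(...) until a pass changes nothing.
replace_brackets <- function(x, max_iter = 100L) {
  pattern <- "\\[([^][]*)\\]"
  for (i in seq_len(max_iter)) {
    new_x <- gsub(pattern, "c(\\1)", x)
    if (identical(new_x, x)) break  # no more bracket pairs to rewrite
    x <- new_x
  }
  x
}

vec <- c("[a]", "[[a], [b]]")
replace_brackets(vec)
# [1] "c(a)"          "c(c(a), c(b))"
```

Each pass removes at least one bracket pair, so the loop stops after as many passes as the deepest nesting level plus one; max_iter is only a guard against unexpected input.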

r2evans