I'm using nom. I'd like to parse a string that's surrounded by parentheses, while allowing additional nested parentheses within the string.

So (a + b) would parse as a + b, and ((a + b)) would parse as (a + b).

This works for the first case, but not the nested case:

pub fn parse_expr(input: &str) -> IResult<&str, &str> {
    // TODO: this will fail with nested parentheses, but `rest` doesn't seem to
    // be working.
    delimited(tag("("), take_until(")"), tag(")"))(input)
}

I tried using rest, but it consumes everything up to the end of the input, including the final ), so the closing tag(")") never matches:

pub fn parse_expr(input: &str) -> IResult<&str, &str> {
    delimited(tag("("), rest, tag(")"))(input)
}

Thanks!

Maximilian
  • Will `input` have other expressions before and/or after the parentheses? For example, can input look like this: "(a + b) + (c + d)"? – Adam Comer Jan 08 '22 at 14:38
  • `(a + b) + (c + d)` would fail. Alternatively fine if it just parses `a + b` and ` + (c + d)` is returned as a remainder. Only the first parenthesized expression would be parsed, though. – Maximilian Jan 09 '22 at 05:05

1 Answer

I found a reference to this in the nom issue tracker: https://github.com/Geal/nom/issues/1253

I'm using this function from parse_hyperlinks, which is basically a hand-written parser for balanced brackets (source: https://docs.rs/parse-hyperlinks/0.23.3/src/parse_hyperlinks/lib.rs.html#41):

use nom::error::{Error, ErrorKind, ParseError};
use nom::{Err, IResult};

pub fn take_until_unbalanced(
    opening_bracket: char,
    closing_bracket: char,
) -> impl Fn(&str) -> IResult<&str, &str> {
    move |i: &str| {
        let mut index = 0;
        let mut bracket_counter = 0;
        while let Some(n) = &i[index..].find(&[opening_bracket, closing_bracket, '\\'][..]) {
            index += n;
            let mut it = i[index..].chars();
            match it.next().unwrap_or_default() {
                c if c == '\\' => {
                    // Skip the escape char `\`.
                    index += '\\'.len_utf8();
                    // Also skip the following char, if there is one. A trailing
                    // `\` must not push `index` past the end of the input.
                    if let Some(c) = it.next() {
                        index += c.len_utf8();
                    }
                }
                c if c == opening_bracket => {
                    bracket_counter += 1;
                    index += opening_bracket.len_utf8();
                }
                c if c == closing_bracket => {
                    // Closing bracket.
                    bracket_counter -= 1;
                    index += closing_bracket.len_utf8();
                }
                // Cannot happen: `find` only stops at one of the three chars above.
                _ => unreachable!(),
            };
            // We found the unmatched closing bracket.
            if bracket_counter == -1 {
                // We do not consume it.
                index -= closing_bracket.len_utf8();
                return Ok((&i[index..], &i[0..index]));
            };
        }

        if bracket_counter == 0 {
            Ok(("", i))
        } else {
            Err(Err::Error(Error::from_error_kind(i, ErrorKind::TakeUntil)))
        }
    }
}
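
To make the counting idea easier to check against the examples in the question, here is a minimal, dependency-free sketch of the same technique. The name `split_balanced` and its `Option<(inner, remainder)>` return type are my own assumptions for illustration and are not part of nom or parse_hyperlinks. With nom itself, the function above should be usable in place of take_until(")") inside delimited, i.e. delimited(tag("("), take_until_unbalanced('(', ')'), tag(")")).

```rust
/// Hypothetical helper (not part of nom or parse_hyperlinks): splits `input`
/// at the `)` that matches its leading `(` and returns `(inner, remainder)`.
fn split_balanced(input: &str) -> Option<(&str, &str)> {
    // The input must start with an opening parenthesis.
    let body = input.strip_prefix('(')?;
    // Number of nested `(` that are still open inside `body`.
    let mut depth = 0usize;
    let mut chars = body.char_indices();
    while let Some((i, c)) = chars.next() {
        match c {
            // A backslash escapes the next character, so skip it.
            '\\' => {
                chars.next();
            }
            '(' => depth += 1,
            // A `)` at depth 0 closes the leading `(`.
            ')' if depth == 0 => return Some((&body[..i], &body[i + 1..])),
            ')' => depth -= 1,
            _ => {}
        }
    }
    // Ran out of input before the leading `(` was closed.
    None
}
```

Under these assumptions, split_balanced("((a + b))") gives the inner text (a + b) with an empty remainder, and split_balanced("(a + b) + (c + d)") gives a + b with the remainder " + (c + d)", which matches the behavior described in the comments on the question.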
Maximilian