
I'm trying to use regular expressions to match certain groups of strings which correspond to functions. Right now it looks like this:

(Spreadsheet.[^)\)]+\))

It finds the variable Spreadsheet, which has the function as an attribute, and keeps going until it reaches the closing parenthesis. For simple functions such as

Spreadsheet.ADD(1,2)

the regular expression will work fine.

However, if I try to do any sort of nesting, the expression does not work because it will stop at the inside parenthesis instead of going to the last parenthesis.

Spreadsheet.ADD(Spreadsheet.ADD(1, 2), 3)

Thus, the ", 3)" isn't identified and ends up being ignored. Because of the way my code processes the result, the truncated string then causes an error.
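The behaviour described above can be reproduced with a minimal sketch. It assumes Python's `re` module, since the question does not say which regex engine is in use; the variable names are illustrative only.

```python
import re

# The pattern from the question, unchanged. Inside the character class,
# `)` and `\)` both mean a literal closing parenthesis.
original = re.compile(r"(Spreadsheet.[^)\)]+\))")

simple = "Spreadsheet.ADD(1,2)"
nested = "Spreadsheet.ADD(Spreadsheet.ADD(1, 2), 3)"

simple_match = original.search(simple).group(1)
nested_match = original.search(nested).group(1)

# The simple call is matched in full.
print(simple_match)

# The nested call is cut off at the first closing parenthesis.
print(nested_match)

# Whatever follows the match is the part that gets lost.
print(repr(nested[len(nested_match):]))
```

For the nested input the match ends right after the inner call, so the trailing `, 3)` is left outside it.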

Does anyone with more knowledge of regular expressions know how it could be changed such that it will stop only when it is at the last parenthesis and not the first?

Thanks.

Nick Felker
  • It isn't a duplicate, at least not from what I can tell of the question you posted. I don't want to match my expression multiple times. I want to match my expression from the opening parenthesis to the closing parenthesis regardless of how many parentheses exist inside. – Nick Felker Feb 24 '14 at 03:28
  • What about `Spreadsheet.ADD(1, 2) + Spreadsheet.ADD(3, 4)`? – falsetru Feb 24 '14 at 03:32
  • Part of the input is specified by the user. Additionally, the functions aren't just arithmetic. Nesting is something that I'd like to have. – Nick Felker Feb 24 '14 at 04:25

1 Answer


This assumes that you only want to match functions in the form that you state in the question. If you want to match any type of function (including operators, nested comments, etc.), then what you want is going to be difficult with regex; see here. Anyway, to match up to the last bracket you can use:

(Spreadsheet\..+\))

This will match

Spreadsheet.ADD(1,2)

Spreadsheet.ADD(Spreadsheet.ADD(1, 2), 3)

Spreadsheet.ADD(Spreadsheet.ADD(1, 2), 3)foo

(foo not part of the match)
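As a quick check of the three cases listed above, here is a small sketch, again assuming Python's `re` module (the engine is not named in the question):

```python
import re

# The pattern suggested in this answer: an escaped dot, then a greedy `.+`
# that runs to the last closing parenthesis in the string.
pattern = re.compile(r"(Spreadsheet\..+\))")

inputs = [
    "Spreadsheet.ADD(1,2)",
    "Spreadsheet.ADD(Spreadsheet.ADD(1, 2), 3)",
    "Spreadsheet.ADD(Spreadsheet.ADD(1, 2), 3)foo",
]

matches = [pattern.search(text).group(1) for text in inputs]

for text, matched in zip(inputs, matches):
    print(f"{text!r} -> {matched!r}")
```

In the third case the match ends at the last `)`, so the trailing `foo` is left out.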

Your regex did not match the full string because [^)\)]+ only matches characters that are not a ), so it stops as soon as it reaches the first ). Also, as an aside, an unescaped . matches any character, so Spreadsheet. will also match Spreadsheeta, Spreadsheetb, Spreadsheetc. To match a literal dot you need \. instead.
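The point about the dot can be seen in a short sketch, assuming Python's `re` module. `fullmatch` is used here only to keep the comparison exact.

```python
import re

unescaped = re.compile(r"Spreadsheet.")   # `.` matches any single character
escaped = re.compile(r"Spreadsheet\.")    # `\.` matches only a literal dot

candidates = ["Spreadsheet.", "Spreadsheeta", "Spreadsheetb"]

# Record, per candidate, whether each pattern accepts the whole string.
results = {
    text: (
        unescaped.fullmatch(text) is not None,
        escaped.fullmatch(text) is not None,
    )
    for text in candidates
}

for text, (loose, strict) in results.items():
    print(f"{text!r}: unescaped={loose}, escaped={strict}")
```

The unescaped pattern accepts all three strings, while the escaped one accepts only the string that ends in a real dot.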

In my regex, .+\) will include the last bracket because + is greedy, so it will get the longest match it can. As an aside, you would specify a non-greedy match using +?.
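The difference between the greedy and non-greedy forms can be sketched as follows, assuming Python's `re` module. The second input is the one raised in the comments on the question, and it shows a consequence of greediness worth knowing about: two separate calls on one line come back as a single match.

```python
import re

greedy = re.compile(r"Spreadsheet\..+\)")   # longest possible match
lazy = re.compile(r"Spreadsheet\..+?\)")    # shortest possible match

nested = "Spreadsheet.ADD(Spreadsheet.ADD(1, 2), 3)"
two_calls = "Spreadsheet.ADD(1, 2) + Spreadsheet.ADD(3, 4)"

greedy_nested = greedy.search(nested).group(0)
lazy_nested = lazy.search(nested).group(0)

# Greedy reaches the last `)`; lazy stops at the first one.
print(greedy_nested)
print(lazy_nested)

# With two calls on one line, greedy spans both of them.
greedy_two = greedy.findall(two_calls)
print(greedy_two)
```

If the input can contain more than one top-level call, the greedy pattern alone will not separate them, which is the sort of case where regex becomes difficult.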

acarlon
  • Thank you very much. This expression is just what I need. I already have a way of parsing the expression for proper evaluation, but given a specific user input, I had no idea how to work with nesting. The \. is good advice, as Spreadsheetb would create an error, although that isn't provided through user input. (The user types "ADD(1,2)" and the rest of the text is added in the background.) I did know why the regex failed, but I didn't know how I could adjust it to get proper nesting. – Nick Felker Feb 24 '14 at 04:28