This answer uses the following input sequence as an example. The expected output is all of the string except the last (
.
Input: ()(({}[]([{[()]}]{})))(
Output: ()(({}[]([{[()]}]{})))
Step 1 is to find the seeds in the string. A seed is a matched set of symbols: ()
, []
, or {}
. I've given each seed a numerical value to assist the reader in visualizing the seeds.
()(({}[]([{[()]}]{})))(
11 2233 44 55
Step 2 is to expand the seeds into sequences. A sequences is a nested set of symbols: e.g. [{[()]}]
. So starting from a seed, work outwards, verifying that the enclosing symbols are matched. The search ends at a mismatch, or at the beginning or end of the string. In the example, only seed 4 is enclosing by matching symbols, so only seed 4 is expanded.
()(({}[]([{[()]}]{})))(
11 2233 4444444455
Step 3 is to combine adjacent sequences. Note that there can be two or more adjacent sequences, but in the example there are two adjacent sequences in two places
()(({}[]([{[()]}]{})))(
11 2222 4444444444
Repeat step 2, treating the combined sequences as seeds. In this example, sequence 4 is enclosed by matching parentheses, so sequence 4 is expanded.
()(({}[]([{[()]}]{})))(
11 2222444444444444
Repeat step 3, combine sequences
()(({}[]([{[()]}]{})))(
11 2222222222222222
Repeat step 2, expand
()(({}[]([{[()]}]{})))(
1122222222222222222222
And combine one more time
()(({}[]([{[()]}]{})))(
1111111111111111111111
The algorithm ends when there's nothing left to expand, or combine. The longest sequence is the answer.
Implementation notes:
I think that you can achieve O(n)
by expanding/merging one sequence at a time. I would keep the list of sequences in a doubly-linked list (so removal is an O(1)
operation). Each sequence would be represented by a start
index, and an end
index.
Expanding a sequence involves checking the symbols at array[start-1]
and array[end+1]
, and then updating the start
/end
indexes as appropriate.
Merging involves checking the next and previous sequences in the linked list. If the sequences can be merged, then one sequence is updated to cover the full range, and the other is deleted.
Once an sequence is expanded/merged as much as possible, move to the next sequence in the list. As this new sequence is expanded/merged, it may eventually work its way back to the previous sequence. Hence, after initially creating a doubly-linked list of seeds, one pass through the linked list should be sufficient to expand/merge all of the sequences. Then a second pass through whatever remains of the linked list is needed to find the longest sequence.