0

Consider a set of strings as follows:

aa(bb(c)dd)
aeeff(bb(cd)eee)
(bb(c)dd)(bb(cd(eee

Even though the longest repeated non-overlapping substring is (bb(cd)eee (10 chars), the longest repeated substring with matched parentheses is (bb(c)ee) (9 chars).

I can easily find the longest non-overlapping repeated substring, but how can this be extended to matching parentheses?

user2398029
  • 6,699
  • 8
  • 48
  • 80

1 Answers1

2

As I understand you want the longest string within a single line that begins with "(" and ends with ")" and contains "balanced parentheses". In addition, the substring is to be a "repeated" substring, but that has not been defined, so I've disregarded that requirement.

balanced_parens? (below) is from my answer here.

def longest_balanced(str)
  str.lines.map { |s| longest_by_line(s.chomp) }.max_by(&:size)
end 

def longest_by_line(str)
  str.size.downto(1).find do |n|
    str.each_char.each_cons(n) do |a|
      s = a.join
      return s if (s =~ /\A\(.*\)\z/)  && balanced_parens?(s)
    end
  end
  nil
end

def balanced_parens?(str)
  pstr = str.gsub(/[^()]/,"")
  while pstr.gsub!("()", "")
  end
  pstr.empty?
end

str =<<_
aa(bb(c)dd)
aeeff(bb(cd)eee)
(bb(c)dd)(bb(cd(eee
_

longest_balanced str
  #=> "(bb(cd)eee)"
Community
  • 1
  • 1
Cary Swoveland
  • 106,649
  • 6
  • 63
  • 100
  • This can be easily combined with the longest repeated substring function mentioned in the question, so I'm accepting even though it does not fully answer the question. – user2398029 Mar 04 '16 at 05:13
  • I somehow missed the link in the question. At that link there is [this link](http://rubyquiz.com/quiz153.html) to the original question. I'm still not clear on what constitutes the "repeated substring" and therefore its length. For "banana", is the repeated substring "an" (length 2) or "anan" (length 4). It makes a difference if we are comparing "banananana" and "booboo", for example. Moreover, I assume that "banana)" would be skipped but "(bananana)" would be a candidate. Correct? Once I know I'd be happy to revise my answer. As you say, that should be straightforward. – Cary Swoveland Mar 04 '16 at 06:08
  • For "banana", the longest substring that repeats itself without overlapping would be "an" (length 2). – user2398029 Mar 04 '16 at 14:50
  • So "repeating" means repeats one or more times in a row. "bananananan" contains two repeating substrings, "an" and "anan" and the LRS in "dogdogdog" and "catscats" is "cats". Correct? – Cary Swoveland Mar 04 '16 at 15:26
  • No need to repeat in a row. They can repeat in any order. But yes, these are the correct answers for the examples you gave. – user2398029 Mar 06 '16 at 00:22