10

Given two regular expressions, is it possible to detect whether there is any possible string that matches them both?

For example, given regexes A and ., I can see that string "A" matches them both. That's a simple case.

My question is for the broader case -- given any two valid regexes, would it be possible to definitively say whether there is any possible string that would match both regexes? Assume that there is no sample set of input strings to test. All I have are the regexes. I don't necessarily need to produce matching strings -- I just need to determine that there are possible strings that match both.

Will accept discussions for any of the common regex specifications -- .NET, Java, PERL, sed, grep, etc.

Michael Gunter
  • 12,528
  • 1
  • 24
  • 58

2 Answers2

5

Basically, you want to test if the intersection of two RegExps is non-empty. Since intersection - just like complement - is a potentially expensive operation (it requires determinization of the NFA), it is not implemented in many RegExp implementations. One exception I know of is the BRICS Automaton Library, which allows enabling the intersection operator &.

To test the property in question, you could use the BRICS (Java) library like this:

RegExp re = new RegExp("(.) & (a)", RegExp.INTERSECTION); // Parse RegExp
Automaton a = re.toAutomaton(); // convert RegExp to automaton

if(a.isEmpty()) { // Test if intersection is empty
  System.out.println("Intersection is empty!");
}
else {
  // Print the shortest accepted string
  System.out.println("Intersection is non-empty, example: " + a.getShortestExample(true));
}
misberner
  • 3,488
  • 20
  • 17
0

Yes, it's possible theoretically.

But it basically comes down to try all possible options and see which matches both regexes. But it's more a theoretical computer science question, with modern day regular expressions in programming languages this would be a problem in NP (http://en.wikipedia.org/wiki/NP_(complexity))

If you're talking more about the formal language theory definition of a regular language, than I would say it should be possible by converting both regexes to a DFA and walk through both simultaneously to see what would match.

Wolph
  • 78,177
  • 11
  • 137
  • 148