1

Let's assume that two regexp-strings are given:

boost::regex r1 = "[AB]";
boost::regex r2 = "[ABCDEF]";

Is there a simple way to determine with boost::regex whether r1 is subset of r2?

In the example above r1 is subset of r2.

boost::regex_match works with a string and a regex parameter. But something which work with two regular expressions would be nice.

This question is relates only to c++ and the boost::regex library.

1 Answers1

0

Convert the regexes to a DFA graph, g1 and g2.

Define g1' and g2' as the same graphs with the accepting states inverted.

Define a = g1 x g2' and b = g1' x g2, where you keep track of both sets of states for the input. The accepting states of a and b are those that are accepting in both the source-product graphs.

Strings accepted by a are those that r1 accepts and r2 does not.

Strings accepted by b are those that r2 accepts and r1 does not.

r1 is a subset of r2 if and only if every string r1 accepts is also accepted by r2.

So simply prove that a accepts no strings to prove r1 is a subset of r2.

If you want strict subset, also show that b accepts at least one string.

I am unaware of a way to do any of this easily with boost. I don't know if these steps qualify as "easy". I suspect not, because this problem is PSPACE-complete.

Yakk - Adam Nevraumont
  • 262,606
  • 27
  • 330
  • 524