The two regexes:

regex_1 = /^A+\S{2}$/
regex_2 = /^AB+\d{1}$/

both match the following ten strings:

AB0
AB1
AB2
AB3
AB4
AB5
AB6
AB7
AB8
AB9
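
As a quick sanity check of that claim, here is a minimal sketch that tests the ten strings against both regexes. The variable names `candidates` and `matching_both` are only illustrative.

```ruby
regex_1 = /^A+\S{2}$/
regex_2 = /^AB+\d{1}$/

# The ten candidate strings listed above: "AB0" through "AB9".
candidates = (0..9).map { |digit| "AB#{digit}" }

# Keep only the candidates that both regexes accept.
matching_both = candidates.select { |s| regex_1.match?(s) && regex_2.match?(s) }

puts matching_both.size  # every candidate should survive the filter
```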

Is there a way to find strings that match two regular expressions that are given?


I have a regex, and it will be sliced into many sub-regexes, as follows.

Example 1:

original_regex = /^A+\S{2}$/
sub_regex1 = /^AB+\S{1}$/
sub_regex2 = /^AC+\S{1}$/

Example 2:

original_regex = /^598+\S{5}$/
sub_regex1 = /^598A+\S{4}$/
sub_regex2 = /^598AB+\S{3}$/

I want to know whether there are any strings that match all sub-regexes.
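
To make the goal concrete, here is a rough sketch that tests one hand-picked candidate string against every regex in a list. `matches_all?` is a hypothetical helper name, and the candidates are my own picks. Note that this can only confirm that a common match exists once a candidate is found; it cannot prove that none exists.

```ruby
# Returns true when the candidate string is accepted by every regex in the list.
# `matches_all?` is a hypothetical helper, not part of any library.
def matches_all?(candidate, regexes)
  regexes.all? { |regex| regex.match?(candidate) }
end

example_1 = [/^A+\S{2}$/, /^AB+\S{1}$/, /^AC+\S{1}$/]
example_2 = [/^598+\S{5}$/, /^598A+\S{4}$/, /^598AB+\S{3}$/]

# In Example 1 the second character would have to be both "B" and "C".
puts matches_all?("ABC", example_1)       # prints false

# "598AB123" is a hand-picked candidate for Example 2.
puts matches_all?("598AB123", example_2)  # prints true
```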

I am thinking of converting each regex to a string and comparing the minimal-length prefix and the minimal-length suffix, like this:

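# Single quotes keep the backslashes; in a double-quoted Ruby string, "\S" becomes "S" and "\d" becomes "d".
regex_1 = '/^A+\S{2}$/'
regex_2 = '/^AB+\d{1}$/'
regex_3 = '/^AC+\d{1}$/'
minimal_prefix = '/^A'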

Any regex string that contains minimal_prefix has a string that matches all sub-regexes. I am figuring out whether this is correct or not.

sawa
Donald Chiang
  • maybe related [generate string for regex pattern in ruby](https://stackoverflow.com/questions/21084246/generate-string-for-regex-pattern-in-ruby) – Nahuel Fouilleul Feb 06 '18 at 10:12

1 Answer

Is there a quick way in general? No. What are "all the strings" that match these pairs of regular expressions:

  • /.*/ and /\d*/? (There are infinitely many!)
  • /\A\d{10}\z/ and /\A[0-8]{10}\z/? (There are 3,486,784,401!)
  • /\w+\d{2,4}@?([[:punct:]]|\w){2}/ and /(^[a-zA-Z0-9_.+-]+@[a-zA-Z0-9-]+\.[a-zA-Z0-9-.]+$)/ (I haven't even tried to work this out; my point is: you could provide arbitrarily complicated input!!)
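
The count in the second bullet comes from simple arithmetic: each of the 10 positions can be any of the 9 digits 0 to 8. A small sketch of that reasoning, with a brute-force check on a shorter length (3 instead of 10, chosen only so the enumeration stays cheap):

```ruby
# Each of the 10 positions can independently be any of the 9 digits 0-8,
# so the number of strings matching both regexes is 9 to the power 10.
puts 9**10  # prints 3486784401

# Same reasoning on length 3, where enumerating every candidate is cheap.
three_digit_strings = (0..999).map { |n| format("%03d", n) }
both = three_digit_strings.select do |s|
  s.match?(/\A\d{3}\z/) && s.match?(/\A[0-8]{3}\z/)
end

puts both.size == 9**3  # prints true
```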

...But for simple scenarios, such as the one you've actually asked about, it would be feasible to use the regexp-examples Ruby gem:

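require 'regexp-examples'

# Generate example strings for each regex, then intersect the two arrays.
/^A+\S{2}$/.examples(max_group_results: 999) & /^AB+\d{1}$/.examples(max_group_results: 999)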
=> ["AB0", "AB1", "AB2", "AB3", "AB4", "AB5", "AB6", "AB7", "AB8", "AB9"]
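
If you would rather not pull in a gem, a bounded brute-force search is another option for small cases. The sketch below is not how the gem works; it assumes you only care about a small alphabet (`ALPHABET`) and short strings (`MAX_LENGTH`), both of which are values I picked for illustration.

```ruby
regex_1 = /^A+\S{2}$/
regex_2 = /^AB+\d{1}$/

# Assumption: only these characters and lengths up to MAX_LENGTH are of interest.
ALPHABET = %w[A B C] + ("0".."9").to_a
MAX_LENGTH = 3

# Enumerate every string over ALPHABET with length 1..MAX_LENGTH.
candidates = (1..MAX_LENGTH).flat_map do |length|
  ALPHABET.repeated_permutation(length).map(&:join)
end

# Keep the strings that both regexes accept.
common = candidates.select { |s| regex_1.match?(s) && regex_2.match?(s) }

p common
# prints ["AB0", "AB1", "AB2", "AB3", "AB4", "AB5", "AB6", "AB7", "AB8", "AB9"]
```

The number of candidates grows exponentially with `MAX_LENGTH`, so this is only practical for de-scoped problems like the one in the question.
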
Tom Lord
  • Thanks! I will try this gem. I will update my question to clarify my situation – Donald Chiang Feb 06 '18 at 10:26
  • Regarding your edit: *"Any regex string that contains `minimal_prefix` has a string that matches all sub-regexes. I am figuring out whether this is correct or not."* --- Again, this works in **simple** scenarios, but *not in general!!!* Here's a trivial example: `/a|b+/` vs `/b+/` -- by your logic, there are *no* strings that match both, when in reality there are clearly infinitely many (`"b"`, `"bb"`, `"bbb"`, `"bbbb"`, ...). – Tom Lord Feb 06 '18 at 11:02
  • 1
    So, again: If you de-scope the problem to simple scenarios like this, then you could write an efficient solution quite easily. But it's **not** easy to solve the problem *in general*. – Tom Lord Feb 06 '18 at 11:03
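
The counterexample from the comments can be checked directly. In this sketch, `common_source_prefix` is a hypothetical helper that compares the regex sources character by character, which is my reading of the prefix idea in the question:

```ruby
# Hypothetical helper: longest common prefix of two regex source strings.
def common_source_prefix(regex_a, regex_b)
  a = regex_a.source
  b = regex_b.source
  length = 0
  length += 1 while length < a.size && length < b.size && a[length] == b[length]
  a[0, length]
end

regex_a = /a|b+/
regex_b = /b+/

# The sources "a|b+" and "b+" share no prefix at all...
p common_source_prefix(regex_a, regex_b)  # prints ""

# ...yet every one of these strings matches both regexes.
samples = %w[b bb bbb bbbb]
p samples.all? { |s| regex_a.match?(s) && regex_b.match?(s) }  # prints true
```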