7

I'd like to take a user-input regular expression and determine whether or not it will match any string, i.e. would it "reduce" to .+ or .*?

I suspect that since this exists, that my question will reduce to the halting problem, but I'd really like to be wrong about that.

Alex S
  • 25,241
  • 18
  • 52
  • 63
  • Kleene Star or `*` is technically infinite in a theoretical aspect but that module you posted tests for up to 32767 repetitions with `*`. – squiguy May 24 '13 at 04:26
  • How about checking if the input is not blank/empty/null/zero length? Since you have no other criteria, as long as the user input _something_ it should match your requirements. – Burhan Khalid May 24 '13 at 04:28
  • 1
    @Burhan Khalid You misunderstand the question. He wants to test if an arbitrary regex would match any string or not. – Patashu May 24 '13 at 04:30
  • Oh I see (said the blind man). – Burhan Khalid May 24 '13 at 04:31
  • What do you mean you "reduce" to? What is the input string, user input too? Why can you not just test in [JavaScript](https://developer.mozilla.org/en-US/docs/JavaScript/Guide/Regular_Expressions)? [This online tester](http://regexhero.net/tester/) will highlight matches? Is this what you are after? – jammykam May 24 '13 at 04:48
  • 1
    Might it be easier to try and find a regex that will match nothing, or prove that none such regexes can exist? If some exist, they are surely not as numerous as the ones that do match *some* string. @jammykam The user will be inputting a regex, and OP wants to make sure the user's regex will actually match at least one of any possible string. – ajp15243 May 24 '13 at 04:59
  • 1
    This is just a special case of the question linked above where one language is `.*`. – hammar May 24 '13 at 05:04
  • @hammar you are correct, my question is indeed a special case of the one you linked. Thank you! – Alex S May 24 '13 at 07:38

1 Answers1

0

I don't think what you want is similar to the Halting problem since the grammar of regular expression. Considering the alphabet and the language recognized by your automaton is finite, you can still use a dummy algorithm that would try every world of your language and test if the regular expression is able to recognize it or not.

In practice, this method has an awful complexity but you don't have any "undefined" state you would have in Halting problem since the number of inputs is enumerable.

I actually don't know if a better version of this dummy algorithm exists, but i hope i answered about your question on similarity to the Halting problem.

ibi0tux
  • 2,481
  • 4
  • 28
  • 49
  • The reason I thought it might turn into the halting problem was I had guessed that there would be no better way to analyze a regular expression than using techniques like the one in the module I linked. If that had been the case, the question would become 'is this list of unique strings infinite?' Testing against known words isn't sufficient. – Alex S May 24 '13 at 09:03